About the ZipCPU
The ZipCPU is an open-source, fully functional, soft-core CPU built for FPGA environments by Gisselquist Technology, LLC.
The ZipCPU was initially designed to be a simple CPU within an FPGA, and particularly one powerful enough to run Linux.
In bullets, the ZipCPU is:
- A 32-bit CPU. All registers are 32 bits wide, addresses are 32 bits wide, instructions are 32 bits wide, etc.
- A RISC CPU. It implements only a minimal set of instructions, a much smaller set than most other “RISC” CPUs.
- A Load/Store architecture. Only load and store instructions can access memory.
- Wishbone compliant. All peripherals are accessed via memory-mapped I/O across a common Wishbone bus.
- A Von Neumann architecture. Both instructions and data share a common bus.
- A pipelined architecture, having stages for prefetch, decode, read-operand, execute, and write-back. The execute stage is implemented by one of four blocks: an arithmetic logic unit (ALU), a memory unit, a divide coprocessor, or a (not yet implemented) floating point coprocessor. The last two blocks, the divide and floating point coprocessors, will only ever be optional. Whether to include them depends on your implementation needs.
- A configurable CPU. You can choose how much logic goes into the CPU, and therefore trade LUTs for speed if desired.
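To make the pipelining point concrete, here is a minimal sketch of an idealized five-stage pipeline using the stage names listed above. It is not a model of the actual ZipCPU: it assumes one instruction enters per clock and that nothing ever stalls, and the function names `pipeline_schedule` and `total_cycles` are made up for this illustration.

```python
# Idealized model of a five-stage pipeline: one instruction enters per clock
# and nothing ever stalls. The stage names come from the text above; the
# stall-free behavior is an assumption made for illustration only.
STAGES = ["prefetch", "decode", "read-operand", "execute", "write-back"]


def pipeline_schedule(n_instructions):
    """Return {cycle: {stage_name: instruction_index}} for a stall-free pipeline."""
    schedule = {}
    for instr in range(n_instructions):
        for depth, stage in enumerate(STAGES):
            # Instruction `instr` enters prefetch on cycle `instr` and then
            # advances one stage per clock.
            schedule.setdefault(instr + depth, {})[stage] = instr
    return schedule


def total_cycles(n_instructions):
    """Clocks needed to finish n instructions: fill the pipe, then one per clock."""
    if n_instructions == 0:
        return 0
    return len(STAGES) + n_instructions - 1


if __name__ == "__main__":
    for cycle, active in sorted(pipeline_schedule(4).items()):
        print(cycle, active)
    # Over a long run, the clocks spent per instruction approach one.
    print(total_cycles(1000) / 1000)
```

In this idealized picture, once the pipeline is full a new instruction completes on every clock. A real design will fall short of that whenever a stage has to wait, for example on a slow memory access.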
Low Cost
The ZipCPU was designed to be implemented on a low cost FPGA board. Reasons for this include:
- I couldn’t afford the FPGA board that I really wanted, the VC707, much less the license I would need to program it. Instead, I could afford much smaller boards ($50-$150), such as the ones Digilent sells.
- I wanted to know how to build a CPU. This includes learning how to build not only a CPU, but also backends for the C-library, GCC and binutils (GDB soon to come).
- Prior to purchasing any boards or licenses, I simulated my designs using Verilator. However, Verilator only works with Verilog source, not encrypted proprietary IP components. Hence, when I wanted to simulate an FFT with neither hardware nor proprietary IP, I was forced to build my own. The same became true for the ZipCPU. Incidentally, the “simulate before you buy” technique worked so well for my first board that I had my initial design running within two days of receiving the board.
- I’m still hoping to integrate an optional MMU into the CPU, and with it to run Linux, but this feature remains sometime in the future.
Unique Features of the ZipCPU
While the ZipCPU was the result of my own desire to learn how CPUs operate, now that it has been built it solves several problems that the more proprietary CPUs struggle with.
- Because the ZipCPU is completely open source, open-source tools may be used to simulate and run the CPU, even without an FPGA.
  - The only limit on this simulation is the power of the computer running it. As an example, you could, if you wished to, run the CMod-S6 simulation all the way from power up through several rounds of 4x4x4 Tic-Tac-Toe. (Just don’t run it in debug mode all night, at the peril of filling up your disk drive.)
  - In a similar fashion, the ZipCPU is not tied to Altera or Xilinx, nor to the Lattice iCE40 FPGAs to which it has more recently been ported.
  - Hence, it offers more flexibility than either MicroBlaze or Nios2.
- Because the ZipCPU was designed to be simple, it can be a component on a very small FPGA, such as the Spartan-6LX4 used in Digilent’s CMod S6.
- Since the ZipCPU was designed around the pipelined Wishbone bus, found within the Wishbone B4 specification, the ZipCPU enjoys memory accesses that are between three and thirty times faster than the OpenRISC core. (Their cache implementation is still better than mine, though …) Further, because of the many, many options and channels required to implement the AXI bus used by Xilinx’s IP, the Wishbone bus is simpler, and hence both easier to work with and cheaper in logic.
- Because Gisselquist Technology, LLC, owns all of the code for the ZipCPU and its peripherals, proprietary licenses may be purchased. This sets the ZipCPU apart from other open-source soft-core CPUs, such as OpenRISC, whose IP may not be owned by any single entity with whom one might negotiate a purchase.
You can find many of the details of this CPU in the ZipCPU repository on GitHub. There, you will find the specification for the CPU, which contains not only the obligatory description of its instruction set, but also examples of how to program it, as well as my ongoing “honest assessment” of it as a CPU.
Example ZipCPU designs
In many ways, the ZipCPU is just that: a CPU and only a CPU. It has a connection to its memory and peripherals, but these are not a part of the ZipCPU itself.
However, Gisselquist Technology has publicly released several designs that use the ZipCPU, all available on GitHub for you to examine. These include the S6SoC project, which fits on the Xilinx Spartan-6LX4 found within a Digilent CMod S6; the OpenArty, which fits on a Digilent Arty; and the xulalx25soc, which fits on Xess.com’s XuLA2-LX25 board. Upon customer request, the xulalx25soc now has a build option for the Spartan-6LX9 found on the XuLA2-LX9 board, which Xess.com used to sell.
Another ZipCPU design you may wish to look at is the basic ZipCPU design called zbasic. This design has support for flash, block RAM, and a serial port. It’s also my testing ground for getting the SD Card controller to work.
Current Status
The ZipCPU has undergone several instruction set revisions, going from a 4-bit opcode supporting only 32-bit bytes, to a 5-bit opcode, and finally to a 5-bit opcode with support for 16-bit compressed instructions and 8-bit bytes. I see no reason at this time to adjust the instruction set any more.
The current instruction set has newlib, GCC, and binutils support, although the soft floating point emulation support is lagging a touch.
A minimal O/S exists for the ZipCPU. Further O/S support is expected, but it is lagging behind the work of adding more peripheral support.
Performance
Here’s a summary of the clock rates the ZipCPU can achieve on a variety of commercial boards:
Design | Clock Rate | CPI | Notes |
---|---|---|---|
Ico-Board | 40 MHz | ? | This icoboard design is still in development |
S6SoC | 80 MHz | 18 | When running from flash |
XuLA2 | 80 MHz | 1 | |
Basys-3 | 100 MHz | 1 | |
Arty | 82 MHz | 1 | Clock speed limited by SDRAM clock |
Nexys-Video | 82 MHz | 1 | |
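As a rough guide to what these numbers mean together, the sketch below divides each clock rate by its CPI to estimate instructions per second. It assumes that CPI here means clocks per instruction and that the listed value holds as an average, which the table does not guarantee. The helper name `mips` is made up for this example.

```python
# Clock rates (MHz) and CPI values copied from the table above.
# The Ico-Board row is left out because its CPI is listed as unknown.
DESIGNS = {
    "S6SoC": (80.0, 18),
    "XuLA2": (80.0, 1),
    "Basys-3": (100.0, 1),
    "Arty": (82.0, 1),
    "Nexys-Video": (82.0, 1),
}


def mips(clock_mhz, cpi):
    """Millions of instructions per second, assuming CPI is a sustained average."""
    return clock_mhz / cpi


if __name__ == "__main__":
    for name, (clock_mhz, cpi) in DESIGNS.items():
        print(f"{name:12s} {mips(clock_mhz, cpi):6.1f} MIPS (estimated)")
```

Under these assumptions the S6SoC, which pays 18 clocks per instruction when running from flash, comes out near 4.4 million instructions per second despite its 80 MHz clock.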
The XuLA design was used in the summer of 2016 to benchmark the ZipCPU using the Dhrystone benchmark. At the time, the ZipCPU only supported 32-bit bytes, so it wasn’t a proper fit for the Dhrystone benchmark. Still, it was able to accomplish 0.744 DMIPS/MHz when packing one character into each 32-bit byte. If you instead packed four characters into each of the 32-bit bytes used by the ZipCPU at the time, the CPU could achieve 0.95 (modified) DMIPS/MHz. These results were presented at ORCONF, 2016.
Since then, the CPU has been modified to support 8-bit bytes. I have not returned to the Dhrystone benchmark, though, to update its performance measure.
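The two packing schemes mentioned for the Dhrystone runs can be illustrated with a short sketch. This is not the code used for the benchmark. The function name `pack_chars` is made up, and placing the first character in the most significant bits of the word is an assumption made for the example.

```python
def pack_chars(text, chars_per_word):
    """Pack ASCII text into 32-bit words, holding 1 or 4 characters per word.

    Assumption: with four characters per word, the first character goes in
    the most significant 8 bits, and a short final word is zero-padded.
    """
    if chars_per_word not in (1, 4):
        raise ValueError("chars_per_word must be 1 or 4")
    data = text.encode("ascii")
    words = []
    for start in range(0, len(data), chars_per_word):
        chunk = data[start:start + chars_per_word]
        word = 0
        for byte in chunk:
            word = (word << 8) | byte
        # Left-justify a short final chunk so padding lands in the low bits.
        word <<= 8 * (chars_per_word - len(chunk))
        words.append(word)
    return words


if __name__ == "__main__":
    text = "DHRYSTONE"
    print(len(pack_chars(text, 1)), "words at one character per word")
    print(len(pack_chars(text, 4)), "words at four characters per word")
```

With one character per word, every character costs a full 32-bit memory access. With four per word, a string copy or compare touches about a quarter as many words, which is consistent with the higher (modified) score reported above.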
As of ORCONF, 2016, the ZipCPU used between 1286 and 4926 6-LUTs, depending upon how it was configured. This number is out of date, though, for the same reason that the Dhrystone benchmark measure is out of date: the ZipCPU has since been modified to support 8-bit bytes. I can say, though, that after the change I was able to pack more logic into the CMod-S6 than I could before, suggesting that the new and improved CPU uses even fewer resources.