Today’s post marks the end of a journey discussing what it takes to build a very basic debugging bus, one that gives you access to what’s going on inside your FPGA.
Looking at where we came from, we started by outlining what a bus like this might look like. We used Fig 1 to illustrate this.
We then walked through the steps necessary to build such a bus. These steps started with a discussion of what was necessary to build a simple wishbone bus master.
Then we stepped back and looked at how to initially create bus command words from the serial port bytes sent to it.
On the other side of the interface, we showed how to turn the bus response words back into bytes.
This formed the basis of the interface. To it, we added interrupts, so that we can tell when something has happened within the device, as well as bus idle indications, which assure us that we are connected to the right bus.
We then put the pieces of the bus together into a full-blown debugging bus component core.
The debugging bus needed a wishbone bus to command, so we outlined how to build a basic wishbone interconnect that this bus could be connected to. Together, this formed a fully functional Verilog design. The last post then discussed how to build a fully functional Verilator simulation of our new design.
We even tested the design and got some confidence that it worked.
Software Interface Requirements
The basic software components we are going to use are shown in Fig 2. In general, this post will focus on the HEXBUS software interface component which is specific to this interface standard. This component, though, fits in place between some other components, and it’s important to understand how these parts and pieces are put together.
The next step up in the stack is a generic low-level O/S interface component called LLCOMMS. This component provides read() and write() wrappers over the O/S system calls, further abstracting the underlying transport the interface runs over. In particular, the generic interface hides whether the interface is running over an O/S file descriptor (such as from /dev/ttyUSB0) or a network port (such as localhost:<port>). Both interfaces are found within the LLCOMMS implementation.
This gives us a bit of immunity against changes in the underlying transport when moving from one device to another. For example, the XuLA2 board has a unique USB communication design. Inheriting from the LLCOMMS interface and replacing the critical portions with a USB interface component will allow us to continue using these dbgbus components, even though the transport has changed to support that unique interface.
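A rough sketch of the idea follows. The class names LowLevelComms and LoopbackComms are made up for illustration, and are not the actual LLCOMMS declarations. The point is only that code written against the base class doesn't care what sits underneath it.

```cpp
#include <string>

// Hypothetical base class: anything that can move bytes to and from the device.
class LowLevelComms {
public:
    virtual ~LowLevelComms() {}
    // Both calls return the number of bytes actually transferred.
    virtual int read(char *buf, int len) = 0;
    virtual int write(const char *buf, int len) = 0;
};

// Stand-in transport that hands back whatever was written to it. A serial
// port, network socket, or board-specific USB link would override the same
// two methods instead.
class LoopbackComms : public LowLevelComms {
    std::string m_data;  // bytes written but not yet read
public:
    int read(char *buf, int len) override {
        if (len <= 0) return 0;
        int n = static_cast<int>(m_data.size());
        if (n > len) n = len;
        m_data.copy(buf, n);
        m_data.erase(0, n);
        return n;
    }
    int write(const char *buf, int len) override {
        if (len <= 0) return 0;
        m_data.append(buf, len);
        return len;
    }
};
```

Anything holding a LowLevelComms reference can be handed a different transport without being rewritten, which is all the immunity described above amounts to.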
On the other side of the HEXBUS interface is the wbregs command line program. This is a basic peek/poke type program which we’ll use here as a demonstration program to show how to use the interface. As a result, the wbregs program doesn’t really exercise the full limits of the debugging bus capability. Still, it makes for a nice example program that can be used to illustrate the bus.
It’s between these various other components that we’ll build our HEXBUS software interface.
The HEXBUS software interface
With a little simplification, the readio(), readz(), and readi() requests can be mapped into a single internal readv() method. This method handles reading a variable number of values from the interface, given an address and whether or not the address will increment between values. A similar simplification can be applied to the writeio(), writez(), and writei() interface methods. Hence, we’ll spend our time discussing how readv() and writev() work.
The first step of either readv() or writev() is to build a command that sends the address to the component. This is the purpose of the encode_address() method. Because our interface is so simple, encoding the address takes no more than a sprintf() call, writing “A” and then the hexadecimal address to the buffer.
encode_address() returns a pointer to the end of the buffer, so that the read or write command may be appended to it.
These are the basics of the encode_address() method. Some additional adjustments may also be found in the HEXBUS interface source, so that we can avoid setting an address twice if we don’t need to.
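As a minimal sketch of that description, and not the code from the HEXBUS source itself, the address encoding might look something like the following. The signature, the use of snprintf() with an explicit buffer length, and the lower-case hexadecimal digits are all my assumptions here.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Writes 'A' followed by the address in hexadecimal into buf. Returns a
// pointer to the end of what was written, so that a read or write command
// can be appended there, or nullptr if the buffer is too small.
// Assumption: hexadecimal digits are sent in lower case.
char *encode_address(char *buf, std::size_t buflen, std::uint32_t addr) {
    int n = std::snprintf(buf, buflen, "A%x", static_cast<unsigned>(addr));
    if (n < 0 || static_cast<std::size_t>(n) >= buflen)
        return nullptr;  // the command did not fit
    return buf + n;
}
```

Returning the end pointer rather than a length means the caller can write the next command directly at that location.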
Both readv() and writev() call encode_address() as their basic first step:
After this, readv() and writev() diverge. We’ll focus on readv() first, and come back to writev() later.
Within readv(), the command to read a value from the interface is a simple “R”, terminated with a newline. We’ll append this request to the address command, and pass it to our lower level communications interface.
The loop operates over all of the requested words. Within the loop, readv() calls readword() to read individual words from the interface, and write them into the given buffer. This is then repeated until all of the words have been read.
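Here's a sketch of that flow under some stated assumptions. ReadPort is a hypothetical stand-in for the lower level interface and for readword(), one "R" request is sent per word, and the address increment option is left out entirely since its encoding isn't covered here. Treat it as an illustration of the structure rather than the actual readv().

```cpp
#include <cstdint>
#include <cstdio>
#include <functional>
#include <string>

// Hypothetical stand-ins for the lower-level pieces readv() relies on.
struct ReadPort {
    std::function<void(const std::string &)> write;  // send a request string
    std::function<std::uint32_t()> readword;          // next decoded response word
};

// Reads len words starting from addr into buf.
void readv(ReadPort &port, std::uint32_t addr, int len, std::uint32_t *buf) {
    char cmd[32];
    // The address comes first, with the read request appended to it.
    int n = std::snprintf(cmd, sizeof cmd, "A%x", static_cast<unsigned>(addr));
    std::snprintf(cmd + n, sizeof cmd - n, "R\n");
    for (int i = 0; i < len; i++) {
        // Only the first request carries the address.
        port.write(i == 0 ? std::string(cmd) : std::string("R\n"));
        buf[i] = port.readword();
    }
}
```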
The writev() code, on the other hand, sends a “W” followed by the hexadecimal value which we wish to write. That request is completed with a newline.
Once the actual write command is sent to the lower level, we call readidle() to read (and ignore) any acknowledgements. This keeps the interface synchronous, which is especially important given that there are no FIFOs yet built within it.
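The write side can be sketched the same way. As before, WritePort is a hypothetical stand-in for the lower level interface and for readidle(), lower-case hexadecimal is assumed, and the address increment option is omitted.

```cpp
#include <cstdint>
#include <cstdio>
#include <functional>
#include <string>

// Hypothetical stand-ins for the lower-level pieces writev() relies on.
struct WritePort {
    std::function<void(const std::string &)> write;  // send a request string
    std::function<void()> readidle;                   // drain any acknowledgements
};

// Writes len words from buf, starting at addr.
void writev(WritePort &port, std::uint32_t addr, int len, const std::uint32_t *buf) {
    char cmd[48];
    int n = std::snprintf(cmd, sizeof cmd, "A%x", static_cast<unsigned>(addr));
    for (int i = 0; i < len; i++) {
        // "W", the value in hexadecimal, then a newline to complete the request.
        std::snprintf(cmd + n, sizeof cmd - n, "W%x\n", static_cast<unsigned>(buf[i]));
        port.write(cmd);
        port.readidle();  // wait out the response before sending anything else
        n = 0;            // later requests no longer carry the address prefix
    }
}
```

Calling readidle() after every write is what keeps this sketch synchronous: nothing new is sent until the previous response has been consumed.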
We now turn our attention to readword() and readidle(). The two are quite similar in structure. We’ll start by looking at how readword() is composed.
Since multiple words can be sent across the interface using the same command word, we need a state variable to capture and remember the last command word. We’ll use m_cmd for that purpose. It is retained from response to response, so that it need not be repeated.
readword() reads from the port one character at a time, by calling lclreadcode(). lclreadcode() reads a single byte from the interface, tossing out any device-not-ready bytes (0x7f or 0xff). After that, readword() is very similar to our bytes to words component within the HEXBUS RTL design. Any command word (“R”, “W”, “A”, etc) coming in is stored. After that, words are built from assembling the hexadecimal values together. As soon as a non-hexadecimal character is received, the word received is complete.
The big difference between this logic and the logic of readidle() below is that readidle() only loops while data is available to be read, and it doesn’t return on any words read, since readidle() is called after writing to (not reading from) the interface.
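To make the parsing concrete, here is a small sketch of the readword() logic described above. The WordReader class, its byte-source callback, and the bool return value are hypothetical. I'm also assuming that command letters arrive in upper case while hexadecimal digits arrive in lower case, so that the two can be told apart.

```cpp
#include <cstdint>
#include <functional>

class WordReader {
    enum { NONE = -2 };
    std::function<int()> m_next;  // returns 0..255, or -1 once no data remains
    int m_pending;                // terminator held over for the next word
    char m_cmd;                   // last command letter seen

    // Next byte, skipping device-not-ready filler (0x7f or 0xff).
    int readcode() {
        int c;
        if (m_pending != NONE) { c = m_pending; m_pending = NONE; return c; }
        do { c = m_next(); } while (c == 0x7f || c == 0xff);
        return c;
    }
    static int hexval(int c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        return -1;
    }
public:
    explicit WordReader(std::function<int()> next)
        : m_next(next), m_pending(NONE), m_cmd(0) {}

    char last_cmd() const { return m_cmd; }

    // Assembles one word; returns false if the data ran out first.
    bool readword(std::uint32_t &word) {
        bool digits = false;
        word = 0;
        for (;;) {
            int c = readcode();
            int v = hexval(c);
            if (v >= 0) {                       // another four bits of the word
                word = (word << 4) | static_cast<std::uint32_t>(v);
                digits = true;
            } else if (digits) {                // a non-hex character ends the word
                m_pending = c;                  // it may start the next response
                return true;
            } else if (c < 0) {
                return false;
            } else if (c >= 'A' && c <= 'Z') {
                m_cmd = static_cast<char>(c);   // remembered across responses
            }
        }
    }
};
```

A readidle() built on the same pieces would run this loop only while the port reports data available, and would discard each word instead of returning it.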
While there are other details within the software interface, such as a means of creating a log file that can be used to find interface errors, or a means of querying whether or not an interrupt has taken place, we’ll gloss over these in favor of simplifying our description today. Feel free to browse the software and see how they work.
There’s another piece to our software which isn’t shown in Fig 2. This portion defines names, and then provides a register name to address translation. It starts by defining constant values for all of our register addresses.
This will make it easy to say within your software something like readio(R_VERSION), as an example.
To facilitate the command line register usage of tools like wbregs, we create a name-to-constant mapping function. This is nothing more than a table of (register address, register name) pairs.
A short routine also decodes the register names and turns them into addresses, which can then be used with readio() or writeio().
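A sketch of what such a table and lookup might look like follows. The addresses are placeholders I've made up for illustration, as are R_MEM, the RegName structure, and the lookup_address() name. The real values come from the design's own address map.

```cpp
#include <cctype>
#include <cstdint>
#include <cstdlib>

// Placeholder addresses, for illustration only.
const std::uint32_t R_VERSION = 0x2040;
const std::uint32_t R_MEM     = 0x4000;

struct RegName { std::uint32_t addr; const char *name; };
const RegName regnames[] = {
    { R_VERSION, "VERSION" },
    { R_MEM,     "MEM"     },
};

// Case-insensitive string comparison.
static bool same_name(const char *a, const char *b) {
    while (*a && *b) {
        if (std::toupper(static_cast<unsigned char>(*a)) !=
            std::toupper(static_cast<unsigned char>(*b)))
            return false;
        a++; b++;
    }
    return *a == *b;
}

// Turns a register name, or failing that a numeric string, into an address.
bool lookup_address(const char *text, std::uint32_t &addr) {
    for (const RegName &r : regnames) {
        if (same_name(text, r.name)) { addr = r.addr; return true; }
    }
    char *end;
    unsigned long v = std::strtoul(text, &end, 0);  // accepts 0x..., decimal, octal
    if (end == text || *end != '\0')
        return false;
    addr = static_cast<std::uint32_t>(v);
    return true;
}
```

Falling back to strtoul() is what lets a command line tool accept either a name or a raw address in the same argument.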
From a software standpoint, this approach to register naming is very important. Because registers are given names, whether the C++ name R_VERSION or more human readable and case insensitive name “VERSION”, any software using these names doesn’t need to be changed from one design to the next, nor when the addresses change. As examples, I can use the same software to control the ZipCPU, whether to load a program into memory or debug a running program, on one board as I can on other boards. Any changed addresses are taken care of by recompiling the software for the new board.
Testing it all out
If you are just joining the discussion at this point, you’ll want to have Verilator, g++, gtkwave, and git installed to test the interface out yourself. Although many of these are Linux programs (I use Ubuntu myself), I have instructions available for doing this on a Windows platform using Cygwin. Those instructions are available to anyone willing to test them: just send me an e-mail asking for them, and promise to tell me how well they do (or don’t) work for you. Once these utilities are installed, you should be able to download and build the Verilator simulation of the debugging bus we’ve been working on.
Cloning and building the project should be quite straightforward:
Now that this is built, there are two steps to interacting with the simulation. The first step is to run the simulation executable.
Use a little caution when running this. If the trace file generation is turned on (look for opentrace(“trace.vcd”) in testbus_tb.cpp), it may quickly write a VERY LARGE file to your computer. That file will be called trace.vcd, and it will be written to the directory the testbus_tb program is run from. If this is a problem, feel free to comment out the trace generation line and run without generating a trace.
The second step, which you may wish to do in another terminal window, is to run the wbregs program to interact with this simulation. I like to start any testing session by just proving that I can read the internal version number from the simulation:
wbregs should return an 8-digit hexadecimal number looking like a date. The current date within the repository is 0x20170622, but I may change that later to indicate changes to the repository.
We also placed a counter internal to the simulation. Using this counter, we can query how many ticks have passed since the simulation started.
You should get two different answers, and the number should increase between the two.
We can also check the status of the onboard wishbone scope, by reading from its control register.
Look at the top nibble of the return word in response. It’s a ‘1’. That means that the scope now has enough samples to fill its memory, but that it has yet to be triggered. We’ll come back to this in a moment.
Let’s turn our attention to the block RAM memory. Using wbregs we can write to the first location in memory:
While we can read from and write to other locations in memory as well, only the first memory location has a name, MEM, associated with it. To access other locations, you will need to give wbregs the address in numeric (strtoul) format.
Recall, though, that we built our design in such a way that any read or write command to memory would trigger the scope.
Let’s see if it got triggered:
Here, the scope returns a response having 7 in its high order nibble. This means that the scope has not only been triggered, but it’s also stopped recording. At this point, in your software code, you could issue a:
This would read the state from the scope control register. Specifically, though, this would examine the scope control register to find out how much internal RAM the scope has been built with. Once determined, a buffer can be built to read from the scope, and the read command can be issued. All of this is simplified by the scope helper class, but we’ll save that for another lesson for another day.
What can you use this for
This ends our series on how to build a bus that you can use for debugging an FPGA. While it’s taken a while to get here, interacting with an FPGA in this manner can be particularly valuable. While we’ve focused on the use case of being able to get scope information out of the design, many other use cases exist.
Reading/writing video memory within the FPGA, such as reading the results from a camera
Reading, erasing, and programming the QSPI flash memory within the FPGA
Setting up a fallback design, using the internal configuration access port, or even switching FPGA configurations without using the official JTAG port.
Grabbing data from a GPS receiver
Controlling an OLEDrgb, or even a 2-line LCD, to make sure your controller works before trying to use a CPU to run it.
Setting up the memory, either block RAM or SDRAM, within your design so that it can be processed later.
Indeed, the possibilities are so numerous, it’s hard to list them all here.
There is one thing, though, that this interface lacks: speed. Speed can be achieved by packing more than four bits into each character sent, and by compressing the data crossing the interface. Both of these capabilities are part of the WBUBUS I normally use. You are more than welcome to use that interface if you would like, subject only to the conditions of the GPL.
That’s why I intend to use this design, and specifically the devbus interface it implements, as a basis for moving forward with future articles on this blog.
Thoughts or comments? Please feel free to share them below.
For which of you, intending to build a tower, sitteth not down first, and counteth the cost, whether he have sufficient to finish it? (Luke 14:28)