Finishing off the debugging bus: building a software interface

Today’s post marks the end of a journey discussing what it takes to build a very basic debugging bus, one that gives you access to what’s going on inside your FPGA.

Looking at where we came from, we started by outlining what a bus like this might look like. We used Fig 1 to illustrate this.

Fig 1: System Diagram

We then walked through the steps necessary to build such a bus. These steps started with a discussion of what was necessary to build a simple wishbone bus master.

Then we stepped back and looked at how to initially create bus command words from the serial port bytes sent to it.

On the other side of the interface, we showed how to turn the bus response words back into bytes.

This formed the basis of the interface. To this, we added interrupts, so we can tell if something has happened within the device, as well as bus idle indications to give us some assurance that we are connected to the right bus.

We then put the pieces of the bus together into a full blown debugging bus component core.

The debugging bus needed a wishbone bus to command, so we outlined how to build a basic wishbone interconnect that this bus could be connected to. Together, this formed a fully functional Verilog design. The last post then discussed how to build a fully functional Verilator simulation of our new design.

We even tested the design and got some confidence that it worked.

Today, we’re going to build a software interface that can be used to connect to this debugging bus component.

Software Interface Requirements

Some time ago we discussed the interface we wanted to use to access this bus. Here, we’ll discuss some of the components that make this interface work.

The basic software components we are going to use are shown in Fig 2. In general, this post will focus on the HEXBUS software interface component which is specific to this interface standard. This component, though, fits in place between some other components, and it’s important to understand how these parts and pieces are put together.

Fig 2: Software Components

We’ve already built the FPGA dbgbus interface. Further, bridging such an interface from a serial port onto a TCP/IP stream isn’t that hard–although we may come back and describe it in full later.

The next step up in the stack is a generic low level O/S interface component called LLCOMMS. This component provides read() and write() wrappers over the O/S system calls, further abstracting the underlying transport the interface runs over. In particular, the generic interface hides whether the interface is running over an O/S file descriptor (such as from /dev/ttyUSB0) or a network port (such as localhost:<port>). Both interfaces are found within the LLCOMMS implementation.

This gives us a bit of immunity against changes in the underlying transport when moving from one device to another. For example, the XuLA2 board has a unique USB communication design. Inheriting from the LLCOMMS interface, and replacing the critical portions with a USB interface component will allow us to continue using these dbgbus components, despite the transport having changed to support that unique interface.
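To make the idea of a swappable transport concrete, here is a minimal sketch of what such an abstraction might look like. The Transport and LoopbackTransport names, and the exact method signatures, are assumptions made for illustration. They are not copied from the LLCOMMS source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a low-level comms interface.  The names here
// are illustrative and are not taken from the LLCOMMS source.
class Transport {
public:
	virtual ~Transport() {}
	// Send len bytes; returns the number of bytes accepted
	virtual int write(const char *buf, size_t len) = 0;
	// Receive up to len bytes; returns the number of bytes copied
	virtual int read(char *buf, size_t len) = 0;
	// True if at least one byte is waiting to be read
	virtual bool available() = 0;
};

// A loopback transport: whatever is written can be read back.  A serial
// port, TCP socket, or USB transport would override the same three methods.
class LoopbackTransport : public Transport {
	std::string m_fifo;
public:
	int write(const char *buf, size_t len) override {
		assert(buf != nullptr);
		m_fifo.append(buf, len);
		return (int)len;
	}
	int read(char *buf, size_t len) override {
		assert(buf != nullptr);
		size_t n = (len < m_fifo.size()) ? len : m_fifo.size();
		std::memcpy(buf, m_fifo.data(), n);
		m_fifo.erase(0, n);
		return (int)n;
	}
	bool available() override { return !m_fifo.empty(); }
};
```

Code written against a Transport pointer never needs to know which transport sits underneath, which is the property that lets a board-specific USB layer be dropped in later.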

On the other side of the HEXBUS interface is the wbregs command line program. This is a basic peek/poke type program which we’ll use here as a demonstration program to show how to use the interface. As a result, the wbregs program doesn’t really exercise the full limits of the debugging bus capability. Still, it makes for a nice example program that can be used to illustrate the bus.

It’s between these various other components that we’ll build our HEXBUS software interface.

The HEXBUS software interface

It’s now time to get into the HEXBUS software interface. As you may remember, this interface needs to match the generic devbus.h, so you might wish to follow along in that file for reference.

With a little simplification, the readio(), readz(), and readi() requests can be mapped into a single internal readv() method. This method handles reading a variable number of values from the interface, given an address and whether or not the address will increment between values. A similar simplification can be applied to the writeio(), writez(), and writei() interface methods. Hence, we’ll spend our time discussing how readv() and writev() work.
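As a rough sketch of that mapping, the three read methods can be thin wrappers around one readv(). The MockBus class below is hypothetical: its readv() reads from a std::map standing in for the FPGA, and the argument order is an assumption rather than the signature found in the HEXBUS source.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Illustrative mock, not the HEXBUS class: readv() here reads from a
// std::map standing in for the FPGA's address space.
class MockBus {
public:
	typedef uint32_t BUSW;
	std::map<BUSW, BUSW> m_mem;

	// Read len words starting at address a.  If inc is nonzero, the
	// address advances by four bytes (one word) after every read.
	void readv(const BUSW a, const int inc, const int len, BUSW *buf) {
		assert(len >= 0 && buf != nullptr);
		BUSW addr = a;
		for (int i = 0; i < len; i++) {
			buf[i] = m_mem[addr];
			if (inc)
				addr += 4;
		}
	}

	// Single word read
	BUSW readio(const BUSW a) {
		BUSW v;
		readv(a, 0, 1, &v);
		return v;
	}
	// Read len words from consecutive addresses
	void readi(const BUSW a, const int len, BUSW *buf) {
		readv(a, 1, len, buf);
	}
	// Read len words, all from the same address
	void readz(const BUSW a, const int len, BUSW *buf) {
		readv(a, 0, len, buf);
	}
};
```

The write side can be folded into a single writev() in the same way.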

The first step to either readv() or writev() is to build a command that first sends the address to the component. This is the purpose of the encode_address() method. Because our interface is as simple as it is, encoding the address is as simple as a sprintf() call, writing “A” and then the hexadecimal address to the buffer.

char	*HEXBUS::encode_address(const HEXBUS::BUSW a) {
	// We'll write the address onto an internal command buffer, and then
	// track a pointer to the end of the buffer
	char *ptr = m_buf;
	*ptr++ = HEXB_ADDR;	// Place an "A" at the beginning of our buffer
	sprintf(ptr, "%x", a);	// encode the value in hex
	ptr += strlen(ptr);	// Adjust ptr to point to the end of the addr
	return ptr;		// cmd, and return ptr
}

encode_address() returns a pointer to the end of the buffer, so that the read or write command may be appended to it.

These are the basics of the encode_address() method. Some additional adjustments may also be found in the HEXBUS interface source, so that we can avoid setting an address twice if we don’t need to.
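One way to skip a redundant address command is to remember the last address sent. The sketch below is an assumption about how that might be done, not the code from the HEXBUS source. In particular, it ignores the address changes caused by auto-incrementing reads and writes, which a real implementation would also need to track.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>

// Hypothetical address encoder.  The member names loosely echo the listing
// above, but this is not the HEXBUS source.
struct AddrEncoder {
	char     m_buf[64];
	bool     m_addr_set = false;
	uint32_t m_lastaddr = 0;

	// Writes "A<hex>" into m_buf unless the device is already known to
	// hold this address.  Returns a pointer to the end of the command.
	char *encode_address(uint32_t a) {
		char *ptr = m_buf;
		*ptr = '\0';
		if (m_addr_set && a == m_lastaddr)
			return ptr;	// Nothing needs to be sent
		int n = std::snprintf(ptr, sizeof(m_buf), "A%x", (unsigned)a);
		assert(n > 0 && (size_t)n < sizeof(m_buf));
		m_addr_set = true;
		m_lastaddr = a;
		return ptr + n;
	}
};
```

Calling encode_address(0x2040) twice in a row leaves "A2040" in the buffer the first time and an empty string the second time.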

Both readv() and writev() call encode_address() as their basic first step:

// encode_address stores its results in a local buffer, m_buf
// Remember the two LSB's are command indications that are not part of the
// address, and that setting the LSB will keep the address from
// incrementing between bus operations.  Hence, we examine whether or not
// we are incrementing, and adjust accordingly
ptr = encode_address(a | ((inc)?0:1));

After this, readv() and writev() diverge. We’ll focus on readv() first, and come back to writev() later.

Within readv(), the command to read a value from the interface is a simple “R”, terminated with a newline. We’ll append this request to the address command, and pass it to our lower level communications interface.

// encode_address stores its results in a local buffer, m_buf
ptr = encode_address(a | ((inc)?0:1));
while(nread < len) {
	*ptr++ = HEXB_READ;
	*ptr++ = '\n';
	*ptr   = '\0';

	// Write this buffer to the lower level comms port
	m_dev->write(m_buf, (ptr-m_buf));

	// Read a word from the interface
	buf[nread++] = readword();

	// Clear the command buffer so we can start over
	ptr = m_buf;
}

The loop operates over all of the requested words. Within the loop, readv() calls readword() to read individual words from the interface, and write them into the given buffer. This is then repeated until all of the words have been read.
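To see what actually goes out over the wire, the following sketch collects the text that a loop like this would send for a multi-word read. The read_requests() helper is hypothetical and exists only for illustration. It sends the address unconditionally, without the skip-if-already-set optimization.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Returns the text that a readv()-style loop would send for a len-word read.
std::vector<std::string> read_requests(uint32_t a, bool inc, int len) {
	assert(len >= 0);
	std::vector<std::string> out;
	char addr[16];
	// Setting the LSB asks the hardware not to increment the address
	std::snprintf(addr, sizeof(addr), "A%x", (unsigned)(a | (inc ? 0 : 1)));
	std::string cmd = addr;
	for (int i = 0; i < len; i++) {
		cmd += "R\n";
		out.push_back(cmd);
		cmd.clear();	// Only the first request carries the address
	}
	return out;
}
```

For a three-word incrementing read from 0x4000, the first request is "A4000R\n" and the remaining two are just "R\n".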

The writev() code, on the other hand, sends a “W” followed by the hexadecimal value which we wish to write. That request is completed with a newline.

int	nw = 0;

while(nw < len) {
	// Append the write command
	*ptr++ = 'W';
	*ptr = '\0';
	// If the value isn't zero, append it too.  End the command with
	// a newline
	// (The hardware assumes a value of zero if it isn't given.)
	if (buf[nw] != 0) {
		sprintf(ptr, "%x\n", buf[nw]);
		ptr += strlen(ptr);
	} else {
		*ptr++ = '\n';
		*ptr = '\0';
	}

	m_dev->write(m_buf, ptr-m_buf);

	readidle();

	nw++;
	ptr = m_buf;
}

Once the actual write command is sent to the lower level, we call readidle() to read (and ignore) any acknowledgements. This keeps the interface synchronous–which is especially important given that there are no FIFO’s yet built within it.
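The write command text itself is easy to illustrate in isolation. The write_command() helper below is a hypothetical stand-alone version of the formatting step within the loop above, including the shortcut for a zero value.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Builds the text of one write command: "W", the value in lower case hex
// (omitted entirely when the value is zero), and then a newline.
std::string write_command(uint32_t value) {
	std::string cmd = "W";
	if (value != 0) {
		char hex[16];
		int n = std::snprintf(hex, sizeof(hex), "%x", (unsigned)value);
		assert(n > 0 && n < (int)sizeof(hex));
		cmd += hex;
	}
	cmd += '\n';
	return cmd;
}
```

Writing 0xdeadbeef therefore sends "Wdeadbeef\n", while writing zero sends only "W\n".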

We now turn our attention to readword() and readidle(). The two are quite similar in structure. We’ll start by looking at how readword() is composed.

Since multiple words can be sent across the interface using the same command word, we need a state variable to capture and remember the last command word. We’ll use m_cmd for that purpose. It is retained from response to response, so that it need not be repeated.

int		nr;
unsigned	word, result;
bool		done = false;

// Initialize our response register to zero
word = 0;
do {
	// Read a character from the interface, block if necessary
	do {
		nr = lclreadcode(&m_buf[0], 1);
	} while (nr < 1);

	// If the character is a lower case hexadecimal digit, shift our
	// word by four bits and set the lower four bits with this
	// value.
	if (isdigit(m_buf[0]))
		word = (word << 4) | (m_buf[0] & 0x0f);
	else if ((m_buf[0] >= 'a')&&(m_buf[0] <= 'f'))
		word = (word << 4) | ((m_buf[0] - 'a' + 10)&0x0f);
	else {
		// Otherwise, the character we've received is an out-of-band
		// character.  Such characters always end the current response.
		// They can also be used to start a new response.  Here, we'll
		// focus on the last command first.  That's what's in m_cmd,
		// and also what's in our "word" buffer
		if (m_cmd == HEXB_READ) {
			// On any read value, update the address pointer
			if (m_inc)
				m_lastaddr += 4;

			// We've found the word we need, so we are done
			done = true;

			// Copy the result into ... result
			result = word;
		} else if (m_cmd == HEXB_ACK) {
			// On a write acknowledgement (might be left over
			// unread from a previously unfinished transaction?)
			// adjust the address pointer, and increment the number
			// of acknowledgements received.

			if (m_inc)
				m_lastaddr += 4;
			m_nacks++;
		} else if (m_cmd == HEXB_INT) {
			// On an interrupt notification, just set the interrupt
			// flag.
			m_interrupt_flag = true;
		} else if (m_cmd == HEXB_ERR) {
			// On a bus error, set the bus error flag, and
			// throw a bus error in case anything wishes to catch
			// and process it.

			m_bus_err = true;
			throw BUSERR(m_lastaddr);
		} else if (m_cmd == HEXB_IDLE) {
			// If we get multiple idles while waiting for a
			// response, then our response has been lost.
			// Throw a bus error

			abort_countdown--;
			if (0 == abort_countdown)
				throw BUSERR(0);
		} else if (m_cmd == HEXB_ADDR) {
			// On an address response from the debugging bus,
			// set the address tracking reads to the value
			// returned, as well as the m_inc to whether or not
			// this indicates an incrementing address.
			m_addr_set  = true;
			m_inc       = (word & 1) ? 0:1;
			m_lastaddr  = word & -4;
		} else if (m_cmd == HEXB_RESET) {
			// On any bus reset, any address that was set is now
			// invalid.  Mark it so.
			m_addr_set = false;
		}

		// Any out of band character other than a newline is a
		// new command that we start
		if (!isspace(m_buf[0]))
			m_cmd = m_buf[0];

		// Clear the register so we can receive the next word	
		word = 0;
	}
} while(!done);

// Return any recovered/read result
return result;

readword() reads from the port one character at a time, by calling lclreadcode(). lclreadcode() reads a single byte from the interface, tossing out any device-not-ready bytes (0x7f or 0xff). After that, readword() is very similar to our bytes to words component within the HEXBUS RTL design. Any command word (“R”, “W”, “A”, etc) coming in is stored. After that, words are built from assembling the hexadecimal values together. As soon as a non-hexadecimal character is received, the word received is complete.
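The word-assembly step can be tried out separately from the rest of the class. The decode() function below is a simplified, hypothetical sketch of it, which turns a response string into (command, value) pairs. Unlike the listing above, it assumes that whitespace clears the pending command, so that each response is reported exactly once.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<char, uint32_t> Response;

// Splits a response stream into (command character, value) pairs
std::vector<Response> decode(const std::string &stream) {
	std::vector<Response> out;
	char     cmd  = 0;
	uint32_t word = 0;
	for (char ch : stream) {
		if (ch >= '0' && ch <= '9')
			word = (word << 4) | (uint32_t)(ch - '0');
		else if (ch >= 'a' && ch <= 'f')
			word = (word << 4) | (uint32_t)(ch - 'a' + 10);
		else {
			// Out-of-band character: it ends the pending response
			if (cmd != 0)
				out.push_back(Response(cmd, word));
			// Assumption: whitespace clears the pending command,
			// anything else starts a new one
			cmd = std::isspace((unsigned char)ch) ? 0 : ch;
			word = 0;
		}
	}
	return out;
}
```

Feeding it "A2040\nRdeadbeef\n" yields an 'A' response carrying 0x2040, followed by an 'R' response carrying 0xdeadbeef.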

The big difference between this logic and the logic of readidle() below is that readidle() only loops while data is available to be read, and it doesn’t return on any words read–since readidle() is called after writing (not reading) to the interface.

// Start by clearing the register
word = 0;

// Repeat as long as there are values to be read
while(m_dev->available()) {
	// Read one character from the interface
	nr = lclreadcode(&m_buf[0], 1);

	// If it's a hexadecimal digit, adjust our word register
	if (isdigit(m_buf[0]))
		word = (word << 4) | (m_buf[0] & 0x0f);
	else if ((m_buf[0] >= 'a')&&(m_buf[0] <= 'f'))
		word = (word << 4) | ((m_buf[0] - 'a' + 10)&0x0f);
	else {
		// Any thing else identifies the beginning (or end)
		// of a response word.  Deal with it based upon the
		// last response m_cmd received.
		if (m_cmd == HEXB_ADDR) {
			// Received an address word
			m_addr_set  = true;
			m_inc       = (word & 1) ? 0:1;
			m_lastaddr  = word & -4;
		} else if (m_cmd == HEXB_READ) {
			// Read data ... doesn't make sense in this
			// context, so we'll just ignore it
			if (m_inc)
				m_lastaddr += 4;
		} else if (m_cmd == HEXB_INT) {
			// On an interrupt, just set the flag to note
			// we've received one.
			m_interrupt_flag = true;
		} else if (m_cmd == HEXB_ACK) {
			// Write acknowledgement.  writev() will check
			// whether the correct number of
			// acknowledgments has been received before
			// moving on.  Read and note it here.
			if (m_inc)
				m_lastaddr += 4;
			m_nacks++;
		} else if (m_cmd == HEXB_ERR) {
			// On an err, throw a BUSERR exception
			m_bus_err = true;
			throw BUSERR(m_lastaddr);
		} else if (m_cmd == HEXB_RESET) {
			// On any reset, clear the address set flag
			// and any unacknowledged bus error condition
			m_addr_set = false;
			m_bus_err = false;
		}

		// Any out of band character other than a whitespace
		// is a new command starting--keep track of which
		// command it is.
		if (!isspace(m_buf[0]))
			m_cmd = m_buf[0];
		word = 0;
	}
}

While there are other details within the software interface, such as a means of creating a log file that can be used to find interface errors, or a means of querying whether or not an interrupt has taken place, we’ll gloss over these in favor of simplifying our description today. Feel free to browse the software and see how they work.

Register Naming

There’s another piece to our software which isn’t shown in Fig 2. This portion defines names, and then provides a register name to address translation. It starts by defining constant values for all of our register addresses.

#define	R_VERSION       0x00002040
#define	R_SOMETHING	0x00002044
#define	R_BUSERR       	0x00002048
#define	R_PWRCOUNT	0x0000204c
#define	R_INT		0x00002050
#define	R_HALT		0x00002054

#define	R_SCOPE		0x00002080
#define	R_SCOPD		0x00002084

#define	R_MEM		0x00004000

This will make it easy to say within your software something like,

variable = m_fpga->readio(R_VERSION);

as an example.

To facilitate the command line register usage of tools like wbregs, we create a user name to constant mapping function. This is nothing more than a table of register address, register name pairs, as is shown below:

const	REGNAME	raw_bregs[] = {
	{ R_VERSION       ,	"VERSION" 	},
	{ R_BUSERR        ,	"BUSERR"  	},
	{ R_PWRCOUNT      ,	"PWRCOUNT"	},
	{ R_INT		  ,	"int"		},
	{ R_HALT	  ,	"halt"		},
	{ R_SCOPE         ,	"SCOPE"   	},
	{ R_SCOPD         ,	"SCOPD"   	},
	{ R_MEM           ,	"RAM"     	}
};

A short routine also decodes the register names and turns them into addresses, which can then be used with readio() or writeio().
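A decoding routine of this kind might look something like the sketch below. The RegEntry structure and the lookup() and same_name() helpers are hypothetical names chosen for illustration, and only a few entries of the table are repeated. Anything that isn’t a known name is handed to strtoul().

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <cstdlib>

struct RegEntry { uint32_t addr; const char *name; };

// A few entries copied from the table above
static const RegEntry regs[] = {
	{ 0x00002040, "VERSION"  },
	{ 0x0000204c, "PWRCOUNT" },
	{ 0x00002050, "int"      },
	{ 0x00002080, "SCOPE"    }
};

// Case-insensitive string equality
static bool same_name(const char *a, const char *b) {
	while (*a && *b) {
		if (std::tolower((unsigned char)*a)
				!= std::tolower((unsigned char)*b))
			return false;
		a++; b++;
	}
	return *a == *b;
}

// Returns true and sets *addr if text is a known name or a valid number
bool lookup(const char *text, uint32_t *addr) {
	assert(text != nullptr && addr != nullptr);
	for (const RegEntry &r : regs) {
		if (same_name(text, r.name)) {
			*addr = r.addr;
			return true;
		}
	}
	// Not a name, so try a number.  Base 0 accepts decimal, 0x-prefixed
	// hexadecimal, and leading-zero octal.
	char *end;
	unsigned long v = std::strtoul(text, &end, 0);
	if (end == text || *end != '\0')
		return false;
	*addr = (uint32_t)v;
	return true;
}
```

With this, "version", "Version", and "0x2040" all resolve to the same address.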

From a software standpoint, this approach to register naming is very important. Because registers are given names, whether the C++ name R_VERSION or the more human-readable and case-insensitive name “VERSION”, any software using these names doesn’t need to be changed from one design to the next, nor when the addresses change. As an example, I can use the same software to control the ZipCPU, whether to load a program into memory or to debug a running program, on one board as on any other. Any changed addresses are taken care of by recompiling the software for the new board.

Testing it all out

If you are just joining the discussion at this point, you’ll want to have Verilator, g++, gtkwave, and git installed to test the interface out yourself. Although many of these are Linux programs (I use Ubuntu myself), I have instructions available for doing this on a Windows platform using Cygwin. (Those instructions are available to anyone who wants to test them and let me know how well they do (or don’t) work–just send me an e-mail asking for them, and promising to tell me if they work for you.) Then, once these utilities are installed, you should be able to just download and build the Verilator simulation of the debugging bus we’ve been working on.

Cloning and building the project should be quite straightforward:

cd ~/your/chosen/project/path
git clone https://github.com/ZipCPU/dbgbus
cd dbgbus
make

Now that this is built, there are two steps to interacting with the simulation. The first step is to run the simulation executable.

cd bench/cpp
./testbus_tb

Have a little caution when running this. If the trace file generation is turned on (look for opentrace("trace.vcd") in testbus_tb.cpp), it may quickly write a VERY LARGE file to your computer. That file will be called trace.vcd, and it will be written in the same directory as the testbus_tb program is called from. If this is a problem, feel free to comment out the trace generation line and run without generating a trace.

The second step, which you may wish to do in another terminal window, is to run the wbregs program to interact with this simulation. I like to start any testing session by just proving that I can read the internal version number from the simulation:

cd ~/your/chosen/project/path/dbgbus/sw
./wbregs VERSION

wbregs should return an 8-digit hexadecimal number looking like a date. The current date within the repository is 0x20170622, but I may change that later to indicate changes to the repository.

We also placed a counter internal to the simulation. Using this counter, we can query how many ticks have passed since the simulation started.

./wbregs PWRCOUNT

And again.

./wbregs PWRCOUNT

You should get two different answers, and the number should increase between the two.

We can also check the status of the onboard wishbone scope, by reading from its control register.

./wbregs SCOPE

Look at the top nibble of the return word in response. It’s a ‘1’. That means that the scope now has enough samples to fill its memory, but that it has yet to be triggered. We’ll come back to this in a moment.

Let’s turn our attention to the block RAM memory. Using wbregs we can write to the first location in memory:

./wbregs MEM 0xdeadbeef

While we can read from and write to other locations in memory as well,

./wbregs 0x4008 0xdeadbeef

only the first memory location has a name, MEM, associated with it. To access other locations, you will need to give wbregs the address in numeric (strtoul) format.

If you recall, though, we built our design in such a way that any read or write command to memory would trigger the scope.

Let’s see if it got triggered:

./wbregs SCOPE

Here, the scope returns a response having 7 in its high order nibble. This means that the scope has not only been triggered, but it’s also stopped recording. At this point, in your software code, you could issue a:

ctrl_register = m_fpga->readio(R_SCOPE);
scopelen = (1<<((ctrl_register >> 20)&0x1f));
buffer = new uint32_t[scopelen];

m_fpga->readz(R_SCOPD, scopelen, buffer);

This would read the state from the scope control register. Specifically, though, this would examine the scope control register to find out how much internal RAM the scope has been built with. Once determined, a buffer can be built to read from the scope, and the read command can be issued. All of this is simplified by the scope helper class, but we’ll save that for another lesson for another day.
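The length calculation is worth a quick worked example. The control register value used here, 0x70a00000, is made up for illustration. Its top nibble is 7, as in the response above, and bits 24 down to 20 hold the value 10.

```cpp
#include <cassert>
#include <cstdint>

// Extracts the log (base two) of the scope's memory size from bits 24:20
// of the control word, and converts it into a number of samples.
uint32_t scope_length(uint32_t ctrl_register) {
	unsigned lgmem = (ctrl_register >> 20) & 0x1f;
	assert(lgmem < 32);
	return (uint32_t)1 << lgmem;
}
```

For 0x70a00000, shifting right by 20 gives 0x70a, masking with 0x1f leaves 10, and so the scope would hold 1 << 10 = 1024 samples.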

What can you use this for

This ends our series on how to build a bus that you can use for debugging an FPGA. While it’s taken a while to get here, interacting with an FPGA in this manner can be particularly valuable. While we’ve focused on the use case of being able to get scope information out of the design, many other use cases exist.

Reading/writing video memory within the FPGA, such as reading the results from a camera
Reading, erasing, and programming the QSPI flash memory within the FPGA
Setting up a fallback design, using the internal configuration access port, or even switching FPGA configurations without using the official JTAG port.
Grabbing data from a GPS receiver
Controlling an OLEDrgb, or even a 2-line LCD, to make sure your controller works before trying to use a CPU to run it.
Setting up the memory, either block RAM or SDRAM, within your design so that it can be processed later.

Indeed, the possibilities are so numerous, it’s hard to list them all here.

There is one thing, though, that this interface lacks: speed. Speed can be achieved by packing more than four bits into each character sent, and by compressing this interface. Both of these capabilities are part of the WBUBUS I normally use. You are more than welcome to use this interface if you would like, subject only to the conditions of the GPL.

That’s why I intend to use this design, and specifically the devbus interface it implements, as a basis for moving forward with future articles on this blog.

Thoughts or comments? Please feel free to share them below.

Thanks!