Want to build a scope? Seriously! Want to turn your FPGA into a scope that can measure anything internal to your logic, and then make that information available to you upon request? Even better, you could use this scope to capture samples from an external analog to digital converter if you wanted to.

It’s time to build the Verilog to do it!

Fig 1: WB-UART Overview
Block Diagram of a Simpler Wishbone to UART converter

Ok, here’s where we are at in this process: We now have all of the components necessary to build a debugging bus interface. We’ve now built all of the components outlined in Fig 1 to the right.

Today, we’re going to build something that this interface can interact with. When we’re done, the result won’t (yet) look like something a professional tool might produce, but it’ll be enough for a demonstration. We’ll save the professional looking part for another lesson.

Our Components

If we want to build a bus that connects things together, the first step is going to be collecting the components together that we want to interface with. So, I’ve pulled several components from other designs that I have. I’ve also adjusted their copyright so that, as part of this project, you can have access to these components under the LGPL. These components include:

  1. A block RAM Wishbone memory slave drawn from the zbasic ZipCPU repository.

  2. A Wishbone scope, drawn from the wbscope repository.

  3. A simplified UART transmitter and receiver, drawn from the wbuart32 repository.

As a legal note, the RTL designs these examples were taken from still remain firmly under the GPL; only these specific component files have been given a new copyright.

Fig 2: System Diagram
An example WB interconnect

Of course, we’ll also be using the hexbus debugging interface we’ve just developed and presented as part of this blog.

The basic design is going to be similar to Fig 1, but with the new components added in as part of the bus, as shown in Fig 2.

Of these new components, the only one we haven’t yet discussed is a simple ad hoc component, which we’ll build below. You can use it to get access to arbitrary values within your design, and it serves as a small example of the kinds of things that can be done.

Connecting the components

There are three parts to connecting components to a Wishbone bus. First, you must decode the components’ address, so that only the proper component is addressed. Second, you must merge the three basic Wishbone slave outputs back into one response to be returned to the master. These include the _ack line, the _stall line, and the return _data line. Third, although not required, painful prior errors have taught me to always create a wb_err return line, and to set that any time a non-existent component is addressed. Further, along those same lines, we’re going to make sure that no components occupy the NULL address.

Wiring up the bus master

Prior to the first step, though, we’ll need to wire up our UART receiver, our debugging bus Wishbone master, and our UART transmitter.

rxuartlite #(UARTSETUP) rxtransport(i_clk,
				i_uart, rx_stb, rx_data);

hbbus	genbus(i_clk,
	// The receive transport wires
	rx_stb, rx_data,
	// The bus control output wires
	wb_cyc, wb_stb, wb_we, wb_addr, wb_odata, wb_sel,
	//	The return bus wires
	  wb_ack, wb_stall, wb_err, wb_idata,
	// An interrupt line
	bus_interrupt,
	// The return transport wires
	tx_stb, tx_data, tx_busy);

txuartlite #(UARTSETUP) txtransport(i_clk,
				tx_stb, tx_data, o_uart, tx_busy);

This is primarily an exercise in wire management: the outputs of the receiver go into the hexbus, and the outputs from the hexbus decoder go into the transmitter and control the Wishbone bus. Still, one particular parameter needs some attention: the UARTSETUP. This parameter is defined within the wbuart32 project. If we want to communicate using 8 data bits, no parity, and one stop bit (8N1), then we can set this parameter simply to the number of clocks per baud. Hence, if we want to run our interface at 4MBaud with a 100MHz clock, we should set this to (100MHz/4MBaud) or 25.
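As a minimal sketch of that arithmetic, the divisor can be computed at elaboration time instead of being typed in by hand. The module name and the CLOCK_RATE_HZ and BAUD_RATE names below are made up for illustration; only UARTSETUP comes from the design above.

```verilog
// Hypothetical example module: computes the clocks-per-baud value
// for an 8N1 setup from the clock rate and the desired baud rate.
module uartsetup_example;
	localparam	CLOCK_RATE_HZ = 100_000_000;	// 100MHz system clock
	localparam	BAUD_RATE     =   4_000_000;	// 4MBaud serial rate

	// Integer division: the number of system clocks in one baud interval
	localparam	UARTSETUP = CLOCK_RATE_HZ / BAUD_RATE;

	initial	$display("UARTSETUP = %0d", UARTSETUP);
endmodule
```

With these two values the division is exact and yields 25. If the clock rate isn’t an integer multiple of the baud rate, the integer division truncates, and the actual baud rate will be slightly off from the one requested.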

The address select lines

Next, let’s handle our address select lines. We’ll support three basic components, and we’ll use the prefixes of smpl_ (for our ad-hoc registers), mem_ (for our block RAM) and scop_ to describe them. Handling address selection is done in two parts. For the first part, we just test whether or not the Wishbone address matches the address we’ve given to this component:

// Nothing should be assigned to the null page
assign	smpl_sel = (wb_addr[29:4] == 26'h081);	// 0x00002040
assign	scop_sel = (wb_addr[29:4] == 26'h082);	// 0x00002080
assign	mem_sel  = (wb_addr[29:12] ==18'h1);	// 0x00004000

One caution is in order: our bus address lines reference 32-bit words, not octets. Address 0x810 above references one 32-bit word, while address 0x811 references another 32-bit word. Most people are more familiar with accessing a bus where the address is in units of octets. For this reason, we’ve written out the octet equivalent of each address in the comment to the right. This equivalent is given by shifting the address up by two, as well as by the number of unspecified bits (4 or 12) in the address.
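The conversion from decoded prefix to octet address can be written out as a concatenation, assuming the three decodes above. The module and localparam names here are hypothetical; they exist only to show the arithmetic.

```verilog
// Hypothetical example module: rebuilds the octet (byte) base address
// of each component from the prefix used in its address decoder.
module octet_addr_example;
	// { decoded prefix, unspecified word-address bits, two octet bits }
	localparam [31:0]	SMPL_BASE = { 26'h081,  4'h0, 2'b00 };
	localparam [31:0]	SCOP_BASE = { 26'h082,  4'h0, 2'b00 };
	localparam [31:0]	MEM_BASE  = { 18'h1,   12'h0, 2'b00 };

	initial begin
		$display("smpl base: 0x%h", SMPL_BASE);
		$display("scop base: 0x%h", SCOP_BASE);
		$display("mem  base: 0x%h", MEM_BASE);
	end
endmodule
```

Each concatenation is 32 bits wide: the prefix, then zeros for the word-address bits the decoder ignores, then two zero bits to turn a word address into an octet address. The results are the 0x00002040, 0x00002080, and 0x00004000 values given in the comments above.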

This is actually a good time to point out that there’s really a lot more work to be done to do address assignment properly than just these simple decode lines above. A specification document needs to be written outlining what addresses are being used for what, the addresses need to be turned into C/C++ address references for the peek/poke by name interface and the CPU, and more. Here, we’re just going to wave our hands and assign these three address groups to peripherals.

We’ll probably have to come back and fix this lack in the near future.

Bus Errors

I usually define a Wishbone bus error as one of three things. First, it is an error if nothing is selected during a Wishbone operation. Second, it is an error if more than one thing is ever selected. Finally, it is an error if more than one acknowledgement is returned on any given clock. For our example purposes here, we’ll only set the error if nothing is selected. That is, if the wb_stb signal is high indicating a Wishbone request, and yet the address in wb_addr doesn’t reference any of our components, then a bus error should be returned.

// This will be true if nothing is selected
assign	none_sel = (!smpl_sel)&&(!scop_sel)&&(!mem_sel);

// The wishbone error signal is true for one clock only, and then it
// resets itself
always @(posedge i_clk)
	wb_err <= (wb_stb)&&(none_sel);

The Wishbone spec, though, isn’t as particular regarding what constitutes a bus error, and many masters want any error detected to be aligned with where the acknowledgement would’ve come back, so that every request ends in either an error or an acknowledgement. That approach allows both slaves and the interconnect to generate errors. The interconnect we are building today, though, is simpler, and doesn’t do that.
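The second condition listed above, more than one component selected at once, isn’t checked by today’s interconnect. If you wanted to add it, a minimal sketch might look like the following. The selcheck module and the many_sel signal are hypothetical names, not part of the design presented here.

```verilog
// Hypothetical sketch: flags both "nothing selected" and
// "more than one thing selected" from three select lines.
module selcheck(
	input	wire	smpl_sel, scop_sel, mem_sel,
	output	wire	none_sel, many_sel);

	// True if no component decodes the current address
	assign	none_sel = (!smpl_sel)&&(!scop_sel)&&(!mem_sel);

	// True if any two components decode the same address
	assign	many_sel = ((smpl_sel)&&(scop_sel))
			||((smpl_sel)&&(mem_sel))
			||((scop_sel)&&(mem_sel));
endmodule
```

The error logic could then use (wb_stb)&&((none_sel)||(many_sel)). With the three address decodes given above, the ranges don’t overlap, so many_sel can never be true and a synthesizer is free to remove it. The check only starts to earn its keep once more components get added and an address assignment mistake becomes possible.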

We’ll do one more thing with the bus error: we’ll grab a copy of any bus error address, so we can report back later the bus address associated with any error (if necessary):

always @(posedge i_clk)
	if (wb_err)
		bus_err_address <= wb_addr;

We’ll come back to logic required of the interconnect later, once we handle the slave produced signals.

Slave response: The stall line

The first slave response logic we’ll look at is the stall logic. The Wishbone spec recommends that this logic not be clocked, and that it be only combinatorial in nature. In particular, you’ll want to stall the bus any time you are trying to make a request of a component whose stall line is high.

assign	wb_stall = ((smpl_sel)&&(smpl_stall))
		||((scop_sel)&&(scop_stall))
		||((mem_sel)&&(mem_stall));

For this particular bus implementation, the stall lines are just a formality. None of these Wishbone slaves will ever stall the bus. This line is therefore ripe for being removed by the optimizer within your toolflow. Here, we keep it in case we need to add components later that might stall the bus.

Slave response: The Acknowledgement

The second slave response line is the acknowledgement line. This is the line that the slave uses to indicate that the data it is providing on its data line is valid. We’ll handle this by creating a clocked line that is simply the or of all the acknowledgement lines.

always @(posedge i_clk)
	wb_ack <= (smpl_ack)||(scop_ack)||(mem_ack);

If we also register the return data on the same clock edge, the two resulting responses, both acknowledgement and data, will align as required.

Slave response: Return data

The final slave responses are the data lines. These are valid any time the acknowledgement is valid. Indeed, we’ll use the various slave acknowledgement lines to know which slave has produced valid data, and thus to know what data to return to the Wishbone master.

always @(posedge i_clk)
	if (smpl_ack)
		wb_idata <= smpl_data;
	else if (scop_ack)
		wb_idata <= scop_data;
	else if (mem_ack)
		wb_idata <= mem_data;
	else
		wb_idata <= 32'h0;

As a touch of flair, we’ll respond with all zeros if nothing acknowledges our bus read. This isn’t required, and it can be removed if you are struggling to minimize your logic.

Slave response: Interrupts

Although it’s not really a part of connecting a device to a Wishbone bus, many bus slaves have interrupt lines. We’ll create an interrupt to send back to our debugging bus controller that is simply the or of our two interrupt producing components.

assign	bus_interrupt = (smpl_interrupt) | (scop_int);

Given the way we implemented interrupts within our controller, this will trigger on any positive edge–so it’ll need to be reset prior to being able to trip again. While it’s not necessarily the optimal or the best approach, it may be sufficient for our purposes here.
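To picture what “trigger on any positive edge” means, here is a small rising-edge detector. It is only a sketch of the general technique, not the hexbus controller’s actual logic, and the names posedge_detect, i_int, o_stb, and last_int are made up for illustration.

```verilog
// Hypothetical sketch: produces a one-clock strobe each time the
// incoming interrupt line goes from low to high.
module posedge_detect(
	input	wire	i_clk, i_int,
	output	reg	o_stb);

	reg	last_int;	// The interrupt line's value one clock ago

	initial	last_int = 1'b0;
	initial	o_stb    = 1'b0;
	always @(posedge i_clk)
	begin
		last_int <= i_int;
		// Only strobe on a low-to-high transition
		o_stb    <= (i_int)&&(!last_int);
	end
endmodule
```

As long as i_int stays high, o_stb fires only once. The line has to drop and rise again before a second strobe can be produced, which is why the interrupt source needs to be cleared before it can trip again.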

Connecting the pre-existing components

We have three components to connect our interface to. Two of these components already exist and only need to be referenced from here as sub-modules. These are the block RAM interface, and the wishbone scope.

The block RAM needs very little additional configuration beyond what we’ve already done, but it does need to be told how big its memory area will be. We’ll create our block RAM to have 2^14 octets, hence the 14 parameter below. That’s 2^12 32-bit words, which is why only wb_addr[11:0] gets passed to it. We’ll also use the select line that we set above, mem_sel, to qualify the slave’s strobe line, so the memory knows that it has been selected without needing any knowledge of the other peripherals that might be on the bus. This is different from the wb_sel line, which determines which octets in a word will be set in any operation.

memdev	#(14) blkram(i_clk,
	wb_cyc, (wb_stb)&&(mem_sel), wb_we, wb_addr[11:0],
		wb_odata, wb_sel,
	mem_ack, mem_stall, mem_data);

The next item we’ll want to place onto our bus is the wishbone scope.

To use the scope, you must decide on what you wish to examine, and then what you want to use to trigger the scope. In our case, let’s trigger off of any Wishbone accesses to our block RAM.

assign	scope_trigger = (mem_sel)&&(wb_stb);

We’ll also select for our scope’s data several of the bus lines. We’ll save for a later date how to turn these wires into a proper VCD file.

assign	debug_data    = { wb_cyc, wb_stb, wb_we, wb_ack, wb_stall,
		wb_addr[5:0], 1'b1,
			wb_odata[9:0],
			wb_idata[9:0] };

The scope has one more capability: it can sample data based upon a “when data is valid” flag. The flag can be really useful if you are processing a signal that isn’t valid on every clock–such as the output of a digitizer as an example. Today, we’ll just set that flag to one so that we can capture on every clock.

assign	scope_ce = 1'b1;

Now that all of the preliminaries have been taken care of, you can now place the scope within our file, and connect it to the bus as well. As with the memory, the biggest part of “hooking it up” is adjusting the strobe line by anding it with the scope select line.

wbscope	thescope(i_clk, scope_ce, scope_trigger, debug_data,
	i_clk, wb_cyc, (wb_stb)&&(scop_sel), wb_we, wb_addr[0],wb_odata,
	scop_ack, scop_stall, scop_data,
	scop_int);

Other things you might notice: this scope requires two clocks, one for the data and one for the bus. In this example these two are the same. It also requires the lowest of the address lines. The result is that address 0x2080 will reference the scope control and status register, while address 0x2084 will reference the scope data register.

Building an Ad-Hoc Slave

You will very often find that you need to be able to report some logic result back up the bus to your debug interface: something that is ad-hoc, and not necessarily part of any well-defined, prebuilt component. In many ways, this seems to be one of the most common requests: how do I get access to (whatever) to see what my design has done? Therefore, let’s make a simple ad-hoc slave that does just that.

Our slave will have six registers, although it occupies enough bus space that it could have a full sixteen–so there’s plenty of room should you wish to expand it. Any more than that and you’ll need to adjust the address decoding logic above. Laying these registers out, we’ll have:

  1. A read only date register

  2. A simple register that you can set and read back

  3. The address of the last bus error

  4. A counter that starts from zero on startup

  5. An experimental interrupt line, that you can use to turn an interrupt on or off, so you can see how our interface deals with an interrupt

  6. A GPIO output that you can use to communicate with the Verilator simulation. In this case, we’ll use an o_halt flag to indicate that it’s time for the simulation to halt. You can set that as part of the LSB of this register.

Let’s first handle the write request. On a write, to this peripheral, we’ll:

  1. Handle setting the simple register to whatever input was given

  2. Create (or clear) an interrupt depending on the low order bit if writing to register four

  3. Adjust our o_halt GPIO value with the LSB of anything written to register five.

Since the other registers are read only, we can ignore them on any write request.

This is therefore our write request logic:

always @(posedge i_clk)
	// Determine if a write to this peripheral is taking place
	if ((wb_stb)&&(smpl_sel)&&(wb_we))
	begin // Split our logic between the registers
		case(wb_addr[3:0])
		4'h1: smpl_register  <= wb_odata;
		4'h4: smpl_interrupt <= wb_odata[0];
		4'h5: o_halt         <= wb_odata[0];
		default: begin end
		endcase
	end

In this case, pay close attention to the if at the top, and the case below. These should match up with our lesson on how to build a simple wishbone slave.

Before reading back from this interface, let’s deal with the “clocks since power up” counter. We’re going to use an initial statement to set this register to zero, then while running it will count up. Once the MSB gets set, we’ll leave it set so that we can tell if we’ve ever rolled over. This will give us an ever changing counter that we can use for relative timing, or for absolute timing shortly after the chip starts up.

// Start our clocks since power up counter from zero
initial power_counter = 0;
always @(posedge i_clk)
	// Count up from zero until the top bit is set
	if (!power_counter[31])
		power_counter <= power_counter + 1'b1;
	else // once the top bit is set, keep it set forever
		power_counter[30:0] <= power_counter[30:0] + 1'b1;
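If you’d like to watch the “keep the top bit set” behavior without waiting 2^31 clocks, the same logic can be shrunk down and simulated. This is a sketch only: the test bench module, its locally generated clock, and the 4-bit counter width are assumptions made for illustration, not part of the design above.

```verilog
// Hypothetical test bench: a 4-bit copy of the power-up counter,
// so that the rollover behavior shows up within a few clocks.
module power_counter_tb;
	reg		clk;
	reg	[3:0]	counter;	// 4 bits here, 32 bits in the design
	integer		k;

	initial begin
		clk     = 1'b0;
		counter = 0;
		for(k=0; k<24; k=k+1)
		begin
			#5 clk = 1'b1;	// Rising edge: the counter updates
			#5 clk = 1'b0;
			$display("tick %0d: counter = %b", k, counter);
		end
		$finish;
	end

	always @(posedge clk)
		// Count up from zero until the top bit is set
		if (!counter[3])
			counter <= counter + 1'b1;
		else // once the top bit is set, only the lower bits count
			counter[2:0] <= counter[2:0] + 1'b1;
endmodule
```

In the printed values, the counter climbs normally until the top bit becomes one. From then on the lower three bits keep wrapping around while the top bit never clears, which is exactly how the 32-bit version lets you tell that at least one rollover has happened.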

Now that we’ve created that logic, everything is ready for us to read.

always @(posedge i_clk)
	case(wb_addr[3:0])
	4'h0:    smpl_data <= 32'h20170622;
	4'h1:    smpl_data <= smpl_register;
	4'h2:    smpl_data <= { bus_err_address, 2'b00 };
	4'h3:    smpl_data <= power_counter;
	4'h4:    smpl_data <= { 31'h0, smpl_interrupt };
	default: smpl_data <= 32'h00;
	endcase

Notice that the read logic doesn’t depend upon any bus lines other than the address. Indeed, the fact that a read of this device has taken place is in many ways irrelevant: only the data being produced is relevant. We can produce the right result for any address in our register space regardless. (This isn’t true for all peripherals.) The zero address of our peripheral returns a constant value (the date when I posted this). The first address (octet offset 4, really) just returns the register we set above. The second address gives us the address of the last bus error. The third gives us the value of our ticks since startup counter. The final register, at address position four (0x2050), just returns an LSB indicating whether or not our interrupt is set.

Since the last address in our interface only contains a halt request indicator, it will never read anything but zero, so we’re not going to include any special logic to read it.

As a final and required part of our interface, we’ll need to acknowledge the response from the bus, and create a stall line to indicate that this ad-hoc interface never stalls.

// Decoding an address takes one clock, so set the ACK to be true
// on the next clock
always @(posedge i_clk)
	smpl_ack <= ((wb_stb)&&(smpl_sel));

// This simple interface never stalls
assign smpl_stall = 1'b0;

That’s it! You just connected a (very) simple peripheral to our debugging bus! Indeed, if you wanted, you could now use this approach to debug an FFT.

Coming up

The full design, as we’ve now built it should run on any FPGA. A quick test, by giving the device an address and read request,

A2040R

or similarly a hex address and a write request,

A4000Wdeadbeef

should work to read or write from your device using this interface. Further, if nothing is going on, you should be able to see “Z”s getting sent to your screen.

Not bad.

We could even read from the scope’s control and status register with a simple read command:

A2080R

Another read command, such as

A2085R

will return one data value from our scope. You might wish to notice that we set the LSB in this address. As a result, subsequent R’s typed into the interface will be interpreted as read commands from the same “scope data register” address.

Ok, so … this works. But it’s still really hard to use. This, then, gives us our roadmap forward:

  • Building a Verilator test bench to use to experiment with this bus apart from any hardware

  • Building a software interface to our debugging bus, so we can use the host CPU to our advantage when working with the design

    This software interface will also make it easier for us to get the results of any scope interactions out of our debugging harness and into a VCD type of format that we can then use to examine what was going on within the FPGA.

These then will be the topics we will queue up for a later day.