What does Formal Development look like in Practice?

When I initially started working with formal verification, I used formal methods to find bugs in my designs. Now, I struggle to build a new design component without using formal methods along the way.

What does that look like?

Consider this: last night I set about to build a new SPI flash controller. This isn’t the first SPI flash controller that I’ve built. You can find my first one here. That one’s a very general purpose controller, plus Verilator-based simulator. Because it’s such a general purpose controller, it requires more logic than I often have. (Remember, I like working with really cheap, low logic boards.)

This isn’t the first time I’ve had this problem. The first time this general purpose controller didn’t fit within a design was when I was working on the CMod-S6, a “spartan” board containing a Spartan 6/LX4 FPGA and only 2400 LUTs. So I ripped out the capabilities I didn’t really need: the ability to read the device ID, or to erase and reprogram the flash. Yes, I cringed when removing these capabilities, I really wanted to keep them. However, if your hardware doesn’t have enough logic, then you’ll need to find alternate and creative solutions for many things. The controller I eventually built for that design is fast, simple and low logic. However, it requires a six-wire Quad-SPI interface.

Why is that important? In one example, when I first started working with the TinyFPGA BX it didn’t have all six Quad-SPI wires connected, only the four basic SPI wires. (Later versions now connect these last two wires.) So I rewrote the Quad-SPI interface from the CMod-S6 project to work in a dual-SPI mode. Dual SPI aggregates the MISO and MOSI wires from the SPI together, so that both go in the same direction. Hence, the Dual SPI controller should be about 2x as fast as a traditional SPI controller. This is important when doing CPU work. To handle erasing and programming the flash, I built a bit-banging interface through a control register–something the CMod S6 Quad SPI flash controller design didn’t have within it.

Sadly, even this won’t work on the ICO board–another iCE40 based design. Because the ICO board runs the wires for the flash through a MachXO2 chip, the wire directions are fixed. Hence, the ICO board can only run a straight 4-wire SPI flash interface. I also wanted a “faster” bit-banging interface than the one I had just written for the TinyFPGA board I have.

I say this all as background for why I was building a new flash controller–again.

The other reason why I share this background is so that you can place my next claim into context: using formal methods, the controller only took about four hours to build–the time between dessert and bed-time on a Friday night. (No, don’t ask why I’m doing this on a Friday night … you don’t want to know.)

So, what did this development cycle look like?

Steps in a Formal Driven Development Cycle

Since it’s fresh in my mind, and since a reader asked, let me walk you through the steps of building this brand new SPI flash controller.

I started my design work at the kitchen table, where I tried to scribble out some SystemVerilog sequence properties. When most people think of formal verification, they think of these sequences. While I have used them a bit, I’ve never used them enough to be comfortable with them. However, I had just removed the discussion of how to use sequences to verify a flash memory interaction from my formal verification courseware, since I had never used them to verify a flash memory interaction, it was complicated to examine, etc. Was this really a good choice? I wasn’t certain. (I still had other proven examples.) So I’m sure you can understand how I imagined that that actually building formal properties from sequences would help me teach this lesson later.

It was a good intention.

Instead, after I filled about a half sheet of paper with my chicken scratches before realizing that the process wasn’t going anywhere fast. Instead, I switched to the computer and typed my sequences in, got annoyed with them, rearranged them, got annoyed again, and rearranged them some more. I quickly come to the realization that it will be plain annoying to formally verify these sequences, since I’d need to move my files to a project directory on another computer having the full commercial version of yosys on it.

That’s what lead me to ultimately decide to build my design without sequences.
So I switched from working on these (so far irrelevant) properties to building the operational logic for my basic design. This took some time, as I worked with this design until it had enough capability within it that I could do some amount of formal verification work next.

When I say this took some time, I should point out that I couldn’t quite decide how the controller over-ride mode was going to work. (I needed this to be able to erase and later program the flash without complicating the controller.) In the end, I decided that I wanted to have a special control register, separate from the main flash memory area, that this register would control the CS_n pin of the flash, and that writes to this register would send eight bits and eight clocks to the device.
Once I thought I was ready to move on, I ran Verilator on the design to see if I had made any dumb mistakes. Indeed I had: 64 will not fit into a 6-bit register, among other problems Verilator found for me.

% verilator -Wall -cc spixpress.v

I then moved to the formal property section of the design, making sure it was outlined properly by ifdef’s. Since I had started with SystemVerilog sequence properties that would only work with the commercial version of yosys, I place these in a special VERIFIC section which I can come back to later.

`ifdef	FORMAL
	// Properties will soon go here
`endif
`ifdef	VERIFIC
	// Commercial grade formal properties can go here
`endif

My next step was to create an f_past_valid register for use with any $past properties I might have later.

	reg	f_past_valid;

	initial	f_past_valid = 1'b0;
	always @(posedge i_clk)
		f_past_valid <= 1'b1;

Once I had f_past_valid, the next step was to add in some properties to describe the reset condition/state. There were two parts to this process. First, I worked my way back through my code, auditing which registers really needed a reset capability and which did not. The second part was to create a basic reset property form I’ve found works in almost all of my designs. This form has worked quite well for me in the past, tying both the initial state and the reset state together and asserting that they are identical.

	always @(posedge i_clk)
	if ((!f_past_valid)||($past(i_reset)))
	begin
		assert(o_spi_cs_n == 1'b1);
		assert(o_spi_sck  == 1'b0);
		// etc.
		// Register states upon reset go here.
	end

The next step was to create a bench/formal directory, and then to copy the fwb_slave properties from the ZipCPU project to this new directory. The Wishbone properties will cause the formal tools to assume that the external Wishbone master acts “appropriately”, and assert that the responses of this module also act according to the bus specification.

I spent a moment at this point trying to deciding what parameters needed to be given to this slave property module, but ultimately end something like the following.

	localparam	F_LGDEPTH = 7;
	wire	[F_LGDEPTH-1:0]	f_nreqs, f_nacks, f_outstanding;

	fwb_slave #( .AW(23), .F_MAX_STALL(7'd70), .F_MAX_ACK_DELAY(7'd70),
			.F_LGDEPTH(F_LGDEPTH),
			.F_MAX_REQUESTS(1'b1),
			.F_OPT_MINCLOCK_DELAY(1'b1)
		) slavei(i_clk, i_reset,
		i_wb_cyc, i_wb_stb, i_wb_we, i_wb_addr, i_wb_data, 4'hf,
			o_wb_ack, o_wb_stall, o_wb_data, 1'b0,
			f_nreqs, f_nacks, f_outstanding);

Just to know this design will do something useful, I add a couple ad-hoc safety properties as well.

For example, my SCK output isn’t the true SCK for the interface. That will be created from a DDR I/O module at the top level–the only way to really get a high speed SCK clock. At this point, I wanted to make certain that SCK is off (i.e. not toggling) any time o_spi_cs_n is not asserted.

	always @(*)
	if (o_spi_cs_n)
		assert(!o_spi_sck);

I then returned to the bench/formal directory and created a SymbiYosys configuration file. Since I have no idea how well my code will or won’t work, I start it with a depth longer than the SPI operation I’m interested in (about 64 cycles), and I also start it in bmc mode only.

[options]
mode bmc
depth 70

[engines]
smtbmc

[script]
read_verilog -formal fwb_slave.v
read_verilog -formal spixpress.v
prep -top spixpress

[files]
fwb_slave.v
../../rtl/icozip/spixpress.v

I now run SymbiYosys using this configuration file.

% sby -f spixpress.sby

The design didn’t even get past the second clock. I pull up the VCD file to examine it, and discover that my design has a problem with the i_reset signal.

% gtkwave spixpress/engine_0/trace.vcd

My problem was specifically that I had wanted to create a design that would work with or without a reset signal, potentially using only the initial statements to get into the right state. However, the Wishbone property file requires a reset anytime f_past_valid is false. I grumble to myself and then grudgingly create this required reset.
I then repeat this process: creating properties, running SymbiYosys, examining the trace, adjusting my code and repeating.

Eventually I switch from mode bmc to mode prove. This yields a whole new set of strangeness within the trace, but the process remains the same.

Well, almost. There are additional properties that your design will need in order to to pass induction. As is often the case, most of the problems I discover simply require more assert() statements.

The result of this exercise is usually a jumbled mess of assertions at the bottom of an (eventually) working design. I’ll often leave these properties just like that, in a jumbled mess, until I come back later and blog about the design–if I ever do.
After doing this for a while, my design stopped failing. Since the design and its properties were at this point consistent, SymbiYosys stopped generating any trace files containing bugs to look at. When this happens it becomes harder to have any confidence that the code is (still) working. Sure, the tool says it’s working, but is it? Therefore, I added some cover properties to my design.

Unlike a safety property, such as either assume or assert, a cover property is used to prove that something could happen, or equivalently that a particular state is possible. A cover property will only succeed if the solver can find a trace, any trace, that will make the expression within it true.

If you’ve never used cover before, usually you can cover a response from the bus to get a large bang for your buck.

always @(posedge i_clk)
	cover(o_wb_ack);

Running in cover mode, however, required changing my SymbiYosys configuration file to use mode cover instead of mode prove. I then ran SymbiYosys again, this time focusing on the cover property.

% sby -f spixpress.sby

For this particular design, covering a Wishbone acknowledgement just isn’t all that satisfying. Perhaps I’ve been spoiled by cache designs, MMU’s or other flash controllers. In this case, there’s a status register which will respond almost instantly before the core does anything useful.

Therefore, I needed to create some extra logic just so I could capture just the acknowledgement I was looking for in a cover statement.

	reg	f_pending_bus_request; /// A read from the flash

	initial	f_pending_bus_request = 1'b0;
	always @(posedge i_clk)
	if ((i_reset)||(!i_wb_cyc))
		f_pending_bus_request <= 1'b0;
	else if (bus_request)
		// bus_request is a wire set to (i_wb_stb)&&(!o_wb_stall)
		//	&&(!i_wb_we)&&( not the configuration register address)
		f_pending_bus_request <= 1'b1;
	else if (o_wb_ack)
		f_pending_bus_request <= 1'b0;

	always @(posedge i_clk)
		cover((o_wb_ack)&&(f_pending_bus_request));

The design felt pretty good at this point, so I took a quick look to see how big it was in terms of logic. This meant firing up yosys, and within a yosys command line I typed:

> read_verilog spixpress.v
> synth_ice40
> stat

The design only used about 150 elements. 150 elements would fit nicely on the ICO board. It is also cheaper than many of my other designs. The last benchmark I have is about 320 cells for the Dual SPI flash controller.

Even though the design had passed formal verification, it still really wasn’t complete. In particular, it didn’t yet support pipelined bus interactions. Instead, it would read only one 32-bit value from the flash memory at any given time. Hence, any flash interaction would write the “read” command, 8'h03, to the flash, write the address, and then read the data. If you wanted to read the next data value, you would have to start back at the beginning with a new command.

Because of this ineffiiency, I like to create a pipeline capability in my controllers. Using this pipeline capaability, once you get near the end of the data request, the design will accept another Wishbone request, but only for the next memory address. That allows the controller to require 64 clocks for the initial transfer, and 32 for any subsequent transfer.

To create this new capability, I added a one-bit parameter, OPT_PIPE. Pipeline logic requires that the the o_wb_stall line be dropped before the end of the interaction, but only if the next request from the bus is for the next word from flash. Pipeline logic also requires being able to tell if the next bus request is for the next address in memory or not. Hence, I added a generate block to capture the last address logic. This has the nice feature that it only uses the additional logic any time OPT_PIPE is true.

	generate if (OPT_PIPE)
	begin
		reg	[21:0]	r_next_addr;
		always @(posedge i_clk)
		if ((i_wb_stb)&&(!o_wb_stall))
			r_next_addr <= i_wb_addr[21:0] + 1'b1;

		assign	next_addr = r_next_addr;

	end else begin

		assign next_addr = 0;

	end endgenerate

Once accomplished, I had now adjusted my logic, so I now needed to go back into my adjust my formal properties, run SymbiYosys, and view the trace development cycle.

I also added a property to make certain that, if o_wb_stall was ever dropped during a transaction, it was only dropped if there was a read request for the next data word. Sure enough, this property failed the first time around.

When the design was at the point where it passed the induction step again, I switched back to the cover mode. Specifically, I wanted to prove that the design would be able to accept a request while the SPI flash operation is ongoing.

This new cover property failed.

This was really annoying, because a failed cover statement doesn’t yield a trace. A failed assert will yield a trace, showing you what’s going wrong within your design. Not so with a failed cover. Instead, you only get a trace from a cover statement when (and if) the cover statement succeeds. Therefore, I had to create a new cover statement one step prior to the cover that failed to try to see where it failed.

Then the cycle repeated again, until I (eventually) found a careless assumption I had made early on in my design. While this assumption had made it easier to get the design off the ground, it was inappropriate for a final design. Removing this careless assumption fixed the problem, so the design then passed the updated cover check.
At this point it was now time to clean up a bit. The design had at one time passed formal verification in a non-cover mode, but it had been some time since this mode had been checked and there had been some changes. Therefore, I returned to the SymbiYosys configuration file and created a series of four tasks: Two of these tasks apply to the non pipelined version, and two apply to the pipelined version. Broken out another way, two of the tasks are induction proofs and two are cover proofs. The final SymbiYosys configuration file now looked like:

[tasks]
safety		nopipe
safety_pipe	pipe
cover		nopipe
cover_pipe	pipe

[options]
safety:      mode prove
safety_pipe: mode prove
cover:       mode cover
cover_pipe:  mode cover
depth 70

[engines]
smtbmc

[script]
read_verilog -formal fwb_slave.v
read_verilog -formal spixpress.v
cover:      chparam -set F_OPT_COVER 1 spixpress
cover_pipe: chparam -set F_OPT_COVER 1 spixpress
nopipe:	chparam -set OPT_PIPE   0 spixpress
pipe:	chparam -set OPT_PIPE   1 spixpress
prep -top spixpress

[files]
fwb_slave.v
../../rtl/icozip/spixpress.v

Sadly, some of the functionality that had worked earlier had now broken, so I went back through the cycle of updating the logic, the properties, running SymbiYosys, and looking at trace files.

This step gets difficult, though, since the SymbiYosys will print its result on the last line of processing any task, and then immediately move on to the next task. That means that if you want to know what task passed and which didn’t, you’ll need to go back and examine the various result directories.

Hence, just to make certain I hadn’t missed anything, I listed all of the SymbiYosys output directories. Specifically, I was looking for an empty file named PASS in all of the various output directories. Other possibilities would include ERROR, FAIL, and UNKNOWN. Each of these would’ve indicated a problem that I’d need to go back and revisit.

% ls spixpress*/
spixpress_cover/:
config.sby  engine_0  logfile.txt  model  PASS  spixpress_cover.xml  src

spixpress_cover_pipe/:
config.sby  engine_0  logfile.txt  model  PASS  spixpress_cover_pipe.xml  src

spixpress_safety/:
config.sby  engine_0  logfile.txt  model  PASS  spixpress_safety.xml  src

spixpress_safety_pipe/:
config.sby  engine_0  logfile.txt  model  PASS  spixpress_safety_pipe.xml  src
%

As a final step, I re-ran yosys to measure the logic usage of the design. My goal was to reassure myself that, yes, I could truly make a low logic design. Much to my surprise, the amount of logic used had jumped from 156 cells to 243. This number puzzled me, so I returned to the design and turn off the OPT_PIPE flag. The logic usage level returned to 156 cells again. I wasn’t expecting this, but at that point in the evening it was too late to chase down the cause of the difference. My design passed formal verification, and so I headed to bed.

What about those SystemVerilog sequences at the end of the file? I still haven’t tested them (yet), and at this point I’m not certain if I will.

This isn’t the end of this low-logic SPI flash module with the ICO board, but it is the end of last night’s development.

Next Steps

I do intend to come back to this design again later. It’s not done. It is however part of a longer term project associated with getting the ZipCPU up and running on the ICO board. To support that project, I’ll want to build a C++ flash controller to allow me to write programs to the flash. Sadly, this will need to be a new flash controller, since I just redesigned the interface necessary for erasing and then programming the flash.

In general, I’ve been rather disappointed that the ZipCPU that worked so nicely on several Xilinx boards has yet to work on the ICO board. I plan to discuss one of the really annoying differences in an upcoming article. However, to get to the point where the ZipCPU works on this board I will need to integrate this flash controller into the overall design, and then finish building the Verilator simulator using it. Then, and only then, after this new controller design passes my tests in simulation, supported by a to-be-written C++ controller, will I attempt to place it onto the actual ICO board hardware.

Yes, I understand there are individuals who can build and verify things using formal methods alone. These individuals talk about “formal signoff”, or being able to certify that a design is ready for tape-out using formal methods alone. I would love to be able to get there. It’s one of my personal development goals. However, I’m not there yet. Therefore, everything I do needs to be both formally verified (the easy part to set up), and then it needs to pass simulation.