My first experience with Formal Methods

Clifford Wolf has written a nice tool, yosys-smtbmc, based upon his yosys synthesys tool, that allows you to apply formal methods to your Verilog code. The promise of formal methods is that you can then mathematically prove that your code works, or if not then the formal solver should be able to tell you where your code is failing.

I’ve only been working with these methods for a week or so, but already there are some things I can share.

The first project I tried applying these formal methods to was a simple SPI-based A/D converter. This particular controller is designed to interact with a Digilent product containing a MEMs based microphone. I’ve had the project built for some time, although I’ll admit I’ve never actually done anything more with it than plug it in.

Since the project is simple enough, I thought I’d try applying formal methods to it, to prove that my controller worked. Sure, I had a test bench that I had built some time earlier to convince me that the project worked. The design worked well against this test bench, so I didn’t expect to find many problems using formal methods.

Imagine my surprise when I found several bugs instead. Not only that, many of those bugs were within my FIFO implementation–something that I’d passed from one project to another for some time. You see, when I built my FIFO I only tested it in a fashion such as a “reasonable” person might use it. Under this “reasonable use” scenario, the FIFO had done well.

The formal prover, however, didn’t limit itself to what I considered “reasonable” usage of the FIFO. It created underruns and overruns, wrote to the FIFO when it was full, and read from the FIFO when it was empty. It even wrote and read from the empty FIFO on the same clock, and it wrote and read from the full FIFO on the same clock as well. When the internal logic didn’t “match” the criteria I gave the solver, it then showed me where my FIFO code didn’t properly handle these conditions.

I guess I just didn’t have that much creativity when I created my test bench in the first place.

Was the result worth it? Keep in mind, I’ve never used these methods before. So did I think it was worth it? I think so. I haven’t “proved” all of my projects yet, nor do I know if I will be able to, but I have added proofs to some of them—and found bugs as a result.

Today, though, I’d like to share some of what I have learned.

State Sets

If you’ve never worked with formal methods before, the basic concept is that you will go through your code and declare which states are valid and which are not. You’ll then use a theorem prover to mathematically prove that you can never enter an invalid state from a valid one. If the prover cannot prove this, then typically you will have either a bug in your code, or a bug in your formal assertions.

You can think of the “state” as the values in all of your flip-flops, together with the values of all of your inputs.

Fig 1: Bounded Model Check

The first step in yosys-smtbmc based theorem solving is the bounded model checker (BMC), figuratively shown in Fig 1. This part of the theorem solver starts your design in its initial state, and then walks through all of the state transitions that it can, stepping your logic forward from one time step to the next, just to see if any set of conditions will drive your model to an invalid state.

This may be the most straightforward part of using yosys-smtbmc, and the easiest to understand. The problem with the BMC step is that your time is limited. Therefore, you will only want to allow the BMC step to check some finite number of transitions. This number needs to be chosen carefully, otherwise there may be states you might eventually get into over time that it won’t find.

You can also find these additional states via the second step: the induction step.

The induction step is just like the mathematical induction you learned in Pre-calculus. You first prove that some property is true for the first value, n=1. This was the purpose of the BMC step above. If it is true for this base case, you then proceed with the inductive step. This step assumes that your logic is within some initially valid state, say state n, and it then tries to prove that your logic will only transition to a valid state, say state n+1.

That’s the idea.

Fig 2: The induction step

In practice, I’ve struggled with this induction step. The challenge to the designer with the induction step is that you have to declare every unreachable state as invalid, or it might start processing from an unreachable state you aren’t expecting. As figure 2 shows, states that are neither valid nor invalid, but still states that the design will never reach, may easily become starting states for induction. It is therefore up to the designer to clearly indicate that all states must be either valid or invalid.

Fig 3: Unreachable states should be invalid

Let me try to explain this a touch better. In a moment we’ll go over some System Verilog statements that yosys will recognize. These statements declare only the state that you cannot get to. They do this by either reducing the size of the total state space examined, or by declaring particular states to be illegal. That part is the job of the formal specification designer. The job of the formal method is to determine which states may be reached, and to then cross check these states against the illegal ones.

Perhaps a simple table, such as the one in Fig 4, might help to explain this.

Fig 4: Unreachable vs Invalid states

What you want to know, as a designer, is whether or not there is any way that you might reach an invalid state from a valid one. Hence, you want to know if a particular illegal state is reachable from a valid state. This is the purpose of formal methods.

The part I struggled with while working through the induction step is that any state that isn’t declared to be invalid might be a starting point for induction–even if the state is unreachable.

This should help give you an idea, should you try working with formal methods yourself, that you need to make certain that unreachable states are either declared to be invalid or removed from the set of possible states.

Speaking of, that’s our next step: discussing how to declare states to be invalid when using yosys.

Formal Declarations

yosys can be made to understand some basic formal statements, drawn from a subset of the System Verilog formal verification language. This can be frustrating for a new user of formal methods with yosys, since most of the material on line discusses the full System Verilog formal description language subset. What yosys supports is much less than that. Basically, yosys supports assert(), assume(), restrict(), and some expressions about transitions. Let’s examine those formal verification statements here.

The first statement of interest is the assert() statement. At first blush, this statement works very much like the assert() statement within C/C++. The value given to the assert() statement must be true, or you your design isn’t working as desired. assert()s do more than that, though. assert()s declare states that are invalid within your design.

We’ll come back to that thought in a moment.

The other thing you need to know about assert()s is that within a simulation, if the assert() is ever not true, the simulation will halt on a failure. This allows you focus on what caused the problem within your design.

The next basic statement is the assume() statement. This statement is like the assert(), but with the exception that the theorem prover doesn’t try to prove that the assume() statement is true. Instead, it forces the assume() to be true. Any logic path that would cause the assume() to be false is quietly culled. From the perspective of the state space that will be examined, the assume() statement removes particular states from the realm of possibilities.

For example,

initial	last_clk = 1'b0;
always @($global_clock)
	assume(i_clk == !last_clk);

will force the i_clk input to the design to toggle with every simulation step (that’s the meaning of the $global_clock–it’s true on every change of the time step). States and state transitions where the clock doesn’t toggle are just quietly removed from the realm of possibility.

assume() statements have one more feature: if during a simulation the condition is invalid, the simulation will halt with an error just as it would if the assert() statement were false.

The restrict() instruction is similar to the assume() instruction. Like assume(), restrict() also reduces the size of the state space that the theorem prover needs to work within. In that way, a restrict statement is much like an assume() statement. However, unlike the assume() statement, the simulator will ignore any restrict() statements within your code.

I’ve also found several functions to be very valuable: $past(), $stable(), $rose(), and $fell(). Since these are important, let’s work our way through them.

The $past() function has the form of $past(expression, N), for arbitrary expressions, and positive integers, N. This statement returns the value of the expression N clocks ago. Hence, $past(expression,1) references what the expression evaluated to during the last clock. The number of clocks parameter, N, defaults to one, so $past(expression) is just the same as $past(expression,1).

System Verilog allows two more arguments to the $past() function for a total of four. Not so with yosys, however you may not find these extra arguments necessary either. For example, the fourth System Verilog argument to the $past() function specifies what clock and clock edge you are referencing the $past() expression from. Instead, yosys only allows expressions of $past() values within a clocked always block, and so it uses the clock specified in the always statement to define the clock the $past() function is relative to.

There is one other trick with the $past() function: prior to the first clock, the $past value is undefined and often assumed to be zero by many formal theorem solvers. You’ll need to be careful, therefore, not to expect the $past value to reference any initial value within your logic.

For this reason, I’ve gotten in the habit of creating a signal to tell me if the past value is valid, such as:

initial	f_past_valid = 1'b0;
always @(posedge i_clk)
	f_past_valid <= 1'b1;

I can then use f_past_valid as part of an expression to determine whether or not the inputs to a wishbone slave will not change once the strobe goes high until the stall line is low. Remember how we discussed that with wishbone nothing happens until (i_wb_stb)&&(!o_wb_stall)? This means that once the wishbone master asserts the strobe signal, i_wb_stb, that it is likely to wait for the o_wb_stall signal to lower before changing the bus request details. To capture this thought, we’ll assume that once i_wb_stb goes high, none of the bus request information will change until the clock after o_wb_stall goes low:

always @(posedge i_clk)
	if ((f_past_valid)&&($past(i_wb_stb))&&($past(o_wb_stall)))
	begin
		assume(i_wb_stb);
		assume(i_wb_we   == $past(i_wb_we));
		assume(i_wb_addr == $past(i_wb_addr));
		assume(i_wb_data == $past(i_wb_data));
		assume(i_wb_sel  == $past(i_wb_sel ));
	end

Alternatively, instead of assuming that the current value was equal to the last value, we could have instead assumed that these values were $stable() for the same effect:

always @(posedge i_clk)
	if ((f_past_valid)&&($past(i_wb_stb))&&($past(o_wb_stall)))
	begin
		assume(i_wb_stb);
		assume($stable(i_wb_we));
		assume($stable(i_wb_addr));
		assume($stable(i_wb_data));
		assume($stable(i_wb_sel));
	end

The last of the basic commands you’ll want to know is the $rose() command. This command returns true or false depending on whether the signal given to it has risen (positive edge) over the last clock period or not.

This particular function is very useful for telling the theorem prover that your inputs are only going to change on the positive edge of the clock,

always @($global_clock)
	// If it isn't the positive edge ...
	if (!$rose(i_wb_clk))
	begin
		// Then nothing changes
		assume(i_wb_stb);
		assume($stable(i_wb_we));
		assume($stable(i_wb_addr));
		assume($stable(i_wb_data));
		assume($stable(i_wb_sel));
	end

A similar $fell() primitive exists as well for testing negative clock edges.

yosys provides one more helpful feature. Anytime you use yosys to generate a formal description of your code, yosys defines the FORMAL flag. Hence, you can surround your formal properties with ifdefs, as in:

`ifdef	FORMAL
// ...
always @(posedge i_clk)
	assert(!illegal_state);
// ...
`endif

These, therefore, are your basic formal declaration statements: assert(), assume(), and restrict(). They are helped by the functions $stable(), $past(), $rose(), $fell(), and perhaps some others that I haven’t learned yet. (Remember, this is only my first week.) assert() declares statements to be illegal, and assume() and restrict() are used to limit the size of the state space.

With these basic principles, let’s look at some formal theorem proving concepts.

Basic concept

The basic approach to formally describing your program is to assume() that the inputs will be valid, and then to assert() that the outputs are valid. You may also wish along the way to assert() particular internal states are valid as well.

This works until you aggregate up one level, so that you have a higher level module instantiating a lower level one. In that case, you want to assert() the values that will be passed to the lower level module rather than assume()ing them.

A FIFO example

Let’s work our way through a simple example–that of a FIFO. In this case, we’ll examine the FIFO I just worked with for my SPI-based microphone ADC core.

As you may remember from our last FIFO discussion, a FIFO depends upon two memory pointers: the write pointer, r_first, and the read pointer, r_last. The only time r_first and r_last are equal is when the FIFO is empty. Likewise, the number of items within the FIFO is determined by r_first minus r_last.

There are some other rules associated with these two variables. The first is that we cannot be allowed to read from the FIFO any time it is empty. The second rule is that we cannot write to the FIFO when it is full, unless we are also reading from the FIFO at the same time. Any attempt to read from an empty FIFO, or write to a full FIFO should generate an error.

In order to keep the state transitions from being dependent upon the number of items in the FIFO, I pre-compute two values: will_overflow, which is true if the next write will overflow the FIFO, and will_underflow, which will be true if the next read will read from an empty FIFO.

Further, this particular FIFO also returns a status value indicating 1) how many items are in the FIFO, 2) whether or not the FIFO is non-empty, and 3) whether or not the FIFO is past half-full.

All of these details may also be derived from r_first and r_last alone. If they don’t match, the FIFO is in an illegal state. Therefore, these properties are a perfect match for learning formal methods

The first step is to gate all of our work so that the synthesis tool will ignore it, unless we are working with the formal verification methods.

`ifdef	FORMAL
// ... ALl of our formal specifications go here
`endif // This goes at the end of our formal specification

Next, within this FORMAL block we want to make certain that the assumptions we are going to make regarding our inputs will become asserts when this core is included into a larger design. We’ll do this by creating an ASSUME macro that we can use to constrain our inputs.

`ifdef	SMPLFIFO
`define	ASSUME	assume
`else
`define	ASSUME	assert
`endif

Remember that you only want to constrain via ASSUME statements about parameters that can be set by a higher level module, not external inputs.

Earlier, I described what it would take to assume that the clock toggles. We’ll repeat that here. We’re also going to insist that this FIFO starts in it’s reset state, by insisting that the i_rst line is initially valid.

initial restrict(i_rst);

always @($global_clock)
begin
	restrict(i_clk == !f_last_clk);
	f_last_clk <= i_clk;

Before leaving this beginning statement, let’s also insist that our inputs only change on the positive edge of the clock.

	if (!$rose(i_clk))
	begin
		`ASSUME($stable(i_rst));
		`ASSUME($stable(i_wr));
		`ASSUME($stable(i_data));
		`ASSUME($stable(i_rd));
	end
end

We also discussed above the need to know whether the $past() function would return valid results. We’ll use the f_past_valid register for this purpose.

initial	f_past_valid = 1'b0;
always @(posedge i_clk)
	f_past_valid <= 1'b1;

This also highlights the fact that your formal verification logic may have and use registers, just like the rest of your logic–even though you won’t be using it within your design other than for formal verification.

Next, let’s look at making sure that our helper logic works.

First, the r_fill output should be equal to the number of FIFO memory entries that are full. This is given by the difference between r_first and r_last. Above, we chose to use a clocked register to hold this value. We did this for timing reasons, but there is the possibility that we got this wrong. So, let’s check it here. Formally, one might say that the state where r_fill doesn’t equal this difference is an illegal state.

assign	f_fill = r_first - r_last;
always @(posedge i_clk)
begin
	assert(f_fill == r_fill);

Now, let’s move on to our empty flag. Any time the FIFO is empty, we want to set the will_underflow flag. In addition, we want o_empty_n to be true any time we are not empty. As I mentioned before, the FIFO is empty any time r_first is equal to r_last. We can use f_fill from above for this purpose.

	if (f_fill == 0)
	begin
		assert(will_underflow);
		assert(!o_empty_n);
	end else begin
		assert(!will_underflow);
		assert(o_empty_n);
	end

In a similar value, any time the fill is one less than the number of items in the buffer, then the buffer is “full”. Let’s make sure we got that logic right too.

		if (f_fill == {(LGFLEN){1'b1}})
			assert(will_overflow);
		else
			assert(!will_overflow);

The FIFO code also has a value, r_next, which is supposed to reference the next value to be read. To know if we got this right, let’s add one to the r_last pointer,

assign	f_next = r_last + 1'b1;

and then look to see if r_next equals this as desired,

	assert(r_next == f_next);
end

Finally, let’s examine the pointers under overflow and underflow conditions. First, on any reset, both pointers will be set to zero and any output error will be cleared.

always @(posedge i_clk)
if (f_past_valid)
begin
	if ($past(i_rst))
		assert(!o_err);
	else begin

If we are not in reset, we might have an underflow. Let’s check. If the FIFO was empty on the last clock, and if there was a request to read from the FIFO on that clock, then we had an underflow. In that case, assert that the FIFO read pointer, r_last, has not changed as a result.

		// Underflow detection
		if (($past(i_rd))&&($past(r_fill == 0)))
		begin
			// This core doesn't report underflow errors,
			// but quietly ignores them
			//
			// assert(o_err);
			//
			// On an underflow, we need to be careful not
			// to advance the pointer.
			assert(r_last == $past(r_last));
		end

Had this pointer changed, the FIFO might accidentally jump from the empty state to the full state with garbage in the buffer. We want to avoid this.

We do almost the same thing on an overflow condition, but there are just a couple differences. First, in the case of a read and write on the same clock while we are full, no overflow has taken place. Hence we need to check that a write without a read has taken place. That’s the first difference. The second difference is that this core reports overflow conditions, but not underflow conditions. Hence, we assert that o_err is true on any overflow. Likewise, on any overflow we also assert that the r_first pointer hasn’t changed.

		//
		// Overflow detection
		if (($past(i_wr))&&(!$past(i_rd))
				&&($past(will_overflow)))
		begin
			// Make sure we report this result
			assert(o_err);

			// Make sure we didn't advance our write
			// pointer on overflow
			assert(r_first == $past(r_first));
		end
	end
end

As with the underflow, a pointer change during an overflow condition can be catastrophic as well. It could cause the FIFO to suddenly become empty, losing any of the data within it. This formal check about tells the theorem prover that doing so would be illegal, and that we want to know if anything would create such a condition.

What we haven’t discussed are the instructions for installing yosys, SymbiYosys, and the various theorem provers. These may be found on line. Neither have we discussed the Makefile I used to coordinate the proof, nor the yosys config that I used. Feel free to examine these on your own if you would like.

What did I find?

You may recall from earlier that I found several errors in my FIFO as a result of using these methods. If you are reading this, then you may be wondering just how significant those errors were. Here are some of the things I found:

Originally, I had said that a write during an underflow would only take you out of the underflow condition if there was no read at the same time. This is wrong.

A write to an empty FIFO will always succeed, independent of whether or not a read is taking place at the same time, and a read from an empty FIFO should always fail–independent of a write taking place at the same time.

Fixing this required several changes throughout.
I hadn’t initialized all of my variables. In particular, r_next and r_fill weren’t originally initialized.
When calculating the number of elements in the FIFO, I had assumed that if the will_overflow value was true that any write to the FIFO would overflow the FIFO. This isn’t the case. If will_overflow is true, then any write without a concurrent read will overflow the FIFO. A read and write during the same clock period will not overflow the FIFO.

If you’d like, all of the changes are captured within the github repository on line, so you can review what I found here.

Final Thoughts

Please keep in mind, I’m only a beginner at formal methods. I’ve never used any of them before this week, but I’ve already found several problems in my own code using them. Are formal methods worth the effort? Well, for me and in this example, they were.

I’d like to come back to this topic in the future after I’ve learned some more. I’d also like to apply these methods to many other problems as well. However, I’ve also got some problems which I’m not yet certain how to prove. For example, while I’ve managed to prove a low level QSPI flash driver, I have yet to figure out how to prove the entire protocol. Likewise, I’ve managed to prove that my UART transmitter (lite) works, but not yet the receiver or the entire IP core including the wishbone interface.

For now, I think I’ll just see if I can’t prove my WB to AXI4 bridge next. I think that would be useful for all.