If there’s one thing I hate, it’s admitting I’m wrong. Today, I need to do that.

One of the very real and practical difficulties of working with formal methods is managing complexity. The effort required to formally verify a design grows roughly exponentially with its size. Practically, this means that there is a limit to the complexity of any design that you wish to formally verify. The engineer using formal tools therefore needs to consider carefully which parts of the design to formally verify, and how to break the entire design into smaller pieces that can be verified individually.

I discussed one possible way of doing this before, as part of aggregating formally verified modules. The basic approach I used was to verify the smaller “leaf” submodules first, and then to aggregate them up into larger modules. When aggregating up, I used an approach suggested to me by another engineer: swap the assumptions of the submodule with its assertions. That way, you verify that the assumptions made within the submodule hold as part of verifying the parent module.

Pictorially, I explained this concept using Fig 1.

Fig 1. Invariants applied to a parent/leaf module hierarchy

This picture shows the verification process in two parts. The first part, on the left, formally verifies the submodule(s). These are verified independently of their parent.

For the second part of the verification shown in Fig. 1, the parent modules are verified with the submodules beneath them. In this case, the assumptions about the submodule’s inputs are transformed into assertions, and the assertions within the submodule are transformed into assumptions. The idea is that, if you’ve already proven that the assertions of the submodule hold given its assumptions, then all you need to do is verify the assumptions of the submodule, and you will then know that its assertions must pass.

I’ve now used this logic within many of my designs, and I’ve been very pleased with it.

The only problem with it? It doesn’t work. It can lead to false negatives: proofs that pass even though the design is broken, leaving you believing your design works when in actuality it does not.

Skynet

Fig 2. Skynet, the fictional AI system created by Cyberdyne Systems that tried to destroy all of humanity

Clifford Wolf was kind enough to provide the necessary counterexample, framed in terms of Skynet, the fictional defense machine that decided to kill all humans as part of the backstory to Terminator.

It goes like this: suppose we have a simple design that determines whether or not all humanity should be destroyed. The designers of this system were smart enough to formally verify it, and so they were strongly confident that humanity would never be destroyed.

Because of the logic flaw we’ll be discussing today, that confidence was misplaced.

Imagine their logic was something as simple as the logic below:

module	skynet(i_clk, i_input, o_kill_all_humans);
	input	wire		i_clk;
	input	wire	[31:0]	i_input;
	output	reg		o_kill_all_humans;

	// Combinational logic, so use a blocking assignment
	always @(*)
		o_kill_all_humans = (i_input == 32'hdeadbeef);

Since we obviously don’t want to kill everyone, we add a formal section to this Skynet logic to make certain o_kill_all_humans will never be asserted,

`ifdef	FORMAL
	always @(*)
		assume(i_input != 32'hdeadbeef);

	always @(*)
		assert(!o_kill_all_humans);
`endif
endmodule

Running SymbiYosys on this logic will assure us that Skynet will never decide to destroy all of humanity.
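If you’d like to try this yourself, a minimal SymbiYosys script for this standalone proof might look like the one below. (The file name skynet.v and the depth are my assumptions; adjust them to match your own sources.)

```
[options]
mode prove
depth 4

[engines]
smtbmc

[script]
read_verilog -formal skynet.v
prep -top skynet

[files]
skynet.v
```

Running `sby -f skynet.sby` should then report that both the bounded and inductive passes succeed.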

If only the story stopped there.

Placing Skynet within Cyberdyne Systems

The problem lies in the next step. Using the approach we discussed before, we’ll replace the assume and assert statements with macros. These macros are defined depending upon whether we are formally verifying the submodule (skynet) or its parent, Cyberdyne Systems.

To do this, we’ll start by creating the various macros, depending upon whether or not SUBMODULE is defined.

`ifdef	SUBMODULE
`define	ASSUME	assume
`define	ASSERT	assert
`else
`define	ASSUME	assert
`define	ASSERT	assume
`endif

Using these macros, we can then rewrite our two formal properties:

`ifdef	FORMAL
	always @(*)
		`ASSUME(i_input != 32'hdeadbeef);

	always @(*)
		`ASSERT(!o_kill_all_humans);
`endif
endmodule
Now, as long as SUBMODULE remains defined, we can still prove that Skynet will not destroy all humans.
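Note that nothing in the Verilog source itself defines SUBMODULE; it needs to come from the proof script. With SymbiYosys, one way to do this (assuming yosys’s read_verilog front end and its define flag) is to pass the define when reading the submodule for its standalone proof:

```
[script]
read_verilog -formal -DSUBMODULE skynet.v
prep -top skynet
```

Leaving the `-DSUBMODULE` off, as we’ll do for the parent proof, is what triggers the swap.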

Let’s add Skynet’s parent module, Cyberdyne Systems. We’ll keep Cyberdyne simple: there’s no logic within it at all.

module cyberdyne_systems(i_clk, i_input, o_kill_everyone);
	input	wire		i_clk;
	input	wire	[31:0]	i_input;
	output	wire		o_kill_everyone;

	skynet determine_fate_of_humanity(i_clk, i_input, o_kill_everyone);
endmodule

As you can see, nothing in the logic of Cyberdyne prevents Skynet from being corrupted and killing all humans.

By design, this proof should fail, alerting us to the possibility that Skynet was never properly constrained.

It does not fail. Instead, Cyberdyne releases a monster. Why wasn’t it caught during formal verification?
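For reference, the parent proof can be driven by a script such as the one below, where SUBMODULE is deliberately left undefined so that the swap takes place. (File names and depth are again my assumptions.)

```
[options]
mode prove
depth 4

[engines]
smtbmc

[script]
read_verilog -formal skynet.v
read_verilog -formal cyberdyne_systems.v
prep -top cyberdyne_systems

[files]
skynet.v
cyberdyne_systems.v
```

Run this, and SymbiYosys will happily declare the design proven.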

What went wrong?

We did everything right so far: we formally verified that Skynet would never kill all humans as long as the input was never equal to 32'hdeadbeef. When verifying the parent, that assumption became an assertion that the input would never equal 32'hdeadbeef, and the assertion passed. We then formally verified both components together, proving (or so we thought) that humanity was safe. However, you can see by inspection that nothing keeps the input from being 32'hdeadbeef.

What happened? Why didn’t this proof fail when it should have?

The problem is simple: the critical assertion, that o_kill_all_humans would never be true, was converted into an assumption. Assumptions, if you will recall, limit the space of states the solver examines. They take precedence over assertions. In this case, the assertion-turned-assumption prevented i_input from ever equaling 32'hdeadbeef within the solver, since any such input would have violated the assumption.

Rescuing this logic

There are a couple of things we can do to try to “rescue” this method, though none of them is as nice as the initial method appeared to be. We can:

  1. Remove all of the logic within Skynet, leaving only the formal properties behind.

    Because there’s no logic between the input assumptions, now turned into assertions, and the output assertions, now turned into assumptions, the output could be anything depending upon the choice of the solver.

  2. Keep the logic within Skynet, replace the original submodule assumptions with an assertion, and then remove all of the local assertions.

    Sadly, this approach doesn’t work with induction, as it can leave the local registers within a module unconstrained.

  3. Replace the assumptions with assertions when verifying the parent, but leave the original assertions untouched. Hence, during the parent’s proof, the submodule would contain only assert properties.

`define	ASSERT	assert
`ifdef	SUBMODULE
`define	ASSUME	assume
`else
`define	ASSUME	assert
`endif
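With these definitions, when SUBMODULE is left undefined for the parent proof, the submodule’s formal section expands to nothing but assertions:

```verilog
	// Parent proof (SUBMODULE undefined): both properties are now assertions
	always @(*)
		assert(i_input != 32'hdeadbeef);	// was `ASSUME

	always @(*)
		assert(!o_kill_all_humans);		// was `ASSERT
```

Since nothing within cyberdyne_systems constrains i_input, the first assertion now fails, which is exactly the alarm we wanted.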

This is how I am personally going to move forward when aggregating modules. It is simple and robust. The only problem is that the proof’s complexity still increases upon aggregation, since the submodule’s assertions are no longer assumed away.

Indeed, if you apply this method to skynet.v, you’ll find that the parent proof fails (as desired), revealing that nothing is keeping Skynet from killing all humans.

  4. If we could adjust the formal engine, there would be another approach.

    Normally, during induction, all of the assertions are treated as assumptions until the last step. This allows those assertions to constrain the design during the run-up to the final stage of the proof, helping the design remain in a valid state. Then, on that final step, the assertions are applied and the design either passes or fails.

    If we could create a third type of assertion, let’s call it iassert() for “induction assertion,” another approach might work. iassert() would be ignored during the base case prior to the induction. During the inductive step, iassert() would apply as an assumption for the first k steps, and then be ignored on the last step, leaving behind all of the other assertions.

    Of course, this would only work if the first pass (submodule) verification evaluated iassert as though it were a true assertion.

Perhaps you are wondering how likely it is that this flawed approach, discussed above, would silently mask a real failure. While I haven’t been keeping score, I’ve seen problems in many of my proofs that had used this approach. The rate may be as high as thirty percent or more, depending upon the design. That’s high enough that you cannot ignore it. Indeed, the proof I was working on this morning passed until I “fixed” it with one of the approaches above, at which point it revealed a genuine failure the swap had been hiding.

Conclusions

The bottom line is that the approach to formal aggregation that I presented earlier doesn’t work, leading instead to a false confidence that your “formally verified” design works.

I’m slowly converting my own logic to use approach 3 above, and I recommend that, if you’ve been using the swapped assume/assert approach, you consider doing so as well.