Swapping assumptions and assertions doesn't work
If there’s one thing I hate, it’s admitting I’m wrong. Today, I need to do that.
One of the very real and practical difficulties of working with formal methods is managing complexity. The effort required by formal tools grows roughly exponentially with the size of the design. Practically, this means there is a limit to the complexity of any design you wish to formally verify. The engineer using formal tools therefore needs to carefully consider which parts of the design to formally verify, and how to break the entire design into smaller pieces that can be verified individually.
I discussed one possible way of doing this before, as part of aggregating formally verified modules together. The basic approach I used was to verify the smaller “leaf” submodules first, and then to aggregate up to larger modules. When aggregating up, I used an approach suggested to me by another engineer: swap the assumptions with the assertions of the submodule. That way, you verify that the assumptions made within the submodule hold as part of verifying the parent module.
Pictorially, I explained this concept using Fig. 1.
This picture shows the verification process in two parts. The first part, on the left, formally verifies the submodule(s). These are verified independently of their parent.
For the second part of the verification shown in Fig. 1, the parent modules are verified, with the submodules beneath them. In this case, the assumptions about the submodule’s inputs are transformed into assertions, and the assertions within the submodule are transformed into assumptions. The idea is that if you’ve already proven that the assertions of the submodule hold given its assumptions, then all you need to do is verify the assumptions of the submodule, and you then know that the assertions will pass.
I’ve now used this logic within many of my designs, and I’ve been very pleased with it.
The only problem with it? It doesn’t work. It can lead to false negatives, where you believe your design works when in actuality it does not.
Skynet
Clifford Wolf was kind enough to provide the necessary counterexample, framed in terms of Skynet, the fictional defense machine that decided to kill all humans as part of the backstory to Terminator.
It goes like this: suppose we have a simple design that determines whether or not all humanity should be destroyed. The designers of this system were smart enough to formally verify it, and so they had strong confidence that humanity would never be destroyed.
Because of the logic flaw we’ll be discussing today, their confidence was a false confidence.
Imagine their logic was something as simple as the logic below:
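(The skynet module name, the o_kill_all_humans output, and the magic 32'hdeadbeef value all come from the discussion that follows; the clock port and the internal implementation are a minimal sketch of my own.)

```verilog
module skynet(i_clk, i_input, o_kill_all_humans);
	input	wire		i_clk;
	input	wire	[31:0]	i_input;
	output	reg		o_kill_all_humans;

	// Skynet turns genocidal upon seeing one magic input value
	initial	o_kill_all_humans = 1'b0;
	always @(posedge i_clk)
		o_kill_all_humans <= (i_input == 32'hdeadbeef);

	// The formal section will be added here in a moment
endmodule
```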
Since we obviously don’t want to kill everyone, we add a formal section to this Skynet logic to make certain o_kill_all_humans will never be asserted:
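Sketched in the same fashion, these properties might look like the following, placed within skynet.v just before the endmodule:

```verilog
`ifdef	FORMAL
	// The environment promises never to send the magic value
	always @(*)
		assume(i_input != 32'hdeadbeef);

	// ... and so we should never decide to kill all humans
	always @(*)
		assert(!o_kill_all_humans);
`endif
```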
Running SymbiYosys on this logic will assure us that Skynet will never decide to destroy all of humanity.
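If you’d like to try this yourself, a minimal SymbiYosys script along the following lines should work. (The file name and proof depth here are assumptions of mine.)

```
[options]
mode prove
depth 4

[engines]
smtbmc

[script]
read -formal skynet.v
prep -top skynet

[files]
skynet.v
```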
If only the story stopped there.
Placing Skynet within Cyberdyne Systems
The problem lies in the next step. Using the approach we discussed before, we’ll replace the assume and assert statements with macros. These macros are defined depending upon whether or not we are formally verifying the submodule (skynet), or its parent, Cyberdyne Systems.
To do this, we’ll start by creating the various macros, depending upon whether or not SUBMODULE is defined.
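A sketch of these definitions might look like the following. (The SUBMODULE_ASSUME and SUBMODULE_ASSERT names are of my own choosing.)

```verilog
`ifdef	SUBMODULE
// When verifying skynet on its own, the properties keep their
// original meanings
`define	SUBMODULE_ASSUME(X)	assume(X)
`define	SUBMODULE_ASSERT(X)	assert(X)
`else
// When verifying the parent, swap them: prove the assumptions
// hold, and assume the (already proven) assertions
`define	SUBMODULE_ASSUME(X)	assert(X)
`define	SUBMODULE_ASSERT(X)	assume(X)
`endif
```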
Using these macros, we can then rewrite our two formal properties:
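Continuing the sketch from above:

```verilog
`ifdef	FORMAL
	always @(*)
		`SUBMODULE_ASSUME(i_input != 32'hdeadbeef);

	always @(*)
		`SUBMODULE_ASSERT(!o_kill_all_humans);
`endif
```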
Now, as long as SUBMODULE remains defined, we can still prove that Skynet will not destroy all humans.
Let’s add Skynet’s parent module, Cyberdyne Systems. We’ll keep Cyberdyne simple: there’s no logic within it at all.
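A sketch of such a wrapper, using the same (assumed) port names as before:

```verilog
module cyberdyne(i_clk, i_input, o_kill_all_humans);
	input	wire		i_clk;
	input	wire	[31:0]	i_input;
	output	wire		o_kill_all_humans;

	// SUBMODULE is *not* defined when verifying cyberdyne, so
	// skynet's assumptions become assertions, and its assertions
	// become assumptions
	skynet	u_skynet(i_clk, i_input, o_kill_all_humans);
endmodule
```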
As you can see, nothing in the logic of Cyberdyne prevents Skynet from being corrupted and killing all humans.
By design this proof should fail, alerting us to the possibility that Skynet was never properly constrained.
It does not fail. Instead, Cyberdyne releases a monster. Why wasn’t it caught during formal verification?
What went wrong?
We did everything right so far: we formally verified that Skynet would never kill all humans as long as the input was never equal to 32'hdeadbeef. We even asserted that the input would never equal 32'hdeadbeef, and this assertion passed. We then formally verified both components together, proving that humanity was safe. However, you can see from inspection that nothing keeps the input from being 32'hdeadbeef.
What happened? Why didn’t this proof fail when it should have?
The problem is simple: the critical assertion, that o_kill_all_humans would never be true, was converted into an assumption. Assumptions, if you will recall, limit the space of what the solver examines. They take precedence over assertions. In this case, the assertion-turned-assumption prevented i_input from becoming 32'hdeadbeef within the solver, since this would’ve caused the assumption to be violated.
Rescuing this logic
There are a few things we can do to try to “rescue” this method, though none of them is as nice as the initial method appeared to be in the first place. We can:
1. Remove all of the logic within Skynet, leaving only the formal properties behind. Because there’s no logic between the input assumptions, now turned into assertions, and the output assertions, now turned into assumptions, the output could be anything, depending upon the choice of the solver.
2. Keep the logic within Skynet, replace the original submodule assumptions with assertions, and then remove all of the local assertions. Sadly, this approach doesn’t work with induction, since it can leave the local registers within a module unconstrained.
3. Replace the assumptions with assertions, but leave the original assertions untouched. Hence we would have only assert properties within the submodule design, as sketched following this list. This is how I am personally going to move forward when aggregating modules. It is simple and robust. The only problem is that the complexity of the proof increases upon aggregation. Indeed, if you apply this method to skynet.v, you’ll find that the parent proof fails (as desired), revealing that nothing is keeping Skynet from killing all humans.
4. If we could adjust the formal engine, there would be another approach. Normally, during induction, all of the assertions are treated as assumptions until the last step. This allows these assertions to constrain the design until the last step, helping the design enter into a valid state during the run-up to the final step of the proof. Then, on that final step, the assertions are applied and the proof succeeds or fails.

   If we could create a third type of assertion, let’s call it iassert for “induction assertion”, another approach might work. iassert() would be ignored during the base case prior to the induction. During the inductive step, iassert() would apply as an assumption for the first k steps of induction, and then get ignored on the last step, leaving behind all of the other assertions.

   Of course, this would only work if the first pass (submodule) verification evaluated iassert as though it were a true assertion.
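In terms of the macro sketch from earlier, the third approach amounts to never turning any submodule property into an assumption within the parent:

```verilog
`ifdef	SUBMODULE
`define	SUBMODULE_ASSUME(X)	assume(X)
`else
// In the parent, prove the submodule's assumptions hold ...
`define	SUBMODULE_ASSUME(X)	assert(X)
`endif
// ... but leave its assertions as assertions, never assumptions
`define	SUBMODULE_ASSERT(X)	assert(X)
```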
Perhaps you are wondering how likely it is that this flawed approach, discussed above, would result in such a false negative. While I haven’t been keeping score, I’ve seen problems in many of my proofs that had used this approach. The rate may even be as high as thirty percent or more, depending upon the design. It’s high enough that you cannot ignore it. Indeed, the proof I was working on this morning passed until I “fixed” it with one of the approaches above, revealing a failure that the swapped properties had been masking.
Conclusions
The bottom line is that the approach to formal aggregation that I presented earlier doesn’t work, leading instead to a false confidence that your “formally verified” design works.
I’m slowly converting my own logic to use approach 3 above, and I recommend that, if you’ve been using the swapped approach, you consider doing so as well.
A little leaven leaveneth the whole lump. (Gal 5:9)