Formally Verifying a General Purpose Ultra-Micro Controller
At long last, I have formally verified a DDR3 SDRAM controller. Sadly, I haven’t yet had the time to verify its operation on real hardware. Worse, the I/O isn’t quite there yet so I don’t expect it will work … yet.
For those who don’t know the story, I built a DDR3 SDRAM controller years ago. These were back in my stubborn days, before I was ever willing to use any vendor modules. I then built a Wishbone controlled DDR3 SDRAM controller, and attempted to instantiate it on my Arty board.
Let’s just cut the story short: it wasn’t a successful project.
There’s a world of difference between the logic used to drive the DDR3 SDRAM and the actual wires and protocol required by the DDR3 SDRAM. That world of difference lies in the byte-strobe signals, but that’s a story for another day.
Prior to this DDR3 SDRAM controller, I had built an SDRAM controller for my XuLA2 LX25 board. It wasn’t all that hard to do, and so I thought I would try my hand at a DDR3 SDRAM controller. Since this was a second generation Dynamic RAM (DRAM) controller for me, I came into this project knowing a thing or two regarding what had worked before and what hadn’t. (Or at least, so I thought.) I also came into the project expecting to run at a 200MHz system clock. Then it was a 100MHz clock, then an 80MHz clock, then … my motivation fell apart. This was to be expected in so many ways–especially since the project was never funded in the first place and it cost me several months of work.
Recently, however, I’ve had the opportunity to dig up the design again. Actually, I’ve had the opportunity to dig into both the SDRAM design and the DDR3 SDRAM design. I modified the SDRAM controller to handle ISSI’s SDRAM chip, as found on the new IceCore board from myStorm, and I also started working to port my controller to the DDR3 SDRAM chip found on Lattice’s ECP5 Versa Board. That is, until my project funding ran out again.
One of the big things that has changed between my early work with DRAM controllers and now is that I’ve discovered formal verification. Neither of my original cores were formally verified initially. Indeed, one of my early formal verification stories is of trying to verify the (non-DDR) SDRAM controller and discovering a corner case where the controller might’ve accessed the wrong piece of memory.
That would’ve been hard to debug.
Today’s discussion, however, focuses on a piece of logic found within each controller–something I’m going to call an ultra-micro controller.
The Purpose of the MicroController
There are actually two ultra-micro controllers found within each of the updated DRAM controller designs. One of them is responsible for the reset logic sequencing, and the other for the refresh logic. But I’m getting ahead of myself. Let’s start with the basic question, what is a “MicroController”?
Wikipedia currently defines a MicroController as
a small computer on a single metal-oxide-semiconductor (MOS) integrated circuit chip. In modern terminology, it is similar to, but less sophisticated than, a system on a chip (SoC); an SoC may include a microcontroller as one of its components. A microcontroller contains one or more CPUs (processor cores) along with memory and programmable input/output peripherals. … (Yes, Wikipedia’s definition even changed while I was writing this!)
This is way more sophisticated than what I am talking about today. That’s why I’m going to use the term “ultra-micro controller” today: I’m not talking about anything that would rank as high as a full-blown CPU, or even anything with external memory. Instead, I’m referring to something a little more sophisticated than a finite state machine, but not nearly as sophisticated as a full CPU. The ultra-micro controller I’m talking about today does nothing but follow a simple fixed script, setting a single register along the way. This script is the unique feature that separates it from a more traditional finite state machine.
I’m going to use this in two contexts initially, and probably many others later. These are the reset sequence and the refresh sequence of my DRAM controllers. Here’s the basic idea, as applied to the reset sequence: when you read through the data sheet for a memory chip, you’ll discover that there’s a very specific sequence of events that needs to take place from power up to the first memory access. This sequence includes timing requirements on the signals given to the chip. For example, many SDRAM chips require a PRECHARGE-ALL command (which initializes all of the internal memory banks to idle), followed by a REFRESH command and then finally a SETMODE command to set the settings within the chip, telling the chip how the controller will interact with it. Only after all three of these commands have been issued is the memory ready to use.
There’s one other key detail, and that is that those three startup commands, PRECHARGE, REFRESH, and SETMODE, all have particular timing relationships between them. For example, there may need to be an adjustable number of idle cycles between the PRECHARGE command and the following REFRESH command and so forth.
That’s one of the two ultra-micro controllers I needed for these two DRAM controllers.
The other one I’m using to handle the refresh logic. If you aren’t familiar with DRAM, then you need to understand first that it’s built out of capacitors. If the capacitor is charged, that particular bit of memory is a one; otherwise it’s a zero. Building memory out of capacitors makes it possible to build memories with very low area and low power. The problem with using capacitors to hold memory is that the charge within any capacitor will decay over time. It leaks. If the capacitors are not re-charged periodically, they will decay to zeros, losing any data you might have stored within them. To keep this from happening, the controller needs to follow an internal schedule of commands–periodically pulling the DRAM off line, idling all the RAM banks, issuing a REFRESH command, and then putting them back on line.
This is all well and good, but my problem was associated with the giant process it took to calculate the memory commands.
To see how bad it got, here’s a look at what my initial draft looked like:
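The original listing is too long to repeat here, but in much abbreviated, illustrative form (the signal names below are placeholders, not the actual design), the shape of it was something like this:

```verilog
// Abbreviated, illustrative sketch only -- not the original controller.
// Reset sequencing, refresh scheduling, and bus request handling all
// compete for the same command outputs within one giant always block.
always @(posedge i_clk)
begin
	if (!reset_done)
	begin
		// Walk through the power-up sequence: PRECHARGE, REFRESH,
		// then SETMODE, with the required idle cycles in between
	end else if (refresh_due)
	begin
		// Idle the banks, issue a REFRESH, wait, then resume
	end else if (i_wb_stb && !o_wb_stall)
	begin
		// Decode the bus request: ACTIVATE, READ, WRITE, etc.
	end
	// ... plus all of the timing counters feeding the checks above
end
```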
The problem with this giant always block was that the state machine it generated was so complex it struggled to pass timing at any decent clock speed. Placing three separate state machines into this same logic block, one for reset, one for refresh, and one for reading and writing the RAM was … just too much for the hardware to handle in the time allotted to it.
I needed to simplify my design. I needed to pipeline some of this logic. (Okay, switching to a proper FSM state variable would’ve helped some as well.)
Eventually, I decided that I should separate this maintenance logic–that’s the word I’m using for both reset and refresh logic–from the rest of this giant logic block. Along the way I noticed that all of the maintenance logic was independent of any actual bus requests. I could run it a clock ahead of when it actually needed to be used. In other words, I could pre-calculate maintenance commands before they would be used and so simplify things. For example, I could have one block of logic calculating the reset command sequence, together with a separate and independent block calculating the refresh command sequence, and then combine these two logic sequences together, as sketched below.
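In rough, illustrative form (the names here are placeholders for whatever each sequencer produces), that combination might look like:

```verilog
// Illustrative sketch only: merge the two precomputed maintenance
// streams into one command word and one flag for the main state
// machine to check.
reg		maintenance_mode;
reg	[3:0]	maintenance_cmd;

always @(posedge i_clk)
begin
	maintenance_mode <= (!reset_complete) || refresh_pending;
	if (!reset_complete)
		maintenance_cmd <= reset_cmd;	// From the reset sequencer
	else
		maintenance_cmd <= refresh_cmd;	// From the refresh sequencer
end
```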
Why would I do all of this? To simplify the giant always block, and to maintain a high clock speed. Now, instead of having multiple separate logic checks within one giant always block, I can instead just check for the flag maintenance_mode, which would indicate I wanted to issue either a command from the reset sequence or the refresh sequence.
This greatly simplified the logic, but I now needed special state machines for both reset and refresh sequences.
To make this work, I just wanted a simple core that would do nothing but walk through the various reset or refresh commands in the simplest logic possible.
Hence the need for this ultra-micro controller.
The microcontroller itself worked off of a very basic 3-stage pipeline, as shown in Fig. 2. First it would calculate an address.
It would then read a command from that position in the command sequence.
Finally, it would act upon that command, setting internal registers.
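Ignoring initialization, resets, and delays for the moment, a minimal sketch of those three stages might look like the following. (The names here match the ones used later in this article; the widths and the getcommand mapping are described further below.)

```verilog
// Stage 1: calculate the address of the next instruction
always @(posedge i_clk)
	r_addr <= r_addr + 1;

// Stage 2: read the command word found at that address
always @(posedge i_clk)
	r_cmd <= getcommand(r_addr);

// Stage 3: act upon that command by setting the output register
always @(posedge i_clk)
	o_cmd <= r_cmd[CMDWIDTH-1:0];
```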
Well, … not quite. Often the startup or refresh sequence would require delays between particular states. Let’s dig into that more in a moment.
For now, notice two things. First, this approach drastically simplified my processing. Second, getting this core to pass induction requires a little bit of a trick–I’ll share that in a moment.
Indeed, if it weren’t for the neat trick required to pass induction, I wouldn’t be writing this article. The core itself is embarrassingly simple.
Adding a counter
The biggest problem with the logic above is simply that the various commands also have delays and durations associated with them. For example, on ISSI’s SDRAM chip, nine clocks need to separate any REFRESH command from any command following. Our simple logic above would require eight separate NOOP commands in our sequence to capture that.
Alternatively, we could add a second type of instruction–a “wait” instruction, with an integrated number of wait clocks. This would then become an ultra-micro controller with two op-codes: the first for setting output registers, and the second for starting a wait counter. You can see these two codes shown in Fig. 3 on the right.
This would change our logic above slightly. For example, we’d only advance the address pointer to the next instruction if the delay counter had already finished counting down to zero.
Similarly, we’d only fetch the next instruction if the delay counter were zero as well.
We’d also only update the registers controlled by this sequence if the delay counter were zero, and the next command didn’t request a delay.
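In rough form, using a delay counter, r_count, described next, those three checks might look something like this sketch:

```verilog
// Only advance the address once any requested delay has expired
always @(posedge i_clk)
if (r_count == 0)
	r_addr <= r_addr + 1;

// Likewise, only fetch the next command once the delay has expired
always @(posedge i_clk)
if (r_count == 0)
	r_cmd <= getcommand(r_addr);

// Only update the outputs if no delay is in progress and the command
// isn't itself a wait instruction.  (The top bit of each word is
// assumed here to be the opcode bit.)
always @(posedge i_clk)
if ((r_count == 0) && !r_cmd[CMDWIDTH + LGDELAY])
	o_cmd <= r_cmd[CMDWIDTH-1:0];
```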
With all of that out of the way, the only new feature remaining is the counter itself. In this case, on any new delay instruction we set the counter to the value given in the instruction, otherwise if the counter is greater than zero we count down.
The counter really is that basic. Indeed, perhaps you’ve noticed that many of my formal verification quizzes started out with verifying the properties of a similar counter.
At least that’s the basic design. We’ll come back to it in a moment and fill in the details.
The Problem with Induction
If you’ve been reading my blog for a while, you may remember how I’ve described the problem associated with induction before. Induction is important enough that we can’t really skip it. Without induction you cannot formally verify anything beyond the reset sequence. Without induction, you cannot prove that the properties in your design never fail.
Well, okay, that’s a debatable presumption. I should point out that there are others who don’t use induction as regularly as I do. For example, Clifford Wolf, the author of the picorv32 RISC-V CPU, Yosys, SymbiYosys, and the riscv-formal set of formal properties for RISC-V CPU verification tends not to use induction as much. Indeed, a strong argument can be made for not using induction: it’s complex, hard to learn, it requires its own way of thinking, and for certain problems a bounded model check can be equivalent.
We can illustrate this with a basic CPU. Suppose we had a CPU with a 9-stage pipeline. Chances are we could fully verify it in 20 clock cycles. I’ll admit, I’m guessing with these numbers, but there is a mathematical way to determine this distance and I think I can argue that the minimum bounded check needs to be longer than a full operation. That said, I often find myself verifying cores with minimum cycle lengths that last longer than 256 cycles, such as these AXI MM2S or S2MM cores. Simply put, induction is an important tool, but it does have its place.
When you use induction, you will find that the traces show the appearance of having started the design somewhere between the initial state and eternity. However, this view of induction often frustrates engineers who use it, because they’ll point out that X must happen before Y gets set, but somehow Y is getting set without ever passing X. “But I initialized that value!” is a common refrain that I hear.
The problem is that this is not how induction actually works. Induction works by piecing together some number of logic steps for your core that don’t break any assertions, and then trying to break your assertions on the last step. It only offers the appearance that the state chosen is in the middle of time. In reality, the “initial” values in the induction trace are arbitrary–they could be anything, with the only requirements being that they don’t break any assumptions, and again that they don’t break any assertions until the last step.
As a result, because the induction step doesn’t start at the initial time step, constraining every register in the design becomes important. This includes the registers set by our ultra-micro controller.
To bring this point home, I had a couple of memory rules that weren’t passing induction. Things like, REFRESH commands that could only be issued once all the banks had been placed in their idle modes, or that PRECHARGE commands could only take place no earlier than some (configurable) number of clock cycles after ACTIVATE commands, etc.
The particular problem is common among pipelined processes: Unless the various pipeline stages are so constrained, the induction engine might start in a state where the stages have no relationship to each other. The ultra-micro controller might load a command to be decoded, and then the induction engine starts with a command in that register that’s not in the command set. Or it might start outputting a value that nothing in the command set would allow. Or it might do both.
Induction can seem kind of strange in that way.
The way we’ll solve this induction problem is by walking a microcode address through the various pipeline stages of processing. We’ll use one value to describe the address pipeline stage, f_addr, one for the command stage, f_cmd, and one to describe the currently active outputs, f_active–the final pipeline stage. These three are all addresses into our instruction stream.
We’ll set all three of these pipeline-addresses to zero initially and again upon any reset.
In all other cases, we’ll increment the address whenever we step the ultra-micro controller forward by one step.
The next step is to convert these addresses to commands. We can use a large parameter to store these commands. If we have NCMDS commands, each WORDWID bits wide, this might look something like the sketch below.
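Since the original listing isn’t reproduced here, the following is just an illustrative sketch, with four made-up command words and my own { opcode, delay, output bits } layout:

```verilog
// Illustrative only: NCMDS command words, each WORDWID bits wide,
// packed into one wide parameter.  Command zero sits in the lowest bits.
parameter	CMDWIDTH = 4, LGDELAY = 3,
		WORDWID = 1 + LGDELAY + CMDWIDTH,
		NCMDS = 4;
parameter [NCMDS*WORDWID-1:0]	COMMANDS = {
		{ 1'b0, 3'd0, 4'b0111 },	// Last command (address 3)
		{ 1'b1, 3'd6, 4'b0000 },	// Wait 6+1 extra cycles
		{ 1'b0, 3'd0, 4'b0010 },	// Set the outputs to 4'b0010
		{ 1'b0, 3'd0, 4'b0111 } };	// First command (address 0)
```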
We can then use a simple Verilog function to turn this command address into its associated command.
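One possible way to write such a function, continuing the sketch above, is to shift the requested word down to the bottom and let the width of the result discard everything above it:

```verilog
localparam	LGNCMDS = 2;	// log2(NCMDS); NCMDS is a power of two

function [WORDWID-1:0]	getcommand;
	input	[LGNCMDS-1:0]	addr;
	begin
		// The assignment to a WORDWID-bit result truncates away
		// every bit above the requested command word
		getcommand = COMMANDS >> (addr * WORDWID);
	end
endfunction
```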
While I don’t normally use functions, in this case the function is the key to this whole algorithm. Why? Because I can now reuse the function throughout to map addresses to commands.
Because there is a single common mapping from address to command, used by both the logic of the core and the logic of the formal properties, I can assert that these copies of the various commands actually match what each stage of the algorithm should be doing.
Only one other big step is required: we’ll need to constrain all of these addresses with respect to each other. You know, the f_cmd address must follow f_addr, and f_active must follow f_cmd.
Of course, this is only an overview. The actual logic, below, corrects several details. For example, assert(f_cmd + 1 == f_addr); will fail if f_cmd is about to roll over to the next bit. Similarly, my reset sequence needs to be run only once starting from a reset command, whereas the refresh sequence needs to repeat over and over forever. The main core of the logic, however, remains as simple as it always was.
Shall we take a peek at how this is done?
MicroController Logic
Let’s look over the basic logic of this really simple ultra-micro controller.
First, since I only want to build this once, I’ve chosen to make it highly configurable. The core accepts a parameterized number of command bits and again a parameterized number of delay bits.
Similarly, the core accepts a power of two length command set.
Every word in our command set needs to have at least CMDWIDTH bits, plus LGDELAY bits, plus one more bit for the opcode if LGDELAY indicates we’ll use a delay counter in our opcode.
We can now create a fairly arbitrary set of commands using this format.
We’ll also insist that the first command of this sequence–that’s the one written at the bottom of the command listing, at address zero–start with an output-register opcode.
Ideally, I’d like to formally verify this core against all possible command sets. However, using a parameter to set these commands really keeps me from doing this. Why then would I use a parameter to define the commands?
For a couple of reasons. First, unlike memories (and even ROM memories!), parameters cannot change their values during induction. Second, I am depending upon the properties of the resulting sequence to then pass verification once this core is incorporated into another as a submodule. My goal is to avoid turning this module into a black box that spits out unconstrained values. The parent module will depend upon these outputs and their sequencing for a successful formal proof.
We’ll also pick, arbitrarily, that the first command will set an initial state. All values throughout our processing chain will then get reset to this initial state on any reset.
Finally, we’ll support two options: OPT_REPEAT and OPT_DELAY. OPT_REPEAT, if set, will indicate that we want to keep repeating these commands over and over–such as the refresh cycle. If OPT_REPEAT is clear, we’ll go through this sequence once and stop on the last step–as the reset sequence will want to do. OPT_DELAY, on the other hand, is a simple shorthand for checking whether or not we have one or more bits allocated to the delay counter. If OPT_DELAY is clear, the counter will remain at zero and all commands will proceed at the same pace.
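Pulling those parameters together, the port list might look something like the following sketch. The module name, the i_clk port, and the exact { opcode, delay, output bits } word layout are my choices for illustration here, not necessarily the original core’s.

```verilog
module	seqmicro #(
		parameter	CMDWIDTH = 4,	// Bits in o_cmd
		parameter	LGDELAY  = 3,	// Bits in the delay counter
		parameter [0:0]	OPT_REPEAT = 1'b0, // Loop forever if set
		parameter	NCMDS    = 8,	// A power of two
		parameter	LGNCMDS  = 3,	// log2(NCMDS)
		// Simplification: this width assumes LGDELAY > 0
		parameter	WORDWID  = 1 + LGDELAY + CMDWIDTH,
		parameter [NCMDS*WORDWID-1:0]	COMMANDS = 0
	) (
		input	wire			i_clk, i_reset,
		output	reg	[CMDWIDTH-1:0]	o_cmd
	);

	// OPT_DELAY is just shorthand for "we have delay bits at all"
	localparam [0:0]	OPT_DELAY = (LGDELAY > 0);

	// ... (body as developed below)
```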
With that aside, we can move into the design itself.
Our first task will be to create a global step signal, w_step. If this signal is ever true, all of our various states will step forward. Our goal will be to set w_step so that it is only ever true if the counter is zero, so that it can replace the r_count == 0 logic check above and simplify our logic.
So here’s how this works. We’ll start with the wait count kept in r_count. We’ll set this to zero initially, so that the first command can always contain the bits to output on any reset.
While the sequence might start with a delay opcode, this turns out to be rather problematic, since it leaves the output registers undefined both initially and following any reset. Disallowing this therefore simplifies our initial and reset logic.
The reset should always bring this counter back to this same original state.
Every time the counter reaches (or stays at) zero, we’ll advance to the next instruction. If that instruction, kept in r_cmd, contains a delay instruction then we’ll set the counter to that delay, otherwise we’ll keep it at zero. (Remember, w_step is equivalent to r_count == 0.) Finally, if r_count != 0, we’ll decrement it until it does equal zero.
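In sketch form, again assuming LGDELAY > 0 and my illustrative word layout with the opcode in the top bit, the counter might be written as:

```verilog
reg	[LGDELAY-1:0]	r_count;
// The word about to be loaded into r_cmd on this step.  (w_step is the
// global step signal, logically equal to (r_count == 0).)
wire	[WORDWID-1:0]	w_next = getcommand(r_addr);

initial	r_count = 0;
always @(posedge i_clk)
if (i_reset)
	r_count <= 0;
else if (w_step)
begin
	// Advancing: if the instruction we are advancing to is a wait
	// instruction, load its delay; otherwise stay at zero
	if (w_next[CMDWIDTH + LGDELAY])
		r_count <= w_next[CMDWIDTH +: LGDELAY];
	else
		r_count <= 0;
end else
	// Otherwise, count any pending delay down towards zero
	r_count <= r_count - 1;
```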
In order to stay off of the critical timing path, we’ll set w_step so that it’s equivalent to r_count == 0. That means that its logic will (roughly) mirror that of r_count above, with only a few subtle differences.
While the logic above is fairly straightforward, getting it right–and convincing yourself that it’s right–is valuable. For this, we’ll set up a single formal property to make certain that w_step is truly equivalent to r_count == 0.
We can also check the counter against the currently active instruction. If the instruction is not a delay OpCode, then the counter must be zero.
Otherwise it must be less than the delay specified in the current command.
Finally, if the number of delay bits is zero, LGDELAY==0 and so !OPT_DELAY, then the w_step signal should always be high.
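Written out, those checks might look like the following sketch (assuming LGDELAY > 0 so the part selects below are legal, and keeping my assumed opcode-bit position):

```verilog
`ifdef	FORMAL
	// The step signal must always mirror the zero check on the counter
	always @(*)
		assert(w_step == (r_count == 0));

	// Check the counter against the currently fetched instruction
	always @(*)
	if (!OPT_DELAY || !r_cmd[CMDWIDTH + LGDELAY])
		// Not a delay opcode: no delay may ever be pending
		assert(r_count == 0);
	else
		// A delay opcode: never count beyond the requested delay
		assert(r_count <= r_cmd[CMDWIDTH +: LGDELAY]);

	// With no delay bits at all, the design must step on every clock
	always @(*)
	if (!OPT_DELAY)
		assert(w_step);
`endif
```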
That leaves three primary steps left to this algorithm.
The first is to calculate the next command address. Because we have chosen a super simple command set, there are no jumps, no pipeline stalls, nothing but the next address. That means we can keep this logic as simple as initializing the address to zero and then perpetually incrementing it every time we step forward.
That would work if we wanted to repeat this logic forever–such as with the refresh sequence. If we want to stop, though–such as at the end of the reset sequence–then we’ll need to recognize a last address and stop incrementing r_addr once we get there. In the logic below, this last address is captured by (&r_addr)–the address with every bit set. If we are not repeating, we’ll stop at this address.
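A sketch of that address logic:

```verilog
reg	[LGNCMDS-1:0]	r_addr;

initial	r_addr = 0;
always @(posedge i_clk)
if (i_reset)
	r_addr <= 0;
else if (w_step)
begin
	if (OPT_REPEAT || !(&r_addr))
		r_addr <= r_addr + 1;
	// else: one-shot mode, so saturate at the last address
end
```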
The second of the three remaining steps is to read the command from our parameter set. Other than the initialization and reset logic, this is very straightforward.
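In sketch form:

```verilog
reg	[WORDWID-1:0]	r_cmd;

initial	r_cmd = COMMANDS[WORDWID-1:0];	// The first command word
always @(posedge i_clk)
if (i_reset)
	r_cmd <= COMMANDS[WORDWID-1:0];
else if (w_step)
	r_cmd <= getcommand(r_addr);
```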
Finally, the last step is to act upon the command. For this core, we’ll act upon it by setting a number of bits in our output, o_cmd, based upon the values captured by the command.
Or, at least that’s the thought.
The problem is that we only want to set our output bits on a command opcode and not on a wait opcode. So we now need to check if either we aren’t implementing the wait opcode, or if the next command isn’t a wait command, before setting our output bits based upon the instruction.
But what should our initial and reset value be? Ideally, we’d want to set o_cmd initially to whatever was in the first instruction and then update it on every step following.
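Put together, the output logic might look like this sketch:

```verilog
initial	o_cmd = COMMANDS[CMDWIDTH-1:0];	// The first command's output bits
always @(posedge i_clk)
if (i_reset)
	o_cmd <= COMMANDS[CMDWIDTH-1:0];
else if (w_step && (!OPT_DELAY || !r_cmd[CMDWIDTH + LGDELAY]))
	// Act on the command, but only if it's a set-output opcode,
	// never a wait opcode
	o_cmd <= r_cmd[CMDWIDTH-1:0];
```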
We’ve already discussed the getcommand function above, so we don’t really need to discuss it more here.
Voila! That’s the logic to our ultra-micro controller: 1) calculate an instruction address, 2) read the instruction, and then 3) do what it tells you to.
Hopefully this logic appears quite straightforward. It’s supposed to be. If not, feel free to review the Verilog design yourself to see how it works. It’s supposed to be a very simple and reusable logic structure.
Formal Verification
That’s the basic logic associated with this simple ultra-micro controller, now let’s take a look at the formal properties necessary to force the induction engine into a consistent state.
There are two things to note before starting: First, this ultra-micro controller has only one input, i_reset. That’s it. That means we don’t need to make any assumptions about how the various inputs might relate to each other.
Second, our basic approach will be to shadow the ultra-micro controller’s pipeline with a set of registers used only for formal verification–a packet of information if you will. This is very similar to the way riscv-formal works: you form a packet of information describing each instruction as it goes through the pipeline. In this case, however, we’ll draw our assertions about that packet as it works its way through the pipeline rather than just at the end as riscv-formal does.
I’m also going to use the abbreviations f_addr, f_cmd, and f_active to describe the three different pipeline stages, while prior will be used to describe any stage prior to active. Therefore, f_active_cmd will refer to the command currently being carried out, whereas f_cmd_cmd will refer to the command we are going to interpret next, and so forth. This will allow us to assert, in a moment, that the output command bits are equal to either the low order bits of f_active_cmd, or the low order bits of the prior command, f_prior_cmd.
Our first step will be to get a copy of the commands that should be executing in each of the various stages of processing–as outlined in Fig. 4 above.
As discussed above, we’ll shadow the next command address with f_addr. Then, on each step, we’ll push one command in and through the pipeline.
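As a sketch, inside the formal section that might become:

```verilog
	reg	[LGNCMDS-1:0]	f_addr, f_cmd, f_active;
	wire	[WORDWID-1:0]	f_cmd_cmd, f_active_cmd;

	initial	{ f_addr, f_cmd, f_active } = 0;
	always @(posedge i_clk)
	if (i_reset)
		{ f_addr, f_cmd, f_active } <= 0;
	else if (w_step)
	begin
		// Each stage's shadow address follows the stage before it
		if (OPT_REPEAT || !(&f_addr))
			f_addr <= f_addr + 1;
		f_cmd    <= f_addr;
		f_active <= f_cmd;
	end

	// The commands each stage should be working on
	assign	f_cmd_cmd    = getcommand(f_cmd);
	assign	f_active_cmd = getcommand(f_active);

	// The shadow must track the design: same address, same command
	always @(*)
		assert(f_addr == r_addr);
	always @(*)
		assert(r_cmd == f_cmd_cmd);
```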
One of the hassles of dealing with addition wrapping in a finite number of bits is that value + 1 uses one more bit than value rather than wrapping. To make certain that adding one to a value causes the sum to remain within the right number of bits, we need to force it back to the bit-width of our address.
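Hence fn_cmd and fn_active, each declared just wide enough to hold an address so that the sum wraps:

```verilog
	wire	[LGNCMDS-1:0]	fn_cmd, fn_active;

	// Adding one would normally grow by a bit; assigning the sums to
	// LGNCMDS-bit wires truncates them back down, so they wrap instead
	assign	fn_cmd    = f_cmd    + 1;
	assign	fn_active = f_active + 1;
```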
Now that I have that, I can work through how the addresses should exist in relationship to each other. Let’s start with the case where the ultra-micro controller goes through its instructions in a one-shot (non-repeating) manner.
We start out with everything at zero, so if f_addr is still at zero then everything else must still be at zero. Otherwise, if the f_cmd stage is still at address zero, then we are one step further–having incremented f_addr but not yet propagated that incremented value to f_cmd.
Now let’s turn to look at the other end. If f_active is all ones, then it’s saturated at the last instruction, as has everything before it.
If f_active hasn’t yet saturated, but f_cmd has, then f_active will saturate on the next clock. Of course, f_cmd can’t get to the last value unless f_addr has already done so, so f_addr must be on the last address as well. Finally, notice how we can check f_active+1 against f_cmd without worrying about overflow, since f_cmd is already the maximum positive number and f_active one less.
If none of those special cases are true, then we must be somewhere in the middle: f_addr == f_cmd + 1 == f_active + 2.
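One possible rendering of that chain of cases, as a sketch:

```verilog
	// One-shot mode: the stages either haven't started moving yet,
	// have saturated at the last address, or sit exactly one apart
	always @(*)
	if (!OPT_REPEAT)
	begin
		if (f_addr == 0)
			assert((f_cmd == 0) && (f_active == 0));
		else if (f_cmd == 0)
			assert((f_addr == 1) && (f_active == 0));
		else if (&f_active)
			assert((&f_cmd) && (&f_addr));
		else if (&f_cmd)
			assert((&f_addr) && (f_active + 1 == f_cmd));
		else
			assert((f_addr == f_cmd + 1)
				&& (f_cmd == f_active + 1));
	end
```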
The case where we repeat is a touch more interesting. In this case, f_active will remain equal to f_cmd for the first two initial clock cycles. In the first cycle, f_addr == f_cmd == f_active == 0. In the second cycle, f_addr == 1, while f_cmd == f_active == 0.
Once we get past those two first cycles, we then know that f_addr == f_cmd + 1 and f_cmd == f_active + 1, with the exception that the math will wrap around after the last instruction. This was why we created fn_cmd and fn_active, and then limited them to the bare width of any address.
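As a sketch, that check might be expressed as one three-way assertion:

```verilog
	// Repeating mode: either one of the two start-up states, or each
	// stage is exactly one (wrapped) step behind the one feeding it
	always @(*)
	if (OPT_REPEAT)
		assert(
			// First cycle out of reset
			   ((f_addr == 0) && (f_cmd == 0) && (f_active == 0))
			// Second cycle out of reset
			|| ((f_addr == 1) && (f_cmd == 0) && (f_active == 0))
			// Steady state, using the wrapped sums
			|| ((f_addr == fn_cmd) && (f_cmd == fn_active)));
```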
Now, what about our outputs? Realistically, that’s what we want to guarantee, that we have the right outputs. Those are kept in o_cmd, and consist of the bottom several bits of any command word.
Here’s where all of our address work pays off. Because we’ve set f_active_cmd to be equal to the instruction at f_active, we can assert that o_cmd matches the bits in f_active_cmd.
Well, not quite. What if the command in the last pipeline stage is a delay command? Okay, so as long as it’s not a delay command we can make this assertion.
But what if it is a delay command? In that case, our outgoing command bits must match those of the prior command. As before, though, that’s only if the prior command isn’t also a delay. Since it’s not likely that we’ll ever want to place two delays in a row, this should be good enough.
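In sketch form, using a prior address one step behind f_active:

```verilog
	// The command retired just before the currently active one
	wire	[LGNCMDS-1:0]	f_prior;
	wire	[WORDWID-1:0]	f_prior_cmd;

	assign	f_prior     = f_active - 1;	// Wraps within LGNCMDS bits
	assign	f_prior_cmd = getcommand(f_prior);

	always @(*)
	if (!OPT_DELAY || !f_active_cmd[CMDWIDTH + LGDELAY])
		// A set-output command: the outputs must match it
		assert(o_cmd == f_active_cmd[CMDWIDTH-1:0]);
	else if (!f_prior_cmd[CMDWIDTH + LGDELAY])
		// A wait command: the outputs still hold the prior command
		assert(o_cmd == f_prior_cmd[CMDWIDTH-1:0]);
```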
There’s one other assertion we should make, and that is that the initial command word is going to set o_cmd rather than be a delay instruction. So let’s assert that this initial opcode is a set-bits op-code; that way we’ll know that we always have a good set of output bits from the initial state or the reset state forward.
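As a sketch:

```verilog
	// The very first command must be a set-output opcode, so o_cmd
	// is well defined from the initial or reset state onward
	wire	[WORDWID-1:0]	f_first_cmd;
	assign	f_first_cmd = COMMANDS[WORDWID-1:0];

	always @(*)
	if (OPT_DELAY)
		assert(!f_first_cmd[CMDWIDTH + LGDELAY]);
```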
The Reset Sequence
Before leaving this topic, let’s take a quick peek at how this ultra-micro controller gets used in an SDRAM controller. In particular, within this controller for the iCE CORE board, you have this piece of logic.
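The actual table is not reproduced here. The sketch below shows only its shape, using this article’s illustrative { opcode, delay, output bits } word layout; the placeholder XBITS stands in for the extra output bits (the precharge-all address bit, the sequence-complete flag, and so on), and the walkthrough below describes what the real table contains.

```verilog
// Shape only: illustrative values, read from the bottom (executed first)
// towards the top (executed last)
parameter [NCMDS*WORDWID-1:0]	RESET_SEQUENCE = {
	// ... the second REFRESH, another wait, the SETMODE command, and
	// the release of the "sequence complete" bit continue up here ...
	{ 1'b1, 3'd6, XBITS, CMD_NOOP      },	// Wait 6+1 cycles
	{ 1'b0, 3'd0, XBITS, CMD_NOOP      },	// Set the outputs to a NOOP
	{ 1'b0, 3'd0, XBITS, CMD_REFRESH   },	// Issue a REFRESH
	{ 1'b1, 3'd1, XBITS, CMD_NOOP      },	// Hold that NOOP one more cycle
	{ 1'b0, 3'd0, XBITS, CMD_NOOP      },	// Return to a NOOP
	{ 1'b0, 3'd0, XBITS, CMD_PRECHARGE },	// PRECHARGE, all banks
	{ 1'b0, 3'd0, XBITS, CMD_NOOP } };	// Executed first: a NOOP
```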
The most difficult thing about reading this is that the commands start from low addresses and work their way up to higher addresses. The second challenging part is to realize that a command followed by a delay of zero issues the same command twice, and that a command followed by a delay of one issues it three times–so the actual number is the delay value plus two. Now, reading from the bottom up, we have:
- The first command is a NOOP, CMD_NOOP. It lasts for one cycle.
- The second command is a PRECHARGE command, CMD_PRECHARGE. The final bit, 1'b1, controls the eleventh address bit, indicating that this command precharges all of the banks of the SDRAM.
- We then return to a NOOP command.
- We maintain that NOOP command for a one cycle delay.
- Then we issue a REFRESH command, CMD_REFRESH.
- Set the outputs to a NOOP, and
- Wait 6+1 cycles. (Total 8-cycles of NOOP)
- Issue a second REFRESH command, CMD_REFRESH.
- Another NOOP.
- Wait for 6+1 cycles again.
- Issue a SETMODE command, CMD_SET_MODE.
- Another NOOP.
- Finally, that bit just more significant than the 4-bit SDRAM command gets released. This is the flag indicating to the rest of the logic that the reset sequence is complete.
For more information on the meaning of these various commands, you should check out the data sheet for the ISSI SDRAM. In general, this sequence just follows directly from the one listed in the data sheet.
Notice also that all I’ve done is to script the startup sequence described in the SDRAM spec directly into the design. Framing the problem in terms of this ultra-micro controller made it fairly easy to do.
That said, perhaps you noticed two confusing things.
- The instructions read backwards. I may wish to change this in the future, but for now that’s how the controller works.
- If you want to issue 7 cycles of NOOPs, you have to issue two commands. The first command switches the outgoing reset_command to a CMD_NOOP, and the second command waits. As built, this is not one command. Worse, a wait cycle of ‘0’ will cause the controller to issue two NOOP’s, not zero and not one: one for the command, and another for the delay instruction.
The Refresh Sequence
The other sequence I used this ultra-micro controller for is the refresh sequence. In many ways, the refresh sequence is very similar to the reset sequence. The biggest difference is that the refresh sequence repeats.
Before diving into this sequence, let me just say that SDRAM refresh control can be much more complicated than what I’m presenting here. A “smart” SDRAM controller might issue more refreshes any time the chip is idle. That way the SDRAM doesn’t necessarily need to be interrupted when busy–or at least not nearly as often.
I’m not going to overthink this today. In this case, we’ll use our ultra-micro controller to drive the refresh logic.
This implementation is a little different: the refresh command is placed at the low order bits. The next two most significant bits are a refresh stall, and a refresh active bit. In this case, refresh_stall is an instruction to the bus controller to stall any bus requests that might be coming into the controller. That allows us to flush any active requests through the processing pipeline until the refresh commands become active and start to override the memory control.
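For reference, unpacking the output of such a refresh sequencer might look like the sketch below; refresh_word stands in for its o_cmd output, and the exact bit ordering and the refresh_active name are my assumptions.

```verilog
wire	[3:0]	refresh_cmd;	// The SDRAM command itself
wire		refresh_stall;	// Stall any incoming bus requests
wire		refresh_active;	// Override the normal memory control

assign	{ refresh_active, refresh_stall, refresh_cmd } = refresh_word[5:0];
```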
Let’s just scan through this command set briefly, skipping some commands in our broad-brush pass through it:
- We start in reset. As long as the reset sequence is active, we hold the controller in reset.
- Once we come out of reset, we’ll issue a NOOP. Remember, this is also the initial value that we set our results to during the reset cycle as well. More importantly, we let the design run normally following the reset without overriding any normal operation. We’ll do this for CK_REMAINING+2 clock cycles. (The formula for this duration is beyond the scope of this article, but easily verifiable from within the core.)
- When it’s finally time to start our refresh cycle, we’ll force the Wishbone interface to stall. We then wait CK_WAIT_FOR_IDLE+2 cycles for any commands in process to flush through our interface and the bus to come to a halt.
- Then we issue a PRECHARGE (all) command to close any active memory banks. This takes CK_PRECHARGE_TO_REFRESH+2 clock cycles to complete, so we issue a CMD_NOOP during these cycles.
- We can finally issue the REFRESH command. However, we can’t immediately go back to using the interface. Instead, we need to wait CK_REFRESH_NOOP_TO_ACTIVATE+2 cycles before the memory is (almost) ready to accept a command.
- The next command clears any maintenance-induced bus stalls, and it also clears our refresh cycle, allowing the rest of the memory control to return to bus control.
- Sadly, though, we need to go through all of the listed commands–even though we stopped early. Therefore, we issue idle NOOP’s until we reach the end of our list before starting over.
As with the reset sequence, most of this follows from the ISSI SDRAM specification. It’s timing driven, and specifically driven to issue one refresh at a time every N cycles as determined from the specification.
Conclusion
While I’m not going to argue that this “ultra-micro controller” forms a general purpose computer, it does have many of the attributes of a more general purpose computer: It follows a pre-defined “program” or script of instructions. Each instruction has an “OpCode”. In this case, the “OpCode” was either to output a series of bits or to wait for a given count. Unlike more general purpose computers, this “ultra-micro controller” has no ability to add, subtract, test values, or branch. Still, it was good enough for today’s purposes to be reused in two different DRAM controllers in two different ways.
Not only that, this simple example makes a great illustration of how you can create a packet of information as instructions work their way through a CPU’s pipeline, and assert consistency with that pipeline along the way. This is a very useful pattern that you will likely see time and again when formally verifying pipelines.
And she said to the king, It was a true report that I heard in mine own land of thy acts and of thy wisdom. Howbeit I believed not the words, until I came, and mine eyes had seen it: and, behold, the half was not told me: thy wisdom and prosperity exceedeth the fame which I heard. (1 Kings 10:6-7)