Generating more than one bit at a time with an LFSR
This is now our third post on Linear Feedback Shift Registers (LFSRs). Our first post examined how to generate a Linear Feedback Shift Register (LFSR) in Verilog, and our second post walked through an example of a 5-bit LFSR. However, neither of these developments have solved the problem I had initially.
As you may recall, I wanted to use an LFSR to do channel estimation. My intention was to use a setup like Fig 1 to the right. My plan is to transmit pseudorandom bits out of an FPGA output pin at the fastest speed I can: 950 Mbps on my Artix-7 Arty FPGA board. I’ll then receive the bits at the other end of a 12” pmod cable representing my channel, and examine the waveform at the other end to get an estimate of the channel throughput and distortion.
Indeed, if all goes well I should be able to apply Shannon’s Capacity theorem to determine the maximum speed of the channel.
To do this, though, I need a source of pseudorandom bits. Worse, our last attempt at generating pseudorandom bits only generated one bit per clock, and I will need several bits per clock in order to drive an output serializer at high speed.
So let’s return to our Fibonacci LFSR generator and see if we can modify it to produce more than one output per clock period. We’ll keep the setup identical to our first post, so the only thing that needs to change today is the number of outputs bits we need to generate. We’ll start with describing how we’ll go about getting these extra bits, and then discuss the code that implements this. We’ll also do one more: let’s formally prove at the end of our development, that our resulting implementation actually works.
Getting that next bit
Our task is create an
LFSR
that produces WS
bits at a time–rather than just one. (WS
= Word Size)
The question, though, is how shall we do this?
We’ll use the
example
LFSR
we presented earlier
for discussion and as an example along the way. You can see
this example
in Fig 2 at the right. It’s a simple five stage
LFSR,
with a feedback equation defined by TAPS=5'b00101
.
Let’s begin our development by imagining an infinite stream of (constant)
bits in our shift register, sreg
. Each of these bits satisfies the
relationship given by the Fibonacci
GF(2) equation
we started with. When we presented this equation, we had bits MSB:0
defined
and we just needed to calculate the next bit, MSB+1
,
You may also remember, from the discussion of an
example
LFSR, that these
MSB:0
bits have no required relationship between them–save that they cannot
all be zero.
Now let’s see what it will take to calculate bit MSB+2
. We can start with
our equation for the next bit, and apply it to bits [MSB+1:1]
to get bit
MSB+2
,
We’ll have to get rid of the reference to sreg[MSB+1]
, though, before this
equation will be useful for us.
To get there, let’s split this new equation into two parts.
The first part will be due to the bits in sreg[MSB:1]
. We can represent
this part as sreg[MSB:0]&(TAPS<<1)
. This represents the bits that we already
know. You can see this how this would affect our
example
LFSR
in Fig 3. If you compare Fig 3 with Fig 2, you can see how the taps to
generate the next bit are the same as the ones shown in Fig 2, save only that
they’ve been moved one stage to the left.
The LFSR in our example is simple enough that we can apply this same technique to even get a second bit from the taps we were given, as Fig 4 below shows.
Where the example falls apart is when you need to reference a bit that isn’t present in the original shift register. For that, we need to move to the second part of our equation.
This second part, sreg[MSB+1]&TAPS[MSB]
will
need some work. In this case, though, we already have an equation for
sreg[MSB+1]
(above). Therefore we can substitute that equation for
sreg[MSB+1]
into our equation for sreg[MSB+2]
in order to get a new
expression for sreg[MSB+2]
that depends only upon sreg[MSB:0]
,
Perhaps a picture will explain this better. Consider Fig 5 below.
In this figure, you can see the remains of the taps that were being shifted
to the left, and a broken reference to a bit that isn’t in our set from
MSB:0
. However, instead of adding the bit we don’t have, we instead
add the equation for that bit. The result is that our new bit, in this
example
LFSR
depends upon three bits from our shift register, instead of just the two.
Let’s simplify this expression a touch further, though.
If we take another look at the equation above, we can now use the distributive
property
to collect our terms. Specifically, we’ll factor out the sreg
term to the
left, and the values multiplied by the sreg
vector together into a new
term on the right,
In this equation, the portion of our expression to the right of
the &
looks very much like an expression similar to our expression for
sreg[MSB+1]
. Specifically, this right half expression looks like a vector
that, when taken as an inner product with sreg
, produces our result–just
like the TAPS
vector did for sreg[MSB+1]
. Even better, this vector
is a constant–depending only upon the TAPS
parameter. Let’s use this.
We’ll define tapv[0]
to be our TAPS
, and then tapv[1]
to be this value.
We can then repeat this derivation. Doing so will reveal a formula for
tapv[k+1]
based upon tapv[k]
. In particular,
This tapv
array is the key we need to build our code below.
A Multi-Step LFSR Implementation
Now that we have an equation for future output bits, it’s time to build our multi-step shift register. Feel free to follow along in the code we’re creating here, as we’ll only discuss the basic highlights below.
Our goal is to extend our origianal
Fibonacci
LFSR
code
to output WS
bits per clock, where WS
is given by a parameter.
This means we’ll need to calculate WS-1
more bits than we did the
last time. It also means
that our shift register, which only used to be LN
bits long, will now need
to have LN+(WS-1)
elements in it. Of these, we’ll use the bottom WS
bits
as our output bits.
But I’m getting ahead of myself. Before we can get to o_word
, there’s a
lot of other work to be done first.
We’ll start with the equations we just worked out in the last section. We’ll
place these tapv[]
values into an array of WS
elements, each as
wide as our underlying
LFSR,
or LN
bits long.
The first of these equations is the one we’ve used before, tapv[0]
,
From here we can use the recursive equation derived above to get the rest,
If you didn’t manage to follow the development of these equations above, don’t worry. We’ll “prove” these equations work below in the next section.
The next trick is the reset value.
You may remember before, when we had one new bit per time step, that we set
our state space to an INITIAL_FILL
value.
This reset value then dictated the first LN
bits out of our device.
This won’t quite work for our multi-step core. First, the INITIAL_FILL
vector needs to have WS-1
additional elements to it. Then, to make matters
worse, all of those bits need to maintain the
LFSR
relationship between them. So while the statement above works for the first
LN
bits, we’ll still need to determine the next WS-1
bits.
We can get these next several bits of the reset_value
by running the first
several bits of the reset_value
through our
Fibonacci
LFSR
equation. This is a zero cost operation: every part of this equation
is known and constant. Therefore the synthesizer can simplify the code
before our core ever gets mapped to logic.
While this works well to develop reset_value
, it doesn’t work as well
as an initial value for our sreg
. We’d like to say,
However, this only works with some synthesizers, such as
yosys,
and not with others, like
Verilator.
For those others, we’ll set sreg
initially to
INITIAL_FILL << (WS-1)
, and then
use reset_value
for any subsequent resets. This will still give us the
same sequence, with the only problem being that the first WS-1
values
will be zero instead of those associated with the fill.
Now that we have our tap equations, tapv[]
, and our reset_value
, we can
now move on to the state register itself. In the
typical
LFSR,
all but one bit of the new state register is known. These known bits are
given by shifting the register one step to the right, whereas the last bit
is generated by the TAPS
equation.
Let’s do the same thing here and shift these known bits first,
Those are the easy bits.
After those easy bits, our first next bit is determined by the original Fibonacci LFSR equation. That’s the one we built into our first Fibonacci LFSR module.
The rest of the bits are calculated in the same fashion, with the exception
that the equation for them, in particular the tapv[k]
coefficients,
is different from one bit to the next. So, our next step is then to
walk through those extra bits applying the tapv[k]
equations as appropriate
to generate each new bit.
Did you notice how we also cycled through the various tapv[k]
expressions?
This is just implementing the equations we calculated above.
That’s it. It’s still simple, but there is just a little more simple involved in this version over the last version.
Put together, we just calculated LN+WS-1
bits. LN-1
of these are
calculated the same as before–by shifting the shift register. The next bit
is still calculated by our original TAPS
equation, just like we did with
our initial Fibonacci implementation.
The last WS-1
bits were then calculated in the
exact same way with the only exception being that the equation, tapv[k]
,
changed on a bit-by-bit basis.
But, let’s come back to our earlier question, will this really “work”? Will all these equations really produce the sequence we want?
To answer that question, let’s see if we can get the computer to “prove” that this new form works.
Applying Formal Methods
I discussed the basics of using
formal methods
using yosys-smtbmc in a
previous post.
The basic idea behind
formal methods
is to define a state space, and within it the space of invalid states.
The state space is initially defined by all of the register values within
your design. It is restricted further, made smaller that is, with assume
statements as necessary. Invalid states are defined as well, but this time
using the assert
statement. We’ll use these two statements, assume
and
assert
with the ultimate goal of
proving
that the logic above will never enter into an illegal state.
These formal methods are particularly appropriate for this multi-bit LFSR design, since LFSRs are so mathematically based. In particular, the output of the LFSR needs to strictly satisfy a mathematical equation–one that the formal equation solver can verify for us.
As before,
we’ll use yosys
to generate a list of properties in a
format that
yices can understand. When processing a file in
this manner, yosys will define the FORMAL
pre-processor directive for us. That allows us to begin our FORMAL
verification section with an ifdef
,
Our plan will be to then use yosys-smtbmc to then drive the yices theorem prover to prove that our multi-step LFSR actually works.
The first step is to restrict the search state space.
We’ll use the approach
we presented before
to create a clock, and an f_past_valid
flag. This latter flag will tell
us whenever the $past()
directive will yield valid results.
Now using that assumed clock, our first
LFSR
specific step will restrict the space of all possibilities by
simply assuming that i_reset
is true on startup.
Next, we’ll assert that the bottom LN
bits are equal to the INITIAL_FILL
on the clock following any reset.
Now that we’ve dealt with the i_reset
case, we can move on to the regular
register bits.
The first test is whether or not the first of the new register bits,
bit LN-1
, is valid. This bit is the one produced via the
original Fibonacci
configuration
we presented before.
Here we’ll just apply it to the values left in our
shift register, plus the one we just shifted out–the one still found in
$past(sreg[WS-1])
. Note that this wouldve been $past(sreg[0])
in our
previous
version, but
since we are now generating another WS-1
bits, that last bit shifted out
is no longer the 0
bit from before.
Our new implementation also defines another WS-1
bits that we want to
validate. Because this is a very mathematically defined
LFSR,
there’s a mathematical relationship between all of these bits. All we
need to do to
formally validate
this code
is to check our current state bits against the equation that was supposed
to generate them one at a time.
This is the majority of the proof. If this works, all of our tapv[]
work
will be verified.
There is one more vital part to our proof: we need to prove that sreg
will never be equal to zero. You may recall when we
last discussed
LFSRs
that we highlighted the problem with a
zero register: once the register sreg
becomes zero (if ever) then it will
cease to produce
pseudorandom
numbers. We’ll allow the theorem prover
to verify that this never happens.
That’s the last of the file, and the last of the proof.
All that’s left is to run the theorem prover and see how we did!
In case you haven’t been following along, you can find the example file we just created here. You can also find the Makefile and other files associated with driving the formal theorem prover here.
Conclusion
Now that I have an LFSR implementation that I can step 8-14 times per clock, I should be able to create a high speed data stream from one FPGA and send it to another. The difference between what the second FPGA observes and what was actually sent will be any channel effects. In other words, I’m now ready to move my channel identification problem forward one step to measure just how much information can be reasonably stuffed through this FPGA I/O channel.
For man also knoweth not his time: as the fishes that are taken in an evil net, and as the birds that are caught in the snare; so are the sons of men snared in an evil time, when it falleth suddenly upon them. (Eccl 9:12)