This is now our third post on Linear Feedback Shift Registers (LFSRs). Our first post examined how to generate a Linear Feedback Shift Register (LFSR) in Verilog, and our second post walked through an example of a 5-bit LFSR. However, neither of these developments have solved the problem I had initially.
As you may recall, I wanted to use an LFSR to do channel estimation. My intention was to use a setup like Fig 1 to the right. My plan is to transmit pseudorandom bits out of an FPGA output pin at the fastest speed I can: 950 Mbps on my Artix-7 Arty FPGA board. I’ll then receive the bits at the other end of a 12” pmod cable representing my channel, and examine the waveform at the other end to get an estimate of the channel throughput and distortion.
To do this, though, I need a source of pseudorandom bits. Worse, our last attempt at generating pseudorandom bits only generated one bit per clock, and I will need several bits per clock in order to drive an output serializer at high speed.
So let’s return to our Fibonacci LFSR generator and see if we can modify it to produce more than one output per clock period. We’ll keep the setup identical to our first post, so the only thing that needs to change today is the number of outputs bits we need to generate. We’ll start with describing how we’ll go about getting these extra bits, and then discuss the code that implements this. We’ll also do one more: let’s formally prove at the end of our development, that our resulting implementation actually works.
Getting that next bit
Our task is create an
WS bits at a time–rather than just one. (
WS= Word Size)
The question, though, is how shall we do this?
We’ll use the
we presented earlier
for discussion and as an example along the way. You can see
in Fig 2 at the right. It’s a simple five stage
with a feedback equation defined by
Let’s begin our development by imagining an infinite stream of (constant)
bits in our shift register,
sreg. Each of these bits satisfies the
relationship given by the Fibonacci
we started with. When we presented this equation, we had bits
and we just needed to calculate the next bit,
Now let’s see what it will take to calculate bit
MSB+2. We can start with
our equation for the next bit, and apply it to bits
[MSB+1:1] to get bit
We’ll have to get rid of the reference to
sreg[MSB+1], though, before this
equation will be useful for us.
To get there, let’s split this new equation into two parts.
The first part will be due to the bits in
sreg[MSB:1]. We can represent
this part as
sreg[MSB:0]&(TAPS<<1). This represents the bits that we already
know. You can see this how this would affect our
in Fig 3. If you compare Fig 3 with Fig 2, you can see how the taps to
generate the next bit are the same as the ones shown in Fig 2, save only that
they’ve been moved one stage to the left.
The LFSR in our example is simple enough that we can apply this same technique to even get a second bit from the taps we were given, as Fig 4 below shows.
Where the example falls apart is when you need to reference a bit that isn’t present in the original shift register. For that, we need to move to the second part of our equation.
This second part,
need some work. In this case, though, we already have an equation for
sreg[MSB+1] (above). Therefore we can substitute that equation for
sreg[MSB+1] into our equation for
sreg[MSB+2] in order to get a new
sreg[MSB+2] that depends only upon
Perhaps a picture will explain this better. Consider Fig 5 below.
In this figure, you can see the remains of the taps that were being shifted
to the left, and a broken reference to a bit that isn’t in our set from
MSB:0. However, instead of adding the bit we don’t have, we instead
add the equation for that bit. The result is that our new bit, in this
depends upon three bits from our shift register, instead of just the two.
Let’s simplify this expression a touch further, though.
If we take another look at the equation above, we can now use the distributive
to collect our terms. Specifically, we’ll factor out the
sreg term to the
left, and the values multiplied by the
sreg vector together into a new
term on the right,
In this equation, the portion of our expression to the right of
& looks very much like an expression similar to our expression for
sreg[MSB+1]. Specifically, this right half expression looks like a vector
that, when taken as an inner product with
sreg, produces our result–just
TAPS vector did for
sreg[MSB+1]. Even better, this vector
is a constant–depending only upon the
TAPS parameter. Let’s use this.
tapv to be our
TAPS, and then
tapv to be this value.
We can then repeat this derivation. Doing so will reveal a formula for
tapv[k+1] based upon
tapv[k]. In particular,
tapv array is the key we need to build our code below.
A Multi-Step LFSR Implementation
Now that we have an equation for future output bits, it’s time to build our multi-step shift register. Feel free to follow along in the code we’re creating here, as we’ll only discuss the basic highlights below.
Our goal is to extend our origianal
WS bits per clock, where
WS is given by a parameter.
This means we’ll need to calculate
WS-1 more bits than we did the
last time. It also means
that our shift register, which only used to be
LN bits long, will now need
LN+(WS-1) elements in it. Of these, we’ll use the bottom
as our output bits.
But I’m getting ahead of myself. Before we can get to
o_word, there’s a
lot of other work to be done first.
We’ll start with the equations we just worked out in the last section. We’ll
tapv values into an array of
WS elements, each as
wide as our underlying
LN bits long.
The first of these equations is the one we’ve used before,
From here we can use the recursive equation derived above to get the rest,
If you didn’t manage to follow the development of these equations above, don’t worry. We’ll “prove” these equations work below in the next section.
The next trick is the reset value.
You may remember before, when we had one new bit per time step, that we set
our state space to an
This reset value then dictated the first
LN bits out of our device.
This won’t quite work for our multi-step core. First, the
vector needs to have
WS-1 additional elements to it. Then, to make matters
worse, all of those bits need to maintain the
relationship between them. So while the statement above works for the first
LN bits, we’ll still need to determine the next
We can get these next several bits of the
reset_value by running the first
several bits of the
reset_value through our
equation. This is a zero cost operation: every part of this equation
is known and constant. Therefore the synthesizer can simplify the code
before our core ever gets mapped to logic.
While this works well to develop
reset_value, it doesn’t work as well
as an initial value for our
sreg. We’d like to say,
However, this only works with some synthesizers, such as
and not with others, like
For those others, we’ll set
sreg initially to
INITIAL_FILL << (WS-1), and then
reset_value for any subsequent resets. This will still give us the
same sequence, with the only problem being that the first
will be zero instead of those associated with the fill.
Now that we have our tap equations,
tapv, and our
reset_value, we can
now move on to the state register itself. In the
all but one bit of the new state register is known. These known bits are
given by shifting the register one step to the right, whereas the last bit
is generated by the
Let’s do the same thing here and shift these known bits first,
Those are the easy bits.
The rest of the bits are calculated in the same fashion, with the exception
that the equation for them, in particular the
is different from one bit to the next. So, our next step is then to
walk through those extra bits applying the
tapv[k] equations as appropriate
to generate each new bit.
Did you notice how we also cycled through the various
This is just implementing the equations we calculated above.
Put together, we just calculated
LN-1 of these are
calculated the same as before–by shifting the shift register. The next bit
is still calculated by our original
TAPS equation, just like we did with
our initial Fibonacci implementation.
WS-1 bits were then calculated in the
exact same way with the only exception being that the equation,
changed on a bit-by-bit basis.
But, let’s come back to our earlier question, will this really “work”? Will all these equations really produce the sequence we want?
Applying Formal Methods
I discussed the basics of using
using yosys-smtbmc in a
The basic idea behind
is to define a state space, and within it the space of invalid states.
The state space is initially defined by all of the register values within
your design. It is restricted further, made smaller that is, with
statements as necessary. Invalid states are defined as well, but this time
assert statement. We’ll use these two statements,
assert with the ultimate goal of
that the logic above will never enter into an illegal state.
These formal methods are particularly appropriate for this multi-bit LFSR design, since LFSRs are so mathematically based. In particular, the output of the LFSR needs to strictly satisfy a mathematical equation–one that the formal equation solver can verify for us.
we’ll use yosys
to generate a list of properties in a
yices can understand. When processing a file in
this manner, yosys will define the
pre-processor directive for us. That allows us to begin our
verification section with an
The first step is to restrict the search state space.
We’ll use the approach
we presented before
to create a clock, and an
f_past_valid flag. This latter flag will tell
us whenever the
$past() directive will yield valid results.
Now using that assumed clock, our first
specific step will restrict the space of all possibilities by
simply assuming that
i_reset is true on startup.
Next, we’ll assert that the bottom
LN bits are equal to the
on the clock following any reset.
Now that we’ve dealt with the
i_reset case, we can move on to the regular
The first test is whether or not the first of the new register bits,
LN-1, is valid. This bit is the one produced via the
we presented before.
Here we’ll just apply it to the values left in our
shift register, plus the one we just shifted out–the one still found in
$past(sreg[WS-1]). Note that this wouldve been
$past(sreg) in our
since we are now generating another
WS-1 bits, that last bit shifted out
is no longer the
0 bit from before.
Our new implementation also defines another
WS-1 bits that we want to
validate. Because this is a very mathematically defined
there’s a mathematical relationship between all of these bits. All we
need to do to
is to check our current state bits against the equation that was supposed
to generate them one at a time.
This is the majority of the proof. If this works, all of our
will be verified.
There is one more vital part to our proof: we need to prove that
will never be equal to zero. You may recall when we
that we highlighted the problem with a
zero register: once the register
sreg becomes zero (if ever) then it will
cease to produce
numbers. We’ll allow the theorem prover
to verify that this never happens.
That’s the last of the file, and the last of the proof.
All that’s left is to run the theorem prover and see how we did!
Now that I have an LFSR implementation that I can step 8-14 times per clock, I should be able to create a high speed data stream from one FPGA and send it to another. The difference between what the second FPGA observes and what was actually sent will be any channel effects. In other words, I’m now ready to move my channel identification problem forward one step to measure just how much information can be reasonably stuffed through this FPGA I/O channel.
For man also knoweth not his time: as the fishes that are taken in an evil net, and as the birds that are caught in the snare; so are the sons of men snared in an evil time, when it falleth suddenly upon them. (Eccl 9:12)