Envisioning the Ultimate I2C Controller
I wanted to share a quick design idea. I think it’s an awesome idea, but the frank reality at this point is that it is only an idea. I haven’t finished the implementation of it. Or, rather, most of the implementation is complete and takes fewer than 400 6–LUTs, but I haven’t yet verified it. Yes, verification is where the majority of the work will lie–as always in this business.
Still, the idea I’m about to share is fundamental enough it’s worth sharing.
The Problem: Telemetry
Here’s the basic problem: a client wants a small remote sensor. This sensor will be placed in a hostile environment (under water, on a rocket, in space, wherever), where it will be impossible to electronically inspect or examine it. Because the environment this sensor will be in is so hostile to electronics, it’s important to monitor the health, welfare, and status of the sensor as a whole from a remote location.
This means that the sensor needs to report back to the surface not only the sensor measurements themselves, but also housekeeping values such as temperature (is the sensor overheating?), voltage (is it using too much power?), air pressure, or humidity. For example, if water ever got into the sensor chamber, there may be a bare minimum of time to save the sensor before it is catastrophically destroyed.
You know, just the realities of life.
All of these “extra” measurements then need to be grouped together into a telemetry stream, and forwarded from the remote sensor to an operator somewhere.
Wikipedia defines telemetry (today) as: “the in situ collection of measurements or other data at remote points and their automatic transmission to receiving equipment (telecommunication) for monitoring.”
This is a good definition for us, since this is exactly what needs to be accomplished: multiple sensors need to be read, their readings time stamped, and then returned via a data link.
For simple design reasons, it helps to share as many wires as possible between the various sensors. This has led the hardware team to connect multiple I2C sensors to a single pair of wires, common to all the sensors as well as to the FPGA–which will act as the master, gathering the sensor data and reporting it.
So how shall such a system be implemented?
Today, therefore, we’re discussing how to implement a telemetry system composed of measurements from a set of diverse I2C sensors all on the same bus.
Here’s the first reality hit, though: every sensor is different. Some sensors have single byte addresses, some two byte addresses, and some don’t support addresses at all. This means that every sensor needs its own startup script and its own configuration.
Once configured, however, every sensor then needs to be polled at regular intervals, and not all sensors need to be polled at the same interval. In order to meet the real time requirements of the telemetry system, every frame collection must start at a specific time, and then complete by a final time. During this time, sensor measurements will be read and reported. This measure and report process will proceed ad infinitum.
The final output reporting will be done via a network data packet, where the telemetry data will be spread across every sensor data packet using only four bytes per packet. (Remember, the purpose of this system is data collection, not telemetry collection. The telemetry needs to remain a small and minor portion of this system.)
My first thought was to simply use a CPU for this purpose. Why not? The ZipCPU will play a prominent role in the system as it is, and I2C is an easy enough protocol to handle via CPU. Some might even argue that I2C is a tailor made protocol for a microcontroller implementation.
Then I got to thinking about it: CPUs aren’t known for their real time capabilities. While it is possible to create real time software, were I to do so I’d then be stuck with a CPU that could only run one program. One wrong tweak to that program and I’d have to re-verify the whole real time capability again. Worse, I’d like to reserve the CPU for ad-hoc development tasks along the way, and dedicating it to a real time processing task like this would render it unusable for other tasks.
This forces the sensor implementation into the fabric of the FPGA.
CHOICE #1: A logic based FPGA solution
Unfortunately, the logic required for a hardware based solution is … not trivial. Such a solution may need to be reconfigured often during development, as different sensor configurations are tried and tested until a final configuration is chosen to deliver to the customer.
The easiest way to build a re-configurable anything is to force the configuration to be read from memory somewhere: either flash or RAM.
CHOICE #2: The I2C logic will be scripted from memory
The next question is, how shall this data be timed? Specifically, we want every data packet to start at a timestamp, provided at a regular interval. This sensor timestamp, moreover, will need to be synchronized to the data collection timestamp from the rest of the system. The idea, therefore, is that measurement reading will be looped, with the top of the loop starting at a known time with respect to the rest of the system’s timing.
CHOICE #3: The measurement sequence will start from an external time–stamped signal
Next design question, how shall the results be reported?
This answer is given by the rest of the design: the results will be reported via an AXI stream packet output. If necessary, I might choose to switch to a modified AXI stream based packet protocol I’m using. The big difference? The modified protocol allows for a packet to be aborted if any downstream component stalls the interface by an amount greater than any buffer within it. In the absence of any abort conditions, though, it’s simply AXI stream.
CHOICE #4: Results will be reported via an AXI stream output.
Given these few choices, it’s now fairly easy to outline the form of the design.
Outlining the design
With this information, I can now outline the form this design will take in Fig. 4 below.
From this standpoint, it looks a lot like a special purpose CPU.
The first step will be to reuse a ZipCPU instruction fetch component. From here, a small state machine can handle everything else.
Indeed, I’ve now applied this approach to several projects, with generally great success.
I have a SONAR transmitter design that can send a scripted SONAR waveform from memory.
In this case, the transmitter can accept instructions from either the instruction fetch or from a slave interface. The instructions consist of 4-bit register addresses and 28-bit values, some with side effects. This allows the CPU to either control the transmitter directly, or to give it a script to run from.
Instructions consist of things like:
Setting the amplitude
Setting an optional chirp rate, for linear FM support.
Setting the frequency. A non-zero frequency will turn the transmitter on.
Waiting for a period of time, or perhaps for an external synchronization interrupt.
Turning off the transmitter. In this case, setting a zero frequency will turn the transmitter off.
Using this approach, the transmitter can generate basic tones, linear FM, BPSK and BFSK signals, and even hyperbolic FM signals. The design is not only easy to build and low in logic cost, but it’s easy to verify as well.
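To make the SONAR instruction format concrete, here is a minimal Python sketch of packing a 4-bit register address and a 28-bit value into a single 32-bit word. The bit placement (address in the top four bits) and the register numbers are assumptions chosen for illustration; they are not taken from the actual design.

```python
# Hypothetical register numbers: the real register map isn't given here.
REG_AMPLITUDE = 0x1
REG_FREQUENCY = 0x3

VALUE_BITS = 28
VALUE_MASK = (1 << VALUE_BITS) - 1


def pack(reg, value):
    """Pack a 4-bit register address and a 28-bit value into one 32-bit word.

    Assumes the address occupies the top four bits of the word.
    """
    if not 0 <= reg < 16:
        raise ValueError("register address must fit in 4 bits")
    if not 0 <= value <= VALUE_MASK:
        raise ValueError("value must fit in 28 bits")
    return (reg << VALUE_BITS) | value


def unpack(word):
    """Split a 32-bit instruction word back into (register, value)."""
    return (word >> VALUE_BITS) & 0xF, word & VALUE_MASK


# A script is then just a list of such words read from memory in order.
script = [pack(REG_AMPLITUDE, 0x0000100), pack(REG_FREQUENCY, 0x0123456)]
```

Under these assumptions, writing a non-zero value to the (hypothetical) frequency register would be the word that turns the transmitter on.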
This instruction fetch approach has also been a very successful part of an AXI scatter-gather DMA design I have. In this design, a small FSM (170 6-LUTs) processes the scatter-gather “table”, as it comes from the CPU fetch controller.
When the DMA is not in use, the instruction fetch is held in its reset state.
The FSM is then initialized with an address provided over an AXI-lite slave interface. This address is then fed to the instruction fetch as if the CPU was jumping to a new address.
As table values are received, they are written to the DMA via the DMA’s AXI-lite control interface. This includes the source address, destination address, DMA length, and potentially any options to be given to the DMA, such as generating an interrupt or continuing in spite of any error–all coming from the table entry.
Two bits are stolen from each source address in the table. These bits control whether the table entry is a normal DMA entry record, whether it is the last one in the list, whether it is to be skipped, or whether the address is a link (i.e. pointer) to another table entry elsewhere in memory.
Skipping entries, or jumping to a new address is again handled like a CPU branch instruction.
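The two stolen bits can be pictured with a small decoder. This is only a sketch: which two bits are taken from the source address, and which code maps to which meaning, are assumptions made for illustration (here, the top two bits of a 32-bit word).

```python
from enum import IntEnum


class EntryKind(IntEnum):
    # Hypothetical code assignments for the four cases described above.
    NORMAL = 0  # ordinary DMA entry
    LAST = 1    # final entry in the list
    SKIP = 2    # entry to be skipped
    LINK = 3    # address points at another table entry


ADDR_BITS = 30  # assumed: 32-bit word with the top two bits stolen
ADDR_MASK = (1 << ADDR_BITS) - 1


def decode_source(word):
    """Split a table's source-address word into (kind, address)."""
    kind = EntryKind((word >> ADDR_BITS) & 0x3)
    return kind, word & ADDR_MASK


kind, addr = decode_source((EntryKind.LINK << ADDR_BITS) | 0x1000)
# A LINK entry would be handled like a branch: fetch continues at `addr`.
```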
One unique feature of the instruction fetch that implements a table in this manner is that you do not want any form of instruction cache. Unlike the CPU, where an instruction cache is highly desirable, applications like those above operate on memory that may have just changed, so a cache only gets in the way. Not only that, unlike a CPU, table memory like this is typically only going to be read once–so, again, a cache wouldn’t help here.
I should also point out that the ZipCPU’s AXI-lite instruction fetch has been specifically tested with 8–bit instructions–even though the ZipCPU doesn’t have 8–bit instructions. This is one of the reasons why.
That’s the first step.
The second step is to decide on an “instruction set”. Ten instructions will be sufficient for our purposes here. Each instruction can fit in eight bits, with only the SEND and JUMP instructions requiring subsequent immediate values.
CATCH: I’m not (yet) certain what to call this instruction. My notes currently call it an ABORT instruction, but CATCH might capture it better. The idea is that this instruction sets the address to jump to should a data write request ever fail to receive an ACK–sort of like the “catch” half of a try-catch block in C++ or Java. Likewise, if for some reason arbitration is lost, this catch instruction would set the address we’d return to.
If I choose to use my modified AXI stream based packet protocol, either of these conditions would cause the outgoing packet to be aborted as well. That way the downstream packet receiver can know that the packet is being restarted from the top. Without this ABORT signal, the only way to know that a NAK or loss of I2C arbitration happened would be to receive a packet of the wrong length–knowing that the final (known packet-length) bytes would be correct.
WAIT: This instruction will cause the I2C FSM to wait for an external synchronization signal. This is how I intend to synchronize this I2C controller with the rest of the telemetry frame.
START: Now we get to more regular I2C instructions. This particular one will send an I2C start condition. If the interface isn’t idle, then it will cause a repeated start condition to be sent across the interface.
SEND: This command would cause the following byte in the instruction stream to be “sent” over the interface. This would include the first byte of any [I2C](https://www.i2c-bus.org/specification) transaction. Similarly, many read transactions require an address to be sent following the first byte, before sending a repeated start and receiving data, and that address could be sent via a SEND command. Finally, this command would be very useful when sending the known and pre-determined configuration to the device.
RXK (RX w/ ACK): Receive a byte of data, and ACK (acknowledge) the byte upon completion. This will signal to the slave that another byte is yet to be received following this one. Once received, the byte will be placed on the output AXI stream packet.
RXN (RX w/ NAK): Receive a byte of data, and NAK (negative acknowledge) the result. This will signal to the I2C slave that this is the last byte to be requested. As before, the byte received will be placed into an output AXI stream packet. (As an option I’m considering, this might also send a STOP command. It just depends on how complex I want to make the state machine.)
RXLK and RXLN: These two instructions mirror the RXK and RXN instructions above. They will receive a byte of data and either ACK or NAK the result, while also sending the received byte downstream. The difference between these two and their previous counterparts is that these two instructions will set the TLAST field in the AXI stream output. The result is that this byte will be marked as the last byte in the outgoing AXI stream telemetry packet.
STOP: Send an I2C stop condition.
JUMP <ADDRESS>: This is the only other instruction, aside from the SEND instruction, that gets followed by an immediate address. In this case, the immediate bytes contain a system memory bus address to jump to.
In general, the controller would JUMP to the WAIT instruction at the top of the telemetry packet, but having a specific JUMP instruction allows for more options–such as if the loop includes multiple WAIT instructions. Either way, since this instruction set has no ability for a conditional jump, the only type of loop this instruction set will support is an infinite loop. Also, since I’ve provided no HALT instruction in this list, the controller will always enter into an infinite loop.
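As a sketch of how such a script might be laid out in memory, the following Python assembles a list of instructions into bytes. The opcode values, the 4-byte big-endian JUMP immediate, and the `assemble` helper are all hypothetical: the list above fixes the instruction names and which ones carry immediates, not their encodings.

```python
# Hypothetical opcode values, one byte per instruction.
OPCODES = {
    "CATCH": 0x00, "WAIT": 0x01, "START": 0x02, "SEND": 0x03,
    "RXK": 0x04, "RXN": 0x05, "RXLK": 0x06, "RXLN": 0x07,
    "STOP": 0x08, "JUMP": 0x09,
}
JUMP_ADDR_BYTES = 4  # assumed width of the JUMP immediate


def assemble(program):
    """Encode [("START",), ("SEND", 0x90), ...] as a byte string.

    Only SEND (one byte) and JUMP (an address) are followed by immediates.
    """
    out = bytearray()
    for mnemonic, *operands in program:
        out.append(OPCODES[mnemonic])
        if mnemonic == "SEND":
            (byte,) = operands
            out.append(byte & 0xFF)
        elif mnemonic == "JUMP":
            (addr,) = operands
            out += addr.to_bytes(JUMP_ADDR_BYTES, "big")
        elif operands:
            raise ValueError(f"{mnemonic} takes no immediate")
    return bytes(out)


image = assemble([("START",), ("SEND", 0x90), ("STOP",), ("JUMP", 0x1000)])
```

With these assumed encodings, the four instructions above occupy nine bytes: one each for START and STOP, two for SEND, and five for JUMP.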
There are many other instructions I could have added, but the application described above doesn’t require any more. For example, I could have implemented a wait-timer instruction with a two, three, or even four byte wait counter. Were I expecting bus contention, I might also consider creating an instruction sequence for a more fine grained checkpoint-restart capability, so that collisions could properly be recovered from. In this case, I don’t expect any other masters on the bus, so I see no need for such instructions yet.
For now, let’s take a quick look at what an instruction stream might look like that uses this instruction set.
First, it would start with a CATCH instruction, followed by whatever instructions are necessary to configure the various sensors. Often, writing to a sensor simply consists of a series of bytes to be sent to the sensor.
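A configuration sequence of this kind might be generated as follows. This is a minimal sketch: the device address `0x48`, the register/value pair, and the helper names are made up for illustration. The one fixed detail is standard I2C, where the first byte of a transaction carries the 7-bit device address with the read/write bit (0 for write) in its lowest bit.

```python
def i2c_write(dev_addr, payload):
    """Instructions for one write transaction to a 7-bit device address."""
    script = [("START",), ("SEND", (dev_addr << 1) | 0)]  # R/W bit = 0: write
    script += [("SEND", byte) for byte in payload]
    script.append(("STOP",))
    return script


def config_script(writes):
    """CATCH first, then one write transaction per (device, payload) pair."""
    script = [("CATCH",)]
    for dev_addr, payload in writes:
        script += i2c_write(dev_addr, payload)
    return script


# Hypothetical sensor at 0x48: write value 0x60 to its register 0x01.
startup = config_script([(0x48, [0x01, 0x60])])
for instruction in startup:
    print(*instruction)
```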
Once all of the sensors have been configured, we can then switch to the operational telemetry loop. This will start with a second CATCH instruction, followed by a WAIT instruction to wait for the top of the loop.
Now it’s time to read from each sensor. Here’s a typical interaction to read three bytes from a given sensor.
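One plausible form of such an interaction, sketched in Python: address the device for writing, send the register address, issue a repeated start, re-address the device for reading, and then receive three bytes, with a NAK on the last. The device address, the single-byte register address, and the helper name are assumptions for illustration.

```python
def i2c_read_reg(dev_addr, reg_addr, nbytes):
    """Instructions to read `nbytes` starting at a one-byte register address."""
    if nbytes < 1:
        raise ValueError("must read at least one byte")
    script = [
        ("START",),
        ("SEND", (dev_addr << 1) | 0),  # device address, write
        ("SEND", reg_addr),             # which register to read from
        ("START",),                     # bus is busy, so a repeated start
        ("SEND", (dev_addr << 1) | 1),  # device address, read
    ]
    script += [("RXK",)] * (nbytes - 1)  # ACK every byte but the last
    script.append(("RXN",))              # NAK tells the slave we are done
    script.append(("STOP",))
    return script


# Hypothetical sensor at 0x48, three bytes starting at register 0x00.
for instruction in i2c_read_reg(0x48, 0x00, 3):
    print(*instruction)
```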
Not all sensors need addresses, however. Some sensors have only one value to be read. In the following example, the sensor requires no address, yet provides the last three bytes of the sequence.
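A sketch of that address-less read, together with the loop that might surround it. Since these are the last three bytes of the packet, the final receive uses RXLN so that TLAST gets set. The device address `0x40`, the jump address, and the helper names are hypothetical.

```python
def i2c_read_direct(dev_addr, nbytes, last_in_packet=False):
    """Instructions to read `nbytes` from a sensor with no register address."""
    if nbytes < 1:
        raise ValueError("must read at least one byte")
    script = [("START",), ("SEND", (dev_addr << 1) | 1)]  # R/W bit = 1: read
    script += [("RXK",)] * (nbytes - 1)
    # The final byte is NAK'd; RXLN additionally marks it with TLAST.
    script.append(("RXLN",) if last_in_packet else ("RXN",))
    script.append(("STOP",))
    return script


def telemetry_loop(wait_addr, reads):
    """CATCH, WAIT for the frame sync, do the reads, JUMP back to the WAIT."""
    return [("CATCH",), ("WAIT",)] + reads + [("JUMP", wait_addr)]


# 0x2001 stands in for wherever the WAIT instruction ends up in memory.
loop = telemetry_loop(0x2001, i2c_read_direct(0x40, 3, last_in_packet=True))
for instruction in loop:
    print(*instruction)
```

In a real script, the reads for every other sensor would sit between the WAIT and this final transaction.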
At least, this is my design idea for this problem.
As I mentioned above, I’ve been pleasantly surprised at the number of ways I’ve found to use a generic, cache-less, CPU instruction fetch module. This one capability has repaid me in reuse spades multiple times over with each application I’ve used it on: scatter gather DMAs, SONAR transmit waveform encoding, and now my draft design for an ultimate I2C controller listed above.
Reality, however, is that very few problems will limit themselves to a set of only I2C sensors. The actual problem this design was drawn from is no different. It includes not only I2C sensors, but also multiple SPI sensors–all of which when put together will make for a very diverse telemetry sensor set. Still, this same approach should also work well for scripting complex SPI sensor interactions together.
Looking over my instruction set again in hindsight, I’m tempted to split the JUMP instruction into two parts. The first part might be an instruction to set the JUMP target address, and the second part would be the JUMP instruction itself. This would not only simplify instruction decoding, but it would also make the instruction sequence easier to relocate to any address in memory–without needing to go back and fix up the jump address once a memory location was assigned to the script.
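One way to read this idea is sketched below as a toy Python model. It assumes that the set-target instruction (called TARGET here, a hypothetical name) records the address of the instruction following it, so that neither half needs an immediate. For simplicity, every instruction is treated as occupying one address.

```python
def run(script, base, steps):
    """Step a script loaded at address `base`; return the addresses executed."""
    pc, target, trace = base, base, []
    for _ in range(steps):
        op = script[pc - base]
        trace.append(pc)
        if op == "TARGET":
            target = pc + 1  # remember where the loop body begins
            pc += 1
        elif op == "JUMP":
            pc = target      # no immediate address needed
        else:
            pc += 1
    return trace


script = ["CATCH", "TARGET", "WAIT", "RXLN", "JUMP"]
at_zero = run(script, 0x000, 7)
at_0x100 = run(script, 0x100, 7)

# The same script bytes loop correctly wherever they are loaded.
assert [addr - 0x100 for addr in at_0x100] == at_zero
```

Because the script contains no absolute address, nothing needs fixing up when it moves.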
One missing component of the above explanation is the bus slave component. In this example, as with all of my examples, I like to make controllers like this controllable. Things that an external controller might do include: interrupting the script and halting the controller, issuing ad-hoc (scriptless) commands, adjusting the I2C clock timing, and replacing the script with another script. An external CPU might also decide to issue I2C commands via direct bit banging–should such a capability be implemented. Don’t forget, as well, that if these sensor commands are kept in some kind of flash memory, then the controller would need to be shut down in order to erase or program the flash memory.
A second missing component from the design above is the internal logic analyzer. Let’s just say that, because of a bad experience with a prior I2C controller, the internal logic analyzer connection will be a minimum requirement of any new implementation. The problem I had with the last I2C controller was that a bug in the controller tended to leave the interface mid-transaction. When I then came back later and reloaded the FPGA or otherwise reset the design, the I2C bus interaction didn’t reset–the slave remained mid-ACK and wouldn’t release the data wire. The (compressed) logic analyzer was necessary to diagnose the problem, and bit banging over the bus was necessary to fix it. Although this only lasted until I found my bug, it has left me cautious when designing I2C controllers.
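For reference, a common bit-banged recovery for a slave stuck holding SDA low is to toggle SCL (up to nine times) until the slave releases the line, and then issue a STOP. The Python below sketches that technique against a toy bus model. Both the `StuckSlaveBus` class and the `recover_bus` helper are hypothetical, real timing delays are omitted, and this is not necessarily the exact sequence used to fix the problem described above.

```python
class StuckSlaveBus:
    """Toy model: a slave holds SDA low until it sees `bits_left` more clocks."""

    def __init__(self, bits_left):
        self.bits_left = bits_left
        self.scl = 1
        self.master_sda = 1  # 1 = released (open drain)

    def set_scl(self, level):
        if self.scl == 1 and level == 0 and self.bits_left > 0:
            self.bits_left -= 1  # slave advances on each falling edge
        self.scl = level

    def set_sda(self, level):
        self.master_sda = level

    def read_sda(self):
        slave_sda = 0 if self.bits_left > 0 else 1
        return self.master_sda & slave_sda  # wired-AND of both drivers


def recover_bus(bus, max_pulses=9):
    """Clock SCL until SDA is released, then send STOP.

    Returns the number of pulses used, or None if SDA never came free.
    """
    bus.set_sda(1)
    pulses = 0
    while bus.read_sda() == 0:
        if pulses == max_pulses:
            return None
        bus.set_scl(0)
        bus.set_scl(1)
        pulses += 1
    # STOP condition: SDA rises while SCL is high.
    bus.set_scl(0)
    bus.set_sda(0)
    bus.set_scl(1)
    bus.set_sda(1)
    return pulses


print(recover_bus(StuckSlaveBus(bits_left=3)))
```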
For a dream cometh through the multitude of business; and a fool's voice is known by multitude of words. (Eccl 5:3)