AXI Handshaking Rules

I’m going to try to keep this article short, while still answering the question: what is the bare minimum you need to know when using an AXI handshake?

We’ll start with the basics: there are masters and slaves. “Slave” ports are those that receive data, whereas “master” ports transmit or send data to slaves.

Fig 1. AXI stream data flow direction: masters feed slaves

I’ve tended to follow the convention found in Xilinx’s examples of prefixing my master ports with M_*_ and my slave ports with S_*_. I’ll then often fill in the * part of the middle with some name reminding me which interface is being described. For example, S_VID_TVALID would be the TVALID signal found on the slave video interface. The result is a signal list, for AXI stream, looking something like that in Fig. 2 below.

Fig 2. AXI stream signals

In most cases, only the clock, reset, valid, ready, and data signals are required of an interface. In packet interfaces, or whenever using Xilinx’s stream DMAs, the TLAST signal is also required. Video interfaces also use the TUSER signal to indicate a start of frame.

The rest of the signals are optional, and I’ve rarely found a use for them.

Today, however, I want to focus on the handshake signals. Therefore we’ll group these signals into three categories: TVALID, TREADY, and we’ll lump everything else into the signal TDATA. This is simply because the handshaking signals create rules on all of the payload signals equally.

Further, while I will be discussing AXI stream handshakes today, all of our rules apply to AXI and AXI-lite handshakes as well. Indeed, some of the examples and illustrations I will be using further on come from non-stream designs.

The rules

So let’s start with the basic handshaking rules. Indeed, I like to think of these as the bare minimum you need to know in order to build an AXI handshake.

1. xVALID must be cleared following any reset.

2. Nothing happens unless xVALID && xREADY.

   Just as a point of notation here, I’m following the AXI4 specification’s convention of using xVALID to refer to the VALID signal of an AXI channel of some type. In this case, I might have said M_AXIS_TVALID && M_AXIS_TREADY or S_AXIS_TVALID && S_AXIS_TREADY, but I’ve shortened these with the abbreviation above to keep things simple.

3. Something always happens anytime xVALID && xREADY. Be careful not to add any other conditions to this check lest you miss a handshake!

4. Nothing can change unless !xVALID || xREADY.

   This is more of a master rule than one for a slave, but still quite important. We’ll come back to this more in a moment.

5. The xREADY signal must be registered. Use a skidbuffer if necessary to avoid any throughput impacts.

   Okay, this isn’t quite what’s required by the specification. Rather, this is a consequence of what the specification does require. The specification simply requires that, “On master and slave interfaces, there must be no combinatorial paths between input and output signals.”

   Fig 3. Combinatorial paths are not allowed between AXI inputs and outputs

6. (Recommendation only:) READY should be held high when the design is idle, and only lowered (if required) following VALID && READY.

   This works great for AXI streams. It even works well for the AXI read address channel. It’s just a bit harder to do with the write address and write data channels if you don’t have a skidbuffer available to you. We’ll discuss this problem more in a moment.
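
Since a skidbuffer comes up in the rules above, here’s a minimal sketch of one, to show how xREADY can depend on nothing but a register without costing any throughput. The module name, the skid_* port names, and the r_valid and r_data registers are my own illustrative choices for this sketch, so treat it as an outline under those assumptions rather than a finished implementation.

```verilog
// A minimal skid buffer sketch.  All names below are illustrative
// assumptions, not taken from any particular library.
module skidbuffer_sketch #(
		parameter	DW = 8
	) (
		input	wire		ACLK,
		input	wire		ARESETN,
		// Incoming AXI stream (slave) port
		input	wire		S_AXIS_TVALID,
		output	wire		S_AXIS_TREADY,
		input	wire [DW-1:0]	S_AXIS_TDATA,
		// Internal side, feeding the rest of the design
		output	wire		skid_valid,
		input	wire		skid_ready,
		output	wire [DW-1:0]	skid_data
	);

	reg		r_valid;
	reg [DW-1:0]	r_data;

	// r_valid is set when a beat has been accepted from the port
	// while the internal side was stalled, so it must be held
	initial	r_valid = 1'b0;
	always @(posedge ACLK)
	if (!ARESETN)
		r_valid <= 1'b0;
	else if ((S_AXIS_TVALID && S_AXIS_TREADY)
			&& (skid_valid && !skid_ready))
		r_valid <= 1'b1;
	else if (skid_ready)
		r_valid <= 1'b0;

	// Capture the incoming data any time the port is ready.  While
	// r_valid is set, S_AXIS_TREADY is low and r_data is held.
	always @(posedge ACLK)
	if (S_AXIS_TREADY)
		r_data <= S_AXIS_TDATA;

	// TREADY depends on a register only, never on an input
	assign	S_AXIS_TREADY = !r_valid;

	// Pass data straight through unless something is being held
	assign	skid_valid = ARESETN && (S_AXIS_TVALID || r_valid);
	assign	skid_data  = r_valid ? r_data : S_AXIS_TDATA;
endmodule
```

Note that skid_valid and skid_data are still combinatorial functions of the incoming port in this sketch, so they are meant to feed logic inside the design rather than another AXI port directly.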

Example Slave logic

If all you do is follow those basic rules, you’ll pretty much be forced into some basic logic forms. Let’s look at the form of a slave logic handler, and then we’ll look at the master logic next.

Within a slave, therefore, you are likely to have logic blocks that look like the following:

	always @(posedge ACLK)
		// Logic to determine S_AXIS_TREADY

	always @(posedge ACLK)
	if (S_AXIS_TVALID && S_AXIS_TREADY) // plus nothing!
		// Do something
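
To make that form a little more concrete, here’s a small sketch of a slave that accepts one word and is then busy for a few cycles. The module name, the BUSY_CYCLES parameter, the busy_count register, and the o_word output are all hypothetical names I’ve made up for illustration. TREADY is registered, held high when idle, and lowered only following TVALID && TREADY.

```verilog
// Sketch of an AXI stream slave that is busy for BUSY_CYCLES clocks
// after each accepted word.  All names are illustrative assumptions.
module axis_slave_sketch #(
		parameter		DW = 32,
		// Assumed to be at least one
		parameter [7:0]		BUSY_CYCLES = 8'd3
	) (
		input	wire		ACLK,
		input	wire		ARESETN,
		input	wire		S_AXIS_TVALID,
		output	reg		S_AXIS_TREADY,
		input	wire [DW-1:0]	S_AXIS_TDATA,
		// The most recently accepted word
		output	reg [DW-1:0]	o_word
	);

	reg [7:0]	busy_count;

	// Logic to determine S_AXIS_TREADY
	initial	S_AXIS_TREADY = 1'b1;
	always @(posedge ACLK)
	if (!ARESETN)
	begin
		// Idle: hold TREADY high
		S_AXIS_TREADY <= 1'b1;
		busy_count    <= 0;
	end else if (S_AXIS_TVALID && S_AXIS_TREADY)
	begin
		// A word was just accepted, so now we are busy
		S_AXIS_TREADY <= 1'b0;
		busy_count    <= BUSY_CYCLES;
	end else if (!S_AXIS_TREADY)
	begin
		// Count down, and raise TREADY once we are done
		busy_count <= busy_count - 1;
		if (busy_count == 1)
			S_AXIS_TREADY <= 1'b1;
	end

	always @(posedge ACLK)
	if (S_AXIS_TVALID && S_AXIS_TREADY) // plus nothing!
		o_word <= S_AXIS_TDATA;
endmodule
```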

Some time ago, I wrote about the problem associated with adding conditions to the handshake check. Since that time, the worst problem I’ve seen with this handshake has been with AXI memory mapped slaves that can only handle a read or a write request at a given time, but never both. So let’s discuss how to handle that situation quickly.

For example, this VHDL design is quite broken:

	FSM:
	process(STATE_cs, ...) -- not listing all items here
	  variable AWVALID_ARVALID : std_logic_vector(1 downto 0);
	begin
		AWVALID_ARVALID := S_AXI_AWVALID & S_AXI_ARVALID;

		case STATE_cs is
		when IDLE =>
			-- Skipping irrelevant lines ...
			S_AXI_AWREADY <= '1';
			S_AXI_ARREADY <= '1';
			-- ...
			case AWVALID_ARVALID is
			when "10" =>
				-- ...
				STATE_ns <= WRITE_ADDRESS;
			when "01" =>
				-- ...
				STATE_ns <= READ_ADDRESS;
			when others =>
				-- ...
				STATE_ns <= IDLE;
			end case;
		-- ...
		end case;
	end process FSM;

	-- ...

	SEQ_LOG:
	process (S_AXI_ACLK) is
	begin
		if S_AXI_ACLK'event and S_AXI_ACLK = '1' then
			if S_AXI_ARESETN = '0' then
				STATE_cs <= IDLE;
				-- ...
			else
				-- ...
				STATE_cs <= STATE_ns;
			end if;
		end if;
	end process;

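Notice, here, how the designer allowed both AWVALID && AWREADY and ARVALID && ARREADY to be true at the same time. If ever the two were both true at the same time, the combined request would match neither the "10" nor the "01" case. It would fall through to the others branch, the state machine would stay in IDLE, and both of the bursts would be lost.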

Fig 4. If a read and write burst are both received on the same cycle, both will be dropped

My favorite approach to dealing with this situation is to use two combinatorial wires to control reading from the AW and AR skidbuffers respectively.

	// Accept writes, but only if there are no pending reads
	assign	axil_write_ready = skid_awvalid && skid_wvalid
			&& (!S_AXI_BVALID || S_AXI_BREADY)
			&& !axil_read_ready;

	assign	axil_read_ready = skid_arvalid
			&& (!S_AXI_RVALID || S_AXI_RREADY);

This approach fits nicely into the framework we’ve already established in the Easy AXI-lite template.
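
Here’s a sketch of how those two wires might then drive the response channels, so that every accepted request produces exactly one response. I’m assuming the skid buffer outputs (skid_awvalid, skid_wvalid, skid_arvalid) are provided from elsewhere, and the module name is just a placeholder of mine. The register accesses themselves are left out.

```verilog
// Sketch only: arbitrating between read and write requests coming
// out of (assumed) skid buffers.  The module name is hypothetical.
module axil_arbiter_sketch (
		input	wire	S_AXI_ACLK,
		input	wire	S_AXI_ARESETN,
		// Valid outputs of the AW, W, and AR skid buffers
		input	wire	skid_awvalid,
		input	wire	skid_wvalid,
		input	wire	skid_arvalid,
		// These feed the ready inputs of those skid buffers
		output	wire	axil_write_ready,
		output	wire	axil_read_ready,
		// Response channel handshakes
		output	reg	S_AXI_BVALID,
		input	wire	S_AXI_BREADY,
		output	reg	S_AXI_RVALID,
		input	wire	S_AXI_RREADY
	);

	assign	axil_read_ready = skid_arvalid
			&& (!S_AXI_RVALID || S_AXI_RREADY);

	// Accept writes, but only if we are not accepting a read
	assign	axil_write_ready = skid_awvalid && skid_wvalid
			&& (!S_AXI_BVALID || S_AXI_BREADY)
			&& !axil_read_ready;

	initial	S_AXI_BVALID = 1'b0;
	always @(posedge S_AXI_ACLK)
	if (!S_AXI_ARESETN)
		S_AXI_BVALID <= 1'b0;
	else if (axil_write_ready)
		S_AXI_BVALID <= 1'b1;
	else if (S_AXI_BREADY)
		S_AXI_BVALID <= 1'b0;

	initial	S_AXI_RVALID = 1'b0;
	always @(posedge S_AXI_ACLK)
	if (!S_AXI_ARESETN)
		S_AXI_RVALID <= 1'b0;
	else if (axil_read_ready)
		S_AXI_RVALID <= 1'b1;
	else if (S_AXI_RREADY)
		S_AXI_RVALID <= 1'b0;

`ifdef	FORMAL
	// Only one request may ever be accepted on any given cycle
	always @(*)
		assert(!axil_write_ready || !axil_read_ready);
`endif
endmodule
```

A request that loses the arbitration isn’t dropped. It simply waits in its skid buffer until its ready wire goes high.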

This isn’t the only valid approach. Many Xilinx IPs handle this situation by quietly buffering the request that wasn’t accepted so that it can be handled later. They then come back to this buffered request once they finish handling the one they’ve chosen to handle and before raising xREADY again.

That’s how they handle it when it works.

Then there’s their AXI QUAD SPI IP design. This design tries to do something very similar, only … they weren’t consistent in how they built their logic. Hence, in this example, some of their logic prioritizes writes over reads.

	rnw_cmb <= S_AXI4_ARVALID and (not S_AXI4_AWVALID);
	-- ...
	axi_length_cmb <= S_AXI4_ARLEN when (rnw_cmb = '1')
			else
			S_AXI4_AWLEN;

But when you get to their actual state machine, they choose to process reads before writes. First, though, here’s the wr_transaction signal that you’ll see referenced by the state machine below.

	wr_transaction <= S_AXI4_AWVALID and (S_AXI4_WVALID);

Now that you know what wr_transaction is, you can look inside their state machine to see how reads get prioritized over writes:

	when IDLE =>
		if (S_AXI4_ARVALID = '1') then
			-- ...
		elsif (wr_transaction = '1') then
			-- ...
		-- ...

The result is that, if the IP ever gets both a read request and a write request at the same time, it will process and return the read request with the AXI burst parameters of the write request, such as the write burst’s length.

The point here is simple: If AWVALID && AWREADY or ARVALID && ARREADY then a transaction has been accepted. If both are true, then you need to make sure you are processing both transactions properly, or at least buffering one for later processing.

Example Master logic

From the perspective of a master, the logic forms are just a touch different. In this case, I’m more set in my ways. Indeed, I’ve gotten to the point where I always use the following logic form for any AXI master handshake.

	// OPT_LOWPOWER is a parameter telling me when to force unused signals
	// to a known value, to reduce any unnecessary signal toggling within
	// an FPGA.
	parameter [0:0] OPT_LOWPOWER = 1'b0;

	always @(posedge ACLK)
	if (!ARESETN)
		M_AXIS_TVALID <= 0;
	else if (!M_AXIS_TVALID || M_AXIS_TREADY)
		M_AXIS_TVALID <= next_valid_signal;

	always @(posedge ACLK)
	if (OPT_LOWPOWER && !ARESETN)
		M_AXIS_TDATA <= 0;
	else if (!M_AXIS_TVALID || M_AXIS_TREADY)
	begin
		M_AXIS_TDATA <= next_data;

		if (OPT_LOWPOWER && !next_valid_signal)
			M_AXIS_TDATA <= 0;
	end

Of course, this assumes the existence of the (possibly combinatorial) signals next_valid_signal and next_data. No, that doesn’t mean I’m a die hard believer in two process state machines–but that’s another story for another day.
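
As one possible illustration of where next_valid_signal and next_data might come from, here’s a sketch of a master that sends an incrementing count whenever it’s enabled. The i_enable input, the counter register, and the module name are hypothetical, chosen just for this example.

```verilog
// Sketch of an AXI stream master producing an incrementing count.
// i_enable and counter are illustrative assumptions.
module axis_counter_master #(
		parameter	DW = 8
	) (
		input	wire		ACLK,
		input	wire		ARESETN,
		// Hypothetical: high whenever we have something to send
		input	wire		i_enable,
		output	reg		M_AXIS_TVALID,
		input	wire		M_AXIS_TREADY,
		output	reg [DW-1:0]	M_AXIS_TDATA
	);

	reg [DW-1:0]	counter;

	wire		next_valid_signal = i_enable;
	wire [DW-1:0]	next_data = counter;

	// Advance the source only when a value moves into the output
	always @(posedge ACLK)
	if (!ARESETN)
		counter <= 0;
	else if ((!M_AXIS_TVALID || M_AXIS_TREADY) && next_valid_signal)
		counter <= counter + 1;

	initial	M_AXIS_TVALID = 1'b0;
	always @(posedge ACLK)
	if (!ARESETN)
		M_AXIS_TVALID <= 0;
	else if (!M_AXIS_TVALID || M_AXIS_TREADY)
		M_AXIS_TVALID <= next_valid_signal;

	always @(posedge ACLK)
	if (!M_AXIS_TVALID || M_AXIS_TREADY)
		M_AXIS_TDATA <= next_data;
endmodule
```

The key is that the counter only steps forward under the same !M_AXIS_TVALID || M_AXIS_TREADY condition that loads the output registers, so no value is skipped or repeated during a stall.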

I will say that most of the AXI handshaking bugs I come across in masters come from not following this form.

For example, you can find the following logic in Xilinx’s AXI stream master template design:

	assign	axis_tlast = (read_pointer == NUMBER_OF_ITEMS-1);

	always @(posedge  ACLK)
	if (!ARESETN)
		axis_tlast_delay <= 1'b0;
	else
		axis_tlast_delay <= axis_tlast;

	assign	M_AXIS_TLAST = axis_tlast_delay;

See the bug? If not, check out Fig. 5 below.

Fig 5. Xilinx's broken AXI stream master: TLAST changes when it should be stalled

If M_AXIS_TVALID && !M_AXIS_TREADY on the penultimate beat of the burst, M_AXIS_TLAST will get set while the channel is supposed to be stalled, in violation of the protocol. This happens because axis_tlast_delay is updated on every clock, whether or not the current beat has been accepted. You might find yourself surprised later when the data packet arrives with the wrong number of items in it.
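
For comparison, here’s a sketch of how TLAST might be generated using the master form from above. This is not Xilinx’s logic. It’s a minimal packet source of my own, assuming the data is simply the item’s index and that NUMBER_OF_ITEMS fits within DW bits. TLAST is loaded under the same condition as TVALID and TDATA, so it cannot change during a stall.

```verilog
// Sketch of a packet source whose TLAST follows the handshake.
// The module name and the choice of data are assumptions.
module axis_packet_master #(
		parameter	DW = 8,
		parameter	NUMBER_OF_ITEMS = 8
	) (
		input	wire		ACLK,
		input	wire		ARESETN,
		output	reg		M_AXIS_TVALID,
		input	wire		M_AXIS_TREADY,
		output	reg [DW-1:0]	M_AXIS_TDATA,
		output	reg		M_AXIS_TLAST
	);

	reg [DW-1:0]	read_pointer;

	// True whenever the output registers are free to be loaded
	wire	load = (!M_AXIS_TVALID || M_AXIS_TREADY);
	wire	next_last = (read_pointer == NUMBER_OF_ITEMS-1);

	always @(posedge ACLK)
	if (!ARESETN)
		read_pointer <= 0;
	else if (load)
		read_pointer <= (next_last) ? 0 : (read_pointer + 1);

	initial	M_AXIS_TVALID = 1'b0;
	always @(posedge ACLK)
	if (!ARESETN)
		M_AXIS_TVALID <= 1'b0;
	else if (load)
		// This source always has another item to send
		M_AXIS_TVALID <= 1'b1;

	// TDATA and TLAST only ever change together, and only when
	// the channel isn't stalled
	always @(posedge ACLK)
	if (load)
	begin
		M_AXIS_TDATA <= read_pointer;
		M_AXIS_TLAST <= next_last;
	end
endmodule
```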

These sorts of problems aren’t limited to AXI stream designs, nor are they limited to Xilinx’s templates. For example, here’s the same sort of bug in their AXI Ethernet-lite IP. Again, this bug is due to the fact that they didn’t follow the form above.

  AXI4_RDATA_GEN : if (C_S_AXI_PROTOCOL = "AXI4") generate
      AXI_READ_OUTPUT_P: process (S_AXI_ACLK) is
      begin
          if (S_AXI_ACLK'event and S_AXI_ACLK = '1') then
              if (S_AXI_ARESETN=RST_ACTIVE) then
                  S_AXI_RDATA  <= (others =>'0');
              elsif S_AXI_RREADY = '1' then
                  S_AXI_RDATA   <= IP2Bus_Data;
              end if;
          end if;
      end process AXI_READ_OUTPUT_P;

In this case, if S_AXI_RVALID and S_AXI_RREADY are both low, the requested data will not be placed on the bus. Instead, the design will read the wrong data from the IP on the first beat of any return burst.

If you simply follow the logic templates above, you won’t make this mistake.
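
Applied to this read data register, the form might look something like the sketch below. I’ve written it in Verilog rather than VHDL to match the templates above, and I’m assuming that IP2Bus_Data holds the next word to return whenever the channel is free to load. The module wrapper is only there to make the example complete.

```verilog
// Sketch: loading RDATA whenever the read channel isn't stalled.
// The module wrapper and its name are illustrative assumptions.
module rdata_sketch #(
		parameter	DW = 32
	) (
		input	wire		S_AXI_ACLK,
		input	wire		S_AXI_RVALID,
		input	wire		S_AXI_RREADY,
		input	wire [DW-1:0]	IP2Bus_Data,
		output	reg [DW-1:0]	S_AXI_RDATA
	);

	// Load when idle (!RVALID) or when the current beat is being
	// accepted (RREADY).  Hold only if RVALID && !RREADY.
	always @(posedge S_AXI_ACLK)
	if (!S_AXI_RVALID || S_AXI_RREADY)
		S_AXI_RDATA <= IP2Bus_Data;
endmodule
```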

Capturing the rules in formal properties

These basic handshaking rules are also really easy to capture in some simple formal properties.

For example, we can check whether or not TVALID is properly reset. The first step to such a check is to assume the existence of an initial reset. For this, we can use f_past_valid. This is simply a piece of helper logic that I often create. It’s just something that is clear on the first cycle, set on every other cycle, and only used during formal verification.

	reg	f_past_valid;

	initial	f_past_valid = 0;
	always @(posedge ACLK)
		f_past_valid <= 1;

On that very first cycle, we can assume that the reset must be active.

	always @(*)
	if (!f_past_valid)
		assume(!ARESETN);

This is the only reset constraint required. Beware, though, the formal solver might toggle your reset line later when you aren’t expecting it.

We can also use f_past_valid to handle initial value checks. In particular, if f_past_valid is clear then our TVALID logic should have any initial value we’ve given to it–zero. Similarly, if the reset was active on the last cycle the TVALID should also be low.

	always @(posedge ACLK)
	if (!f_past_valid || $past(!ARESETN))
	begin
		assert(!M_AXIS_TVALID);

This check, however, assumes that you’ve set M_AXIS_TVALID to zero initially.

	initial	M_AXIS_TVALID = 1'b0;

If you’d rather not use initial values, the check is easily modified to exclude the first clock cycle.

	always @(posedge ACLK)
	if (!f_past_valid || $past(!ARESETN))
	begin
		if (f_past_valid)
			assert(!M_AXIS_TVALID);

In this case, the assertion is only checked on clock cycles following the first one. Since the first clock cycle includes the reset, this guarantees that M_AXIS_TVALID is clear on the second clock cycle and so we know our property holds.

While AXI does not require an asynchronous reset, it does permit one. The check is easily modified to handle an environment with an asynchronous reset.

	always @(posedge ACLK)
	if (!ARESETN || $past(!ARESETN))
	begin
		assert(!M_AXIS_TVALID);

Reading this out, it says that if the reset is ever active then M_AXIS_TVALID should be clear. Likewise, if the reset was active on the last clock cycle, then M_AXIS_TVALID should also be clear.

From this beginning, we can turn our attention to the handshake itself.

Here, the rule is simple: if the stream stalled on the last cycle, then all of the values must remain the same on this cycle. That means that M_AXIS_TVALID must remain true, and everything else must remain stable.

	end else if ($past(M_AXIS_TVALID && !M_AXIS_TREADY))
	begin
		assert(M_AXIS_TVALID);
		assert($stable(M_AXIS_TDATA));

If you had more than just the M_AXIS_TDATA connection, you’d also want to assert that those signals are stable as well.

		//
		// Assert the same for any other associated
		// data that might be present: TLAST, TID,
		// TDEST, TSTRB, TKEEP, TUSER, etc.
		//
		// Only assert the signals you actually have in
		// your interface.
		assert($stable(M_AXIS_TLAST));
		assert($stable(M_AXIS_TSTRB));
		assert($stable(M_AXIS_TKEEP));
		assert($stable(M_AXIS_TID));
		assert($stable(M_AXIS_TDEST));
		assert($stable(M_AXIS_TUSER));
	end

There. That’s all that’s required of an AXI stream handshake.
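
The same checks can be turned around for a slave port. There, TVALID and the payload are inputs, so the properties become assumptions about the environment rather than assertions about the design. The sketch below gathers them into one hypothetical module of my own naming. It assumes a flow such as SymbiYosys, where the Verilog is read in formal mode so that FORMAL is defined and assume, $past, and $stable are available.

```verilog
// Sketch: handshake assumptions for the inputs of an AXI stream
// slave port.  The module name is a hypothetical placeholder.
module faxis_slave_props #(
		parameter	DW = 8
	) (
		input	wire		ACLK,
		input	wire		ARESETN,
		input	wire		S_AXIS_TVALID,
		input	wire		S_AXIS_TREADY,
		input	wire [DW-1:0]	S_AXIS_TDATA,
		input	wire		S_AXIS_TLAST
	);

`ifdef	FORMAL
	reg	f_past_valid;

	initial	f_past_valid = 0;
	always @(posedge ACLK)
		f_past_valid <= 1;

	// Start everything off in reset
	always @(*)
	if (!f_past_valid)
		assume(!ARESETN);

	always @(posedge ACLK)
	if (!f_past_valid || $past(!ARESETN))
	begin
		// TVALID must be low following any reset
		if (f_past_valid)
			assume(!S_AXIS_TVALID);
	end else if ($past(S_AXIS_TVALID && !S_AXIS_TREADY))
	begin
		// Stalled: nothing is allowed to change
		assume(S_AXIS_TVALID);
		assume($stable(S_AXIS_TDATA));
		assume($stable(S_AXIS_TLAST));
	end
`endif
endmodule
```

Only S_AXIS_TREADY is an output of the slave here, and the handshake places no stability requirement on it, so there is nothing to assert about it in this sketch.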

Conclusion

Getting AXI handshaking right is a basic requirement of working with anything AXI-related. The logic templates above should help anyone on that journey. As you can see from the examples, however, there are plenty of ways of getting this wrong.