I have been unable to find an answer, possibly due to me being unable to put specific enough nomenclature on the involved processes.
I use Vitis HLS to synthesize designs where one call of the main function is one clock cycle long, being pipelined of course. This works fine for almost all of our cases. Where this is not possible (i.e. for components where we need to guarantee certain latencies / pipelining depths) I use verilog.
The goal is to transfer data via DMA to a Zynq-7000's memory and THEN issue an interrupt to let the PS know that the DMA transfer is finished.
Suppose I have a Vitis HLS project, where the PS can initiate a DMA transfer of uint32s using (a rising edge on a signal in) an s_axilite interface to my component, like in the code below:
#include <cstdint>
void Example
(
uint32_t *dmaRegion,
bool &intrSig,
volatile bool writeNow
)
{
#pragma HLS PIPELINE II=1
#pragma HLS INLINE RECURSIVE
#pragma HLS INTERFACE s_axilite port=return bundle=registers
#pragma HLS INTERFACE ap_ctrl_none port=return
#pragma HLS INTERFACE m_axi port=dmaRegion offset=slave bundle=x
#pragma HLS INTERFACE s_axilite port=dmaRegion bundle=registers
#pragma HLS INTERFACE ap_none port=dmaRegion
#pragma HLS INTERFACE s_axilite port=writeNow bundle=registers
#pragma HLS INTERFACE ap_none port=writeNow
#pragma HLS INTERFACE ap_none port=intrSig
static bool lastWriteNow { false };
static uint32_t Ctr { 0 };
bool intr = false;
if (!lastWriteNow && writeNow)
{
Ctr++;
dmaRegion[10] = Ctr;
intr = true;
}
intrSig = intr;
lastWriteNow = writeNow;
}
Now, this seems to work fine and cause a 1-clock-cycle-pulse interrupt as long as WREADY is driven high by the Zynq (and through a SmartConnect to my component) and I have found some examples where this is done this way. Also, the PS grabs the correct data from the DDR memory (L2 Data Cache has been disabled for this memory region) directly after the interrupt.
However, what will happen if for example more AXI masters are trying to drive the Smart Connect and cause congestion, effectively causing WREADY
to go low for this component? In my tests, where I drove the WREADY
signal of the AXI Smart Connect Master Interface to a constant zero to simulate (permanent) congestion, the interrupt signal (and WVALID
) was driven to a permanent high, which would mean.... what? That the HLS design blocked inside the if clause? I do not quite get it as it seems to me that this would contradict the II=1 constraint (which is reported by Vitis HLS as being satisfied).
In a way it makes sense of course, since WVALID
must go high when data is available and it must stay high until WREADY
is high as well. But why the interrupt line goes (and stays) high no matter what even though the transaction is not yet finished evades me.
Is this at all possible with any guarantees about the m_axi interface, or will I have to find other solutions?
Any hint and information (especially background information about that behaviour) is very much appreciated.
Edit:
but this causes the interrupt to stay high forever:
Of course, the transaction cannot finish. But it seems I have no way of unblocking the design so long as the AXI bus is congested.
When I compile your code and look at the schedule view this is the result:
What I understand is that there is phi node (term borrowed from LLVM) which means that the value of intrSig
can't be set before finishing the AXI4 write response. Since this is then converted into RTL the signal must have a value, and if it goes high, then there is congestion on the AXI4, it will stay high until the AXI transaction has finished.
I tried to look into the HDL, with not much luck. I only got an intuition though which I try to share:
The red wires are the ones that eventually drive the intrSig
signal. The flip flop is driven to 1 through the SET port, and to 0 by the RST port.
Long way to intrSig
from this FF, but it eventually gets there:
The SET signal is driven by combinatorial logic using writeNow
:
And lastly the wready
goes a long way but it interferes to the pipeline chain of registers that eventually drives the intrSig
.
Is this proof of what is happening? Unfortunately no, but there are some hints that the outcome of the m_axi transaction stops the interrupt pipeline to advance.
I don't know if clearing the wready
signal actually simulates congestion, the axi protocol starts with a awready
and I expect a congested interconnect to not accept transactions from the beginning.
Also, I would instantiate your IP alone, then attach some AXI VIP (axi verification IPs) which are provided in Vivado by Xilinx and programmed in SystemVerilog to give you the output you want, while recording all your data. You will also be able to look all the waveforms and detect where your issues are.
You can have your IP write into one of these AXI4VIP configured in slave mode, or you can write to a BRAM.
I'll leave here some documentation.