Search code examples
verilograce-conditionhardware-interfaceiverilog

Bit shifting in sequential block fails, in combinational not. Why?


I am debugging a piece of Verilog code for days, particularly sending and receiving bytes from an FX2LP (Cypress CY7C68016A) USB controller. Without going into many details, data is sent and transmitted byte-wise in each cycle. For my test, I use a 16 byte buffer which I first fill and then transmit back (echo test).

The significant part of my code looks like:

reg [127:0] dataBuf; // 16 byte buffer for USB data

reg [7:0] cntByte; // counter for number of bytes
reg [7:0] nextCntByte;

reg shiftBufRx, shiftBufTx; // flags whether buffer should be shifted
reg [7:0] currentByte; // current read byte

// in transmit cycle, byte is read from USB_DATAOUT
assign USB_DATAOUT = dataBuf[7:0];

always @(posedge FIFO_CLK) begin
    // update state variables
    CurrentState <= NextState;  
    cntByte <= nextCntByte;

    if(shiftBufRx) begin // cycle was a receive
        dataBuf <= { currentByte , dataBuf[127:8] };
    end
    if(shiftBufTx) begin // cycle was a transmit
        dataBuf <= { dataBuf[127-8:0] , 8'h00 };
    end
end

always @(*) begin
    // avoid race conditions
    NextState = CurrentState;
    nextCntByte = cntByte;
    nextDataBuf = dataBuf;
    currentByte = 0;
    shiftBufRx = 0;
    shiftBufTx = 0;

    case(CurrentState)
        [...]
        STATE_USBRX: begin
            if(cntByte < 16) begin
                nextCntByte = cntByte + 1;
                currentByte = USB_DATAIN; // contains received byte in receive cycle
                shiftBufRx = 1; // shift buffer after this cycle
            end
            [...]
        end
        STATE_USBTX: begin
            if(cntByte < 15) begin
                shiftBufTx = 1; // shift buffer after this cycle
                nextCntByte = cntByte + 1;
            end
            [...]
        end
        [...]
    endcase
end

This code works perfectly in simulation (iVerilog). But when synthesizing and executing on an Altera Cyclone, I get very strange errors. For example, most of the time the first byte transmitted to the FPGA is read for each byte. For example, sending 11 22 33 44 55 66 ... would receive 11 11 11 11 11 11 ....

Now when I instead introduce a new variable:

reg [127:0] nextDataBuf;

and replace the part in the sequential always @(posedge FIFO_CLK) block with:

if(shiftBufRx) begin
    dataBuf <= nextDataBuf;
end
if(shiftBufTx) begin
    dataBuf <= nextDataBuf;
end

and in the combinational part:

        STATE_USBRX: begin
            if(cntByte < 16) begin
                nextCntByte = cntByte + 1;
                //currentByte = FIFO_DATAIN;
                nextDataBuf = { dataBuf[127-8:0] , FIFO_DATAIN };
                shiftBufRx = 1;
            end
            [...]
        end
        STATE_USBTX: begin
            if(cntByte < 15) begin
                shiftBufTx = 1;
                nextCntByte = cntByte + 1;
                nextDataBuf = { 8'h00 , dataBuf[127:8] };
            end
            [...]
        end

Then it works!

That means: All I am doing is to move the shifting of the register from the sequential block to the combinational block.

I do not see any race conditions in my code and in simulation (iVerilog) both versions are identical.

What could be the reason?


Solution

  • First of all you should definitely post a MVCE. This is almost an MVCE, but it doesn't contain a testbench and the ellipsis [...] obscures things that could be relevant like the state transitions from RX to TX.

    Never the less, as it was close, I built a small test bench and added some minimal code to get it running. In my case it did NOT work "perfectly in simulation". In fact, as @greg pointed out, there is a definite error in the shift register.

    In your before code, you are shifting left out of register, but are transmitting the low bytes. The first cycle they will be 8'h11 and afterward will either be 8'h00 or stay the same, depending on when shiftBufTx is true, and this depends on logic that isn't posted. There is enough posted to see that it cannot possibly work correctly as is.

        assign USB_DATAOUT = dataBuf[7:0];
        ...
        if(shiftBufTx) begin // cycle was a transmit
            dataBuf <= { dataBuf[127-8:0] , 8'h00 };
        end
    

    In the after code, you are shifting right and this is why it works:

    nextDataBuf = { 8'h00 , dataBuf[127:8] };
    

    It is possible that the sim does not match synthesis for a similar reason. The code looks fairly synthesizable. Some suggestions:

    • There is no reason to have independent flags for Tx and Rx
    • As @serge pointed out, what happens if they conflict? This may sim okay but will not work in hardware.
    • You could just use a single flag and set it in the output of each state.
    • If you need more than one condition, you must have an else that insures that only one path can modify each non-blocking assignment to a var is active at any clock edge:

      if(shiftBufRx) begin // cycle was a receive
          dataBuf <= { currentByte , dataBuf[127:8] };
      end
      **else** if(shiftBufTx) begin // cycle was a transmit
          dataBuf <= **{ 8'h00 , dataBuf[127:8] }**;
      end
      

    Why did it appear to work in sim? Beyond small typos or transpositions of supposedly identical code, the sim is very forgiving of multiple writes to the same var (the later will "win").

    But the synthesis tool must intuit your intent into registers. Depending on how complex the logic is, it might not be able to know that both flags can't be true at the same time. Sometimes it will interpret things like this as two parallel "write enable" conditions for dataBuf and create two parallel registers. The 2nd copy probably just never shifted back out after being set to 8'h11 which matches the symptoms. You should examine the actual output circuit after elaboration to detect this kind of error. Unfortunately not all of them fail or even give obvious warnings. In Xilinx Vivado you can pull up the schematic.