Tags: verilog, fpga, xilinx, synthesis

Verilog asynchronous memory in Xilinx


I am trying to create a memory shift operation in Verilog and am wondering what the best way to do it is. Example code:

reg [MSB:0] a [0:NO_OF_LOCATIONS];
// after some processing
for (i = 0; i <= NO_OF_LOCATIONS; i = i+1)
  a[i] = a[i+1];

If I use a ROM in Xilinx, it can only do synchronous writes, and I need to do all the shifts within one clock cycle. If I use a memory as above, I am not sure whether the on-board implementation will result in metastability or don't-care propagation.

Also, what would be the best way to do this instead of using a for loop?


Solution

  • I am assuming this is part of a clocked synchronous block, i.e. something like the following (it would not make much sense otherwise and you wrote "I need to do all the shifts within one clock cycle", which implies that this is part of a synchronous design):

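    reg [MSB:0] a [0:NO_OF_LOCATIONS];
    integer i;  // loop index; the loop is unrolled by the synthesis tool
    
    always @(posedge clk)
      if (...) begin
        // shift every element one position towards a[0]
        for (i = 0; i < NO_OF_LOCATIONS; i = i+1)
          a[i] <= a[i+1];
        // the last element gets a new value instead of reading a[NO_OF_LOCATIONS+1]
        a[NO_OF_LOCATIONS] <= ...;
      end
    
    // also use a[] somewhere
    wire [MSB:0] a0;
    assign a0 = a[0];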
    

    Btw: a[] has NO_OF_LOCATIONS+1 locations in it. I'm not sure if this is intended but I just left it that way. Usually the range of a[] would be written as [0:NO_OF_LOCATIONS-1] for NO_OF_LOCATIONS memory locations.

    Notice that I have changed the assignment from = to <=. When you assign to something in a clocked always block and it is read anywhere outside that always block, you must use a non-blocking assignment (<=). Otherwise you get race conditions in simulation, which can lead to simulation-synthesis mismatches.
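
    A minimal sketch of why this matters, using a made-up test module (nb_demo, x and y are names invented for this illustration). Two clocked blocks read each other's register. With <= both blocks sample the old values before either register is updated, so the values swap regardless of the order in which the simulator runs the blocks:

    module nb_demo;
      reg       clk = 0;
      reg [3:0] x = 4'd1;
      reg [3:0] y = 4'd2;

      // Each block reads a register that the other block writes.
      always @(posedge clk) x <= y;
      always @(posedge clk) y <= x;

      initial begin
        #1 clk = 1;                        // single rising edge
        #1 $display("x=%0d y=%0d", x, y);  // prints x=2 y=1
        $finish;
      end
    endmodule

    If both assignments were written with =, the result would depend on which always block the simulator happens to execute first, while the synthesized flip-flops would still swap.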

    Also notice that I have factored out the assignment to a[NO_OF_LOCATIONS]. In your loop it would have got its value from a[NO_OF_LOCATIONS+1], which is out of bounds and thus always undefined. Without that change the synthesis tool would be right to assume that all elements of a[] are constant undefined, and it would simply replace all reads on that array with constant 'bx.

    This is perfectly fine code. I'm not sure why you brought up metastability, but as long as the ... expressions are synchronous to clk there is no metastability in this circuit.

    But it does not really model a memory. Well, it does, but one with NO_OF_LOCATIONS write ports and NO_OF_LOCATIONS read ports (only counting the read ports inferred by that for-loop). Even if you had such a memory, it would be very inefficient to use it like that: the expensive thing about a memory port is its capability to address any memory location, but all the ports in this example have a constant memory address (i is constant after unrolling the for-loop). So this huge number of read and write ports will force the synthesis tool to implement that memory as a pile of individual registers. However, if your target architecture has dedicated shift register resources (as many FPGAs do), then this might be transformed to a shift register.
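
    To make the "constant address" point concrete, here is a hand-unrolled sketch of the loop for NO_OF_LOCATIONS = 3. The module name unrolled_shift and the ports en and din are made up for illustration (they stand in for the elided ... expressions). This is what the loop amounts to after unrolling, not the output of any particular tool:

    module unrolled_shift #(parameter MSB = 7) (
      input          clk,
      input          en,   // hypothetical: stands in for the if (...) condition
      input  [MSB:0] din,  // hypothetical: stands in for the new value
      output [MSB:0] a0
    );
      reg [MSB:0] a [0:3];

      always @(posedge clk)
        if (en) begin
          a[0] <= a[1];
          a[1] <= a[2];
          a[2] <= a[3];
          a[3] <= din;
        end

      assign a0 = a[0];
    endmodule

    Every index is a literal, so no address decoding is needed anywhere. Each a[i] is just a register fed by its neighbour.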

    For large values of NO_OF_LOCATIONS you might consider implementing a FIFO using cursor(s) into the memory instead of shifting the content of the entire memory from one element to the next. This infers only one write port and none of the read ports that the for-loop created. For example:

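    reg [MSB:0] a [0:NO_OF_LOCATIONS];
    
    // enough bits to address all NO_OF_LOCATIONS+1 entries
    parameter CURSOR_WIDTH = $clog2(NO_OF_LOCATIONS+1);
    reg [CURSOR_WIDTH-1:0] cursor = 0;
    
    always @(posedge clk)
      if (...) begin
        // overwrite the oldest entry, then advance the cursor with wrap-around
        a[cursor] <= ...;
        cursor <= (cursor == NO_OF_LOCATIONS) ? 0 : cursor+1;
      end
    
    // after each update, cursor points at the oldest remaining entry
    assign a0 = a[cursor];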
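
    The cursor version can be tried out in simulation with a sketch like the following. The module name cursor_delay, the ports en and din, and the testbench are assumptions made for illustration; en and din replace the elided ... expressions:

    module cursor_delay #(parameter MSB = 7, parameter NO_OF_LOCATIONS = 3) (
      input          clk,
      input          en,   // hypothetical write enable
      input  [MSB:0] din,  // hypothetical input value
      output [MSB:0] a0
    );
      localparam CURSOR_WIDTH = $clog2(NO_OF_LOCATIONS+1);

      reg [MSB:0] a [0:NO_OF_LOCATIONS];
      reg [CURSOR_WIDTH-1:0] cursor = 0;

      always @(posedge clk)
        if (en) begin
          a[cursor] <= din;
          cursor <= (cursor == NO_OF_LOCATIONS) ? 0 : cursor + 1;
        end

      assign a0 = a[cursor];  // asynchronous read of the oldest entry
    endmodule

    module cursor_delay_tb;
      reg        clk = 0;
      reg        en  = 1;
      reg  [7:0] din = 0;
      wire [7:0] a0;
      integer    k;

      cursor_delay #(.MSB(7), .NO_OF_LOCATIONS(3)) dut (
        .clk(clk), .en(en), .din(din), .a0(a0)
      );

      initial begin
        for (k = 1; k <= 6; k = k + 1) begin
          din = k;
          #1 clk = 1;  // rising edge: write din and advance the cursor
          #1 clk = 0;
          $display("wrote %0d, a0 = %0d", k, a0);
        end
        $finish;
      end
    endmodule

    With four entries, a0 is x after the first three writes (those locations have not been written yet) and then shows 1, 2, 3 after writes 4, 5 and 6. That is the same latency as the shifting version, where a value needs NO_OF_LOCATIONS+1 enabled clock edges to reach a[0]. One thing to keep in mind on Xilinx devices: block RAM reads are synchronous, so the asynchronous read of a[cursor] would typically end up in distributed (LUT) RAM. Registering a0 is the usual change if block RAM is wanted, at the cost of one extra cycle of latency.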