I've been modifying the RoCC interface of the Rocket core, and so far I have it working as a scratchpad that we can load data from and store data to using the custom0 instruction. I'm facing an issue when I try to push and pop data on a stack memory that I created in Chisel and instantiated inside my scratchpad. I'm using the same custom0 instruction, with different values of the funct field, to push to and pop from the stack.
The Chisel code for both the scratchpad and my stack is shown below.
class Comm_Scratchpad(n: Int = 8)(implicit p: Parameters) extends RoCC()(p) {
  val Stack_Lo = Module(new Stack_Snd(64))
  val Scratchpad_Loc = Mem(UInt(width = xLen), n)
  val busy = Reg(init=Vec(Bool(false), n))
  val cmd = Queue(io.cmd) // wired to the decoupled command coming into the RoCC interface from the Rocket core
  val funct = cmd.bits.inst.funct // selects which operation to perform
  val addr = cmd.bits.inst.rs2(log2Up(n)-1,0) // converts the address specified by the user (0,1,2,3) into a scratchpad address
  val doWrite = funct === UInt(0) // whether the user wants to do a write; derives its value from funct
  val doRead = funct === UInt(1) // ignored for now, as every operation performs a read by default
  val doPush = funct === UInt(2) // funct value used to push to the LIFO with the same custom instruction
  val doPop = funct === UInt(3) // funct value used to pop back from the LIFO
  Stack_Lo.io.en := Mux((doPush || doPop), Bool(true), Bool(false))
  Stack_Lo.io.push := Mux(doPush, Bool(true), Bool(false))
  Stack_Lo.io.pop := Mux(doPop, Bool(true), Bool(false))

  // datapath
  val data_in = cmd.bits.rs1
  val wdata = data_in
  val rdata = Mux(doPop, Stack_Lo.io.dataOut, Scratchpad_Loc(addr))
  Stack_Lo.io.dataIn := Mux(doPush, data_in, UInt(0))
  when (cmd.fire() && (doWrite)) {
    Scratchpad_Loc(addr) := wdata
  }

  val doResp = cmd.bits.inst.xd
  val stallReg = busy(addr)
  //val stallLoad = doLoad && !io.mem.req.ready
  val stallResp = doResp && !io.resp.ready && (doPush || doPop)
  cmd.ready := !stallReg && !stallResp // removed stallLoad as we are not loading from memory
  // command resolved if no stalls AND not issuing a load that will need a request

  // PROC RESPONSE INTERFACE
  io.resp.valid := cmd.valid && doResp && !stallReg //&& !stallLoad
  // valid response if valid command, need a response, and no stalls
  io.resp.bits.rd := cmd.bits.inst.rd
  // Must respond with the appropriate tag or undefined behavior
  io.resp.bits.data := rdata
  // Semantics is to always send out prior accumulator register value
  io.busy := cmd.valid || busy.reduce(_||_)
  // Be busy when have pending memory requests or committed possibility of pending requests
  io.interrupt := Bool(false)
  // Set this true to trigger an interrupt on the processor (please refer to supervisor documentation)
}
class Stack_IO(implicit p: Parameters) extends CoreBundle {
  val dataIn = UInt(INPUT, 64)
  val dataOut = UInt(OUTPUT, 64)
  val push = Bool(INPUT)
  val pop = Bool(INPUT)
  val en = Bool(INPUT)
}
class Stack_Snd(depth: Int)(implicit p: Parameters) extends CoreModule {
  val io = new Stack_IO
  // declare the memory for the stack
  val stack_mem = Mem(UInt(width = 64), depth)
  val sp = Reg(init = UInt(0, width = log2Up(depth)))
  val dataOut = Reg(init = UInt(0, width = 64))
  // Push condition - make sure stack isn't full
  when(io.en && io.push && (sp != UInt(depth-1))) {
    stack_mem(sp) := io.dataIn
    sp := sp + UInt(1)
  }
  // Pop condition - make sure the stack isn't empty
  .elsewhen(io.en && io.pop && (sp > UInt(0))) {
    dataOut := stack_mem(sp - UInt(1))
    sp := sp - UInt(1)
  }
  io.dataOut := dataOut
}
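To make the intended behaviour of Stack_Snd easier to reason about on the host, here is a minimal C sketch of the same push/pop rules, where one call to stack_clock stands for one rising clock edge. The names stack_model_t, stack_clock and replay_test_sequence are made up for this illustration and are not part of the Chisel design; the model also assumes each strobe is held high for exactly one cycle, which the RoCC wrapper above does not guarantee.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define STACK_DEPTH 64u  /* same depth the scratchpad passes to Stack_Snd */

/* State held in the memory and registers of the stack module. */
typedef struct {
    uint64_t mem[STACK_DEPTH];
    unsigned sp;        /* index of the next free slot */
    uint64_t data_out;  /* registered output, updated only on a pop */
} stack_model_t;

/* Model one rising clock edge with the given input values. */
static void stack_clock(stack_model_t *s, bool en, bool push, bool pop,
                        uint64_t data_in)
{
    if (en && push && s->sp != STACK_DEPTH - 1u) {
        /* Push has priority if both strobes are high (when/.elsewhen chain). */
        s->mem[s->sp] = data_in;
        s->sp += 1;
    } else if (en && pop && s->sp > 0) {
        s->data_out = s->mem[s->sp - 1];
        s->sp -= 1;
    }
}

/* Push 1..6 for one cycle each, then pop six times, recording each result. */
static void replay_test_sequence(uint64_t popped[6])
{
    stack_model_t s = {0};
    for (uint64_t v = 1; v <= 6; v++) {
        stack_clock(&s, true, true, false, v);
    }
    for (int i = 0; i < 6; i++) {
        stack_clock(&s, true, false, true, 0);
        popped[i] = s.data_out;  /* valid only after the clock edge */
    }
}
```

Calling replay_test_sequence from a small main fills the array in last-in, first-out order. Two things the model makes visible: dataOut is a register, so a popped value only appears after the clock edge, and nothing in the module detects edges, so a strobe held high for several cycles acts once per cycle.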
The C code that I'm running through the front-end server is below.
// The following is a RISC-V program to test the functionality of the
// dummy RoCC accelerator.
// Compile with riscv64-unknown-elf-gcc dummy_rocc_test.c
// Run with spike --extension=dummy_rocc pk a.out
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <stdint.h>
int main() {
  uint64_t x = 1, y = 456, z = 0, a=2, b=3, c=4, d=5, e=6;
  // Push six values: custom0 with funct = 2
  asm volatile ("custom0 x0, %0, 0, 2" : : "r"(x));
  asm volatile ("custom0 x0, %0, 1, 2" : : "r"(a));
  asm volatile ("custom0 x0, %0, 2, 2" : : "r"(b));
  asm volatile ("custom0 x0, %0, 3, 2" : : "r"(c));
  asm volatile ("custom0 x0, %0, 4, 2" : : "r"(d));
  asm volatile ("custom0 x0, %0, 5, 2" : : "r"(e));
  // Pop them back: custom0 with funct = 3
  // z is a uint64_t, so print it with PRIu64 rather than %d
  asm volatile ("custom0 %0, x0, 0, 3" : "=r"(z));
  printf("The popped value of z 0 is:- %" PRIu64 " \n", z);
  asm volatile ("custom0 %0, x0, 1, 3" : "=r"(z));
  printf("The popped value of z 1 is:- %" PRIu64 " \n", z);
  asm volatile ("custom0 %0, x0, 2, 3" : "=r"(z));
  printf("The popped value of z 2 is:- %" PRIu64 " \n", z);
  asm volatile ("custom0 %0, x0, 3, 3" : "=r"(z));
  printf("The popped value of z 3 is:- %" PRIu64 " \n", z);
  asm volatile ("custom0 %0, x0, 4, 3" : "=r"(z));
  printf("The popped value of z 4 is:- %" PRIu64 " \n", z);
  asm volatile ("custom0 %0, x0, 5, 3" : "=r"(z));
  printf("The popped value of z 5 is:- %" PRIu64 " \n", z);
  printf("success!\n");
  return 0;
}
As you can see, I'm trying to push the numbers 1 to 6, but when I run this code on the ZedBoard, this is the output I get:
root@zynq:~# ./fesvr-zynq pk /sdcard/Custom\ elfs/rocc_fifo
The popped value of z 0 is:- 1
The popped value of z 1 is:- 0
The popped value of z 2 is:- 2
The popped value of z 3 is:- 2
The popped value of z 4 is:- 2
The popped value of z 5 is:- 2
Ideally it should pop 6, 5, 4, 3, 2, 1.
I resolved this issue. When the stack's push signal is high, the pop signal must be kept low, and vice versa. Otherwise, if we push while the pop signal is also high, the value is popped again almost immediately, which is why the output looks the way it does in the logs above. It took me a long time to debug, hence the late repost.
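For anyone who wants to check the rule above outside the simulator, here is a small C sketch that derives the three stack control signals from the funct field so that push and pop can never be high together. The helper decode_stack_strobes and the type stack_strobes_t are hypothetical names for this illustration. The extra valid input, meaning "a command is actually being accepted this cycle", is an assumption of the sketch and not something taken from my fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* funct encodings used by the scratchpad in this post */
enum { FUNCT_WRITE = 0, FUNCT_READ = 1, FUNCT_PUSH = 2, FUNCT_POP = 3 };

/* The three control inputs of the stack module. */
typedef struct {
    bool en;
    bool push;
    bool pop;
} stack_strobes_t;

/* Hypothetical helper: derive the stack controls from one decoded command.
 * Gating on `valid` is an assumption of this sketch. */
static stack_strobes_t decode_stack_strobes(bool valid, uint8_t funct)
{
    stack_strobes_t s;
    s.push = valid && funct == FUNCT_PUSH;
    s.pop  = valid && funct == FUNCT_POP;
    s.en   = s.push || s.pop;
    /* The rule from the fix: the two strobes are mutually exclusive. */
    assert(!(s.push && s.pop));
    return s;
}
```

Because both strobes come from comparing the same funct value against different constants, at most one of them can be true for any command, and both stay low when no command is being accepted.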