I am working on a kernel module in C to talk to a PCIe card. I have mapped some I/O memory using pci_iomap, and I read and write it using ioread32/iowrite32.
This works but the performance is quite poor, and I read that I could do a block transfer with memcpy_toio/memcpy_fromio instead of transferring just 32 bits at a time.
To write, I am using iowrite32(buffer[i], privdata->registers + i);
To read, I do buffer[i] = ioread32(&privdata->registers[i]);
I tried to replace the for loops these are in with:
memcpy_toio(privdata->registers, buffer, 2048);
memcpy_fromio(buffer, privdata->registers, 2048);
If I replace only the write loop with memcpy_toio and keep reading with ioread32, the program doesn't crash, but the call doesn't seem to do anything (the registers don't change).
When I also replace the read loop with memcpy_fromio, it crashes.
I was thinking it might be because the reads try to access the memory location while it is still being written to. Is there a way to flush the write queue after either iowrite32 or memcpy_toio?
What am I doing wrong here?
memcpy_fromio()/memcpy_toio() can be used only if the I/O memory behaves like memory, i.e., if values can be read speculatively and can be written multiple times or out of order.
An I/O range marked as non-prefetchable does not support this.
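
For a non-prefetchable range, one option is to stay with explicit 32-bit accesses and wrap the loops from the question in small helpers. The following is only a minimal sketch under stated assumptions: regs_write32, regs_read32 and regs_flush_posted_writes are hypothetical names, not kernel APIs, and the #else branch contains user-space stand-ins for ioread32/iowrite32 so the sketch can be compiled and tried outside the kernel. The count is in 32-bit words, so the 2048 bytes passed to memcpy_toio in the question would be 512 words here. The read-back helper relies on the fact that a PCI read does not complete before earlier posted writes to the same device; it assumes the register being read has no read side effects.

```c
#ifdef __KERNEL__
#include <linux/io.h>
#include <linux/types.h>
#else
/* User-space stand-ins (assumption: only for trying the sketch
 * outside the kernel; a real driver uses the kernel definitions). */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
typedef uint32_t u32;
#define __iomem
static u32 ioread32(const void *addr)
{
	assert(((uintptr_t)addr & 3) == 0);	/* 32-bit aligned access */
	return *(const volatile u32 *)addr;
}
static void iowrite32(u32 val, void *addr)
{
	assert(((uintptr_t)addr & 3) == 0);	/* 32-bit aligned access */
	*(volatile u32 *)addr = val;
}
#endif

/* Hypothetical helper: write `words` 32-bit values to the device,
 * exactly one access per register, in ascending order. */
static void regs_write32(u32 __iomem *regs, const u32 *buf, size_t words)
{
	size_t i;

	for (i = 0; i < words; i++)
		iowrite32(buf[i], regs + i);
}

/* Hypothetical helper: read `words` 32-bit values from the device,
 * exactly one access per register, in ascending order. */
static void regs_read32(u32 *buf, const u32 __iomem *regs, size_t words)
{
	size_t i;

	for (i = 0; i < words; i++)
		buf[i] = ioread32(regs + i);
}

/* Hypothetical helper: read one register back so that earlier posted
 * writes have reached the device before execution continues.
 * Assumes reading this register has no side effects. */
static void regs_flush_posted_writes(const u32 __iomem *regs)
{
	(void)ioread32(regs);
}
```

In the driver from the question, this would be called as regs_write32(privdata->registers, buffer, 512), assuming privdata->registers is a u32 __iomem pointer as the indexing in the question suggests. Whether the BAR is prefetchable can be checked with pci_resource_flags(pdev, bar) & IORESOURCE_PREFETCH, or by looking at the region lines in the output of lspci -v.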