I'm writing a framework to enable co-simulation of RTL running in a simulator and un-modified host software. The host software is written to control actual hardware and typically works in one of two ways:
The former case is pretty straightforward - write a library that implements the same read / write calls as the driver and link against that when running a simulation. This all works wonderfully and I can run un-modified production software as stimulus for my RTL simulations.
The second case is turning out to be far more difficult than the first...
Initially I thought I could use LD_PRELOAD to intercept the mmap call. In my implementation of mmap I'd allocate some page-aligned memory, mprotect it, and set a signal handler to trap SIGSEGV.
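A minimal sketch of the mprotect-and-trap idea, assuming Linux. The LD_PRELOAD interposition of mmap itself is left out, and the names trap_region_create, g_fault_count and g_last_fault_addr are hypothetical, chosen only for illustration. Note that mprotect is not formally listed as async-signal-safe, although calling it from a handler is common practice on Linux.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical bookkeeping for a single trapped region. */
static uint8_t *g_region;
static size_t g_region_len;
static volatile sig_atomic_t g_fault_count;
static void *volatile g_last_fault_addr;

static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    uint8_t *addr = info->si_addr;
    (void)sig;
    (void)ctx;

    if (addr < g_region || addr >= g_region + g_region_len) {
        /* Not our region: restore the default action so the retried
         * access crashes the process as usual. */
        signal(SIGSEGV, SIG_DFL);
        return;
    }

    g_fault_count++;
    g_last_fault_addr = addr;

    /* Unprotect so the faulting instruction can be restarted.
     * From here on, accesses are no longer trapped. */
    mprotect(g_region, g_region_len, PROT_READ | PROT_WRITE);
}

/* Allocate an inaccessible, page-aligned region and install the handler. */
void *trap_region_create(size_t len)
{
    struct sigaction sa;
    void *p;

    assert(len > 0);

    p = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    g_region = p;
    g_region_len = len;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return NULL;
    return p;
}
```

The first access to the returned pointer faults once and records its address; every later access goes through untrapped, which is exactly the "repeat accesses" problem.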
There are numerous problems with this approach:
Read vs Write
I can determine the address of the access from siginfo_t->si_addr but not whether the access was a read or a write.
Catching repeat accesses
In the signal handler I need to un-protect the memory region, otherwise I'll get repeated SIGSEGVs as soon as my handler exits and the host code can never continue. However, if I unprotect the region then my signal handler won't trap subsequent accesses.
Signal handler nastiness
To block in a signal handler while the simulator drives the RTL and returns a result violates all sorts of programming rules - particularly given the simulator could trigger all sorts of other events and execute arbitrary code before returning a result from this access.
I was wondering if it's possible to create a file-like object that behaves like a disk rather than using mprotect on a buffer. I haven't found any information suggesting this is feasible.
Is it possible to trap all accesses to an mmap region and how?
Assuming LD_PRELOAD and mprotect is the best route:
mprotect the region?
On x86 you can set the trap flag (TF) in the caller's context to get SIGTRAP after one instruction (this flag is typically used for single-stepping). That is, when SIGSEGV is encountered, you set TF in the caller's EFLAGS (see ucontext.h), enable reading with mprotect and return. If SIGSEGV is repeated instantly with the same IP, you enable writing (and optionally disable reading, if you want to distinguish read-modify-write from write-only access). If you get SIGSEGV from the same IP for both read-only and write-only protection, enable read-write.
Whenever you get SIGTRAP, you can analyze what value was written (if it was a write access), and you can also re-protect the page to trap future accesses.
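The single-step idea can be sketched roughly as follows, assuming x86 or x86-64 Linux with glibc (REG_EFL needs _GNU_SOURCE). For brevity this sketch enables read and write access in one step instead of probing read-only first, and all names (step_region_create, g_segv_count, g_trap_count) are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <ucontext.h>

#define X86_EFLAGS_TF 0x100 /* trap flag: raise a debug trap after one instruction */

static uint8_t *g_buf;
static size_t g_buf_len;
static volatile sig_atomic_t g_segv_count, g_trap_count, g_stepping;

static void step_on_segv(int sig, siginfo_t *info, void *ctx)
{
    ucontext_t *uc = ctx;
    uint8_t *addr = info->si_addr;
    (void)sig;

    if (addr < g_buf || addr >= g_buf + g_buf_len) {
        signal(SIGSEGV, SIG_DFL); /* not ours: let the retry crash normally */
        return;
    }
    g_segv_count++;
    g_stepping = 1;
    /* Let the faulting instruction run once ... */
    mprotect(g_buf, g_buf_len, PROT_READ | PROT_WRITE);
    /* ... and ask for SIGTRAP right after it by setting TF in the saved flags. */
    uc->uc_mcontext.gregs[REG_EFL] |= X86_EFLAGS_TF;
}

static void step_on_trap(int sig, siginfo_t *info, void *ctx)
{
    ucontext_t *uc = ctx;
    (void)sig;
    (void)info;

    /* Stop single-stepping when the interrupted code resumes. */
    uc->uc_mcontext.gregs[REG_EFL] &= ~(greg_t)X86_EFLAGS_TF;
    if (!g_stepping)
        return;
    g_stepping = 0;
    g_trap_count++;
    /* The access has completed; re-protect to trap the next one. */
    mprotect(g_buf, g_buf_len, PROT_NONE);
}

/* Allocate a protected region and install both handlers. */
void *step_region_create(size_t len)
{
    struct sigaction sa;
    void *p;

    assert(len > 0);

    p = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    g_buf = p;
    g_buf_len = len;

    memset(&sa, 0, sizeof sa);
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = step_on_segv;
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return NULL;
    sa.sa_sigaction = step_on_trap;
    if (sigaction(SIGTRAP, &sa, NULL) != 0)
        return NULL;
    return p;
}
```

With this in place every individual access to the region produces one SIGSEGV followed by one SIGTRAP, and the data written is preserved across the re-protection.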
Correction: if both reads and writes can have side-effects, try write-only protection first, then apply reading side-effects and try read-only protection, then enable writing and handle side-effects of writing in the final SIGTRAP handler.
UPDATE: I was dead wrong to recommend a hypothetical write-only protection, which turns out not to exist on most architectures. Fortunately there's a more straightforward way to know whether the operation that failed was trying to read memory, at least on x86:
A page fault exception pushes an error code onto the stack, which is available in a Linux SIGSEGV handler as the err member of the sigcontext structure. Bit 1 of the error code is 1 for write faults and 0 otherwise. For a read-modify-write operation, it will be 0 initially (here you can emulate reading, knowing exactly that it's going to happen).
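As a rough illustration of reading that error code, here is a sketch assuming x86 or x86-64 Linux with glibc, where the same value is exposed through the handler's ucontext_t as gregs[REG_ERR] (available with _GNU_SOURCE). The names classify_region_create, classify_reprotect and g_last_was_write are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <ucontext.h>

#define PF_ERR_WRITE 0x2 /* bit 1 of the x86 page-fault error code */

static uint8_t *g_page;
static size_t g_page_len;
/* -1: no fault seen yet, 0: last fault was a read, 1: last fault was a write */
static volatile sig_atomic_t g_last_was_write = -1;

static void classify_on_segv(int sig, siginfo_t *info, void *ctx)
{
    ucontext_t *uc = ctx;
    uint8_t *addr = info->si_addr;
    (void)sig;

    if (addr < g_page || addr >= g_page + g_page_len) {
        signal(SIGSEGV, SIG_DFL); /* not ours: let the retry crash normally */
        return;
    }
    /* The page-fault error code as saved by the kernel for this signal. */
    g_last_was_write =
        (uc->uc_mcontext.gregs[REG_ERR] & PF_ERR_WRITE) ? 1 : 0;

    /* Unprotect so the access can complete when the handler returns. */
    mprotect(g_page, g_page_len, PROT_READ | PROT_WRITE);
}

/* Allocate a protected region and install the classifying handler. */
void *classify_region_create(size_t len)
{
    struct sigaction sa;
    void *p;

    assert(len > 0);

    p = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    g_page = p;
    g_page_len = len;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = classify_on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return NULL;
    return p;
}

/* Re-arm the trap after an access has been handled. */
int classify_reprotect(void)
{
    return mprotect(g_page, g_page_len, PROT_NONE);
}
```

In a real framework the re-protection would more likely happen in the SIGTRAP handler after a single step rather than through an explicit call from the host code.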