I'm trying to profile some existing C code that uses large structs with many members, with the goal of refactoring it into a smaller cache-friendly core struct containing the most frequently-accessed members and a pointer to the colder data.
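As an illustration of the intended layout, a hot/cold split might look roughly like the sketch below. The struct names, members, and the 64-byte cache-line size are all hypothetical placeholders, not taken from the real code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical cold part: rarely touched members live out of line. */
struct widget_cold {
    char     name[64];
    uint64_t created_at;
    uint32_t debug_id;
};

/* Hypothetical hot part: frequently accessed members plus one pointer. */
struct widget {
    float    x, y;
    uint32_t flags;
    struct widget_cold *cold;
};

/* Assumes 64-byte cache lines; adjust for the actual target. */
_Static_assert(sizeof(struct widget) <= 64, "hot struct should fit one cache line");

/* Allocate hot and cold parts separately; returns NULL on failure. */
static struct widget *widget_new(void)
{
    struct widget *w = calloc(1, sizeof *w);
    if (w == NULL)
        return NULL;
    w->cold = calloc(1, sizeof *w->cold);
    if (w->cold == NULL) {
        free(w);
        return NULL;
    }
    return w;
}

static void widget_free(struct widget *w)
{
    if (w != NULL) {
        free(w->cold);
        free(w);
    }
}
```

Which members belong in the hot part is exactly what the access counts are meant to decide.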
I want to come up with a way of monitoring the app for a few hours in a few use-cases and produce a report of how often each member in an instance of the struct was accessed.
The x86 debug registers would be ideal, but unfortunately I can only watch 4 addresses simultaneously and I need many more.
I was thinking I could temporarily make each member occupy a whole page of its own, mark all the pages as not-accessible, then set up a segfault handler to record each access before somehow (and this is the tricky bit) recovering and allowing the app to continue. None of the memory being monitored is passed to a syscall, so there wouldn't be any issue with syscalls failing due to unreadable args. Is there a way to use the handler to temporarily make the page accessible, perform the faulting instruction, reprotect the page, then return?
Failing this, is there a more sensible way of recording accesses to many addresses? Something in valgrind maybe? Thanks
> I was thinking I could temporarily make each member occupy a whole page of its own,
This only works for heap-allocated objects, and is similar to what Electric Fence does: it places each allocation next to an inaccessible guard page. In my experience the Electric Fence overhead is so great that it is unusable for anything but toy programs.
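For reference, the unprotect / execute / reprotect cycle asked about can be sketched with `mprotect` plus the x86 trap flag. This is a minimal sketch, assuming Linux, glibc, and x86-64: the `SIGSEGV` handler records the access, makes the pages accessible, and sets the single-step flag in the saved context; the resulting `SIGTRAP` after the faulting instruction reprotects the pages. The names `region`, `hits`, `member`, and `watch_demo` are hypothetical, and each "member" is simply an `int` at the start of its own page. `mprotect` is not on the POSIX async-signal-safe list, although on Linux it is a plain system call.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <ucontext.h>
#include <unistd.h>

#define NUM_MEMBERS   3       /* hypothetical: one page per monitored member */
#define X86_TRAP_FLAG 0x100L  /* EFLAGS.TF: trap after the next instruction */

static char *region;          /* NUM_MEMBERS contiguous pages */
static size_t page_size;
static volatile unsigned long hits[NUM_MEMBERS];

/* First fault: count the access, unprotect, and request a single step. */
static void on_segv(int sig, siginfo_t *si, void *uc_)
{
    ucontext_t *uc = uc_;
    char *addr = (char *)si->si_addr;
    (void)sig;
    if (addr < region || addr >= region + NUM_MEMBERS * page_size) {
        signal(SIGSEGV, SIG_DFL);  /* a genuine crash: let it re-fault and die */
        return;
    }
    hits[(size_t)(addr - region) / page_size]++;
    mprotect(region, NUM_MEMBERS * page_size, PROT_READ | PROT_WRITE);
    uc->uc_mcontext.gregs[REG_EFL] |= X86_TRAP_FLAG;
}

/* After the faulting instruction has run: reprotect and stop stepping. */
static void on_trap(int sig, siginfo_t *si, void *uc_)
{
    ucontext_t *uc = uc_;
    (void)sig; (void)si;
    mprotect(region, NUM_MEMBERS * page_size, PROT_NONE);
    uc->uc_mcontext.gregs[REG_EFL] &= ~X86_TRAP_FLAG;
}

/* Address of the i-th monitored member (start of its page). */
static volatile int *member(int i)
{
    return (volatile int *)(region + (size_t)i * page_size);
}

/* Touches the members a few times, copies the per-member counts to out,
 * and returns the sum of the values read, or -1 on setup failure. */
static int watch_demo(unsigned long out[NUM_MEMBERS])
{
    struct sigaction sa = {0};
    int sum, i;

    page_size = (size_t)sysconf(_SC_PAGESIZE);
    region = mmap(NULL, NUM_MEMBERS * page_size, PROT_NONE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return -1;

    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;
    sa.sa_sigaction = on_segv;
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    sa.sa_sigaction = on_trap;
    if (sigaction(SIGTRAP, &sa, NULL) != 0)
        return -1;

    *member(0) = 1;                               /* one write to member 0 */
    *member(1) = 2;                               /* one write to member 1 */
    sum = *member(0) + *member(1) + *member(1);   /* three reads */

    for (i = 0; i < NUM_MEMBERS; i++)
        out[i] = hits[i];

    signal(SIGSEGV, SIG_DFL);
    signal(SIGTRAP, SIG_DFL);
    munmap(region, NUM_MEMBERS * page_size);
    return sum;
}
```

Every monitored access costs two signal deliveries and two `mprotect` calls, so expect a heavy slowdown on hot members. The sketch is also not thread-safe: while one thread single-steps, the pages are accessible to all other threads, whose accesses go uncounted.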
> Failing this, is there a more sensible way of recording accesses to many addresses? Something in valgrind maybe?
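This is possible by writing a custom Valgrind tool, but that is a substantial undertaking.
A better approach may be to write a Pin tool instead. Pin is Intel's dynamic binary instrumentation framework, and it lets you instrument every instruction that reads or writes memory.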