In glibc, the function _IO_new_fopen() is called by the fopen() library call. If I am running the following code, is there any way to intercept _IO_new_fopen() whenever it is called and print out the values of its parameters?
For kernel functions, this can be achieved with Jprobe, and I am looking for a similar mechanism for functions in glibc. LD_PRELOAD is a related mechanism that lets us replace a glibc function with our own function, but it does not help me achieve my goal.
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

int main(void){
    char buff[100];
    int i, r1;
    FILE *f1 = fopen("text.txt", "wb");
    for(i = 0; i < 100; i++){
        buff[i] = 'b';
    }
    assert(f1);
    r1 = fwrite(buff, 1, 100, f1);
    printf("wrote %d elements\n", r1);
    fclose(f1);
}
In glibc, fopen is a [public] symbol alias to the [local symbol] _IO_new_fopen (i.e. they are identical), so, technically speaking, fopen doesn't call _IO_new_fopen; it is _IO_new_fopen.
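To illustrate what "alias" means here, this is a small sketch using the GCC/Clang alias attribute on an ELF target. The names demo_new_open and demo_open are made up for the illustration; glibc's own source sets up its aliases through internal macros rather than writing the attribute out like this.

```c
/* The "local" implementation (hypothetical name). */
int demo_new_open(int x)
{
    return x + 1;
}

/* A second, public name bound to the same code.
 * No wrapper function and no extra call is involved. */
int demo_open(int x) __attribute__((alias("demo_new_open")));
```

Calling either name runs the same code, and taking the address of either name should give the same pointer, which is why there is no separate "inner" call to hook.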
If you are interested in intercepting the call as shown in your example (i.e. from the app), then using LD_PRELOAD [along with dlopen, dlsym, etc.] and defining your own fopen will work. I've done this before in my own code.
You may just be missing an entry point (i.e. you may need to define/intercept multiple symbols): fopen, fopen64, _IO_file_fopen, _IO_fopen, etc. If you run readelf -s on your executable, the simple fopen may show up as something else. I'm guessing that you'll see fopen64. Also, you may need to account for symbol versioning.
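A minimal sketch of such an fopen interceptor, assuming a glibc system with GCC or Clang. The counter fopen_call_count, the log format, and the file names in the build comment are my own placeholders, not anything glibc defines. A wrapper for fopen64 would have the same shape.

```c
#define _GNU_SOURCE          /* needed for RTLD_NEXT on glibc */
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical build and run steps (file names are placeholders):
 *   gcc -shared -fPIC -o libspy.so spy.c -ldl
 *   LD_PRELOAD=./libspy.so ./your_program
 */

typedef FILE *(*fopen_fn)(const char *, const char *);

/* Number of intercepted calls; illustrative only. */
static unsigned long fopen_call_count;

/* fopen -- log the arguments, then forward to the real fopen */
FILE *fopen(const char *path, const char *mode)
{
    static fopen_fn real_fopen;
    FILE *fp;

    /* Find the next definition of fopen in the lookup order (glibc's). */
    if (real_fopen == NULL) {
        real_fopen = (fopen_fn)dlsym(RTLD_NEXT, "fopen");
        if (real_fopen == NULL)
            abort();
    }

    fopen_call_count++;
    fp = real_fopen(path, mode);
    fprintf(stderr, "fopen(\"%s\", \"%s\") = %p\n", path, mode, (void *)fp);
    return fp;
}
```

The log goes to stderr so that it does not mix with whatever the program writes to stdout.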
You won't be able to intercept internal calls to fopen within glibc because they are bound inside libc itself and bypass the dynamic symbol lookup that LD_PRELOAD relies on. But intercepting those usually isn't so useful, so more details on your part might be required.
You can also look at fopencookie as a way to intercept the underlying read(2), write(2), etc. syscalls.
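As a rough sketch of the fopencookie route, the following wraps an existing stream so that every flush passes through a logging hook. fopencookie is a GNU extension; struct spy_cookie, spy_write, and spy_wrap are names I made up for this example. Note that the buffer the hook sees is typically stdio's internal buffer, not the pointer the caller handed to fwrite.

```c
#define _GNU_SOURCE          /* fopencookie is a GNU extension */
#include <stdio.h>
#include <sys/types.h>

/* Cookie state: the real stream plus a running byte count (illustrative). */
struct spy_cookie {
    FILE  *inner;
    size_t bytes_written;
};

/* Called by stdio whenever the wrapper stream hands data downstream. */
static ssize_t spy_write(void *cookie, const char *buf, size_t size)
{
    struct spy_cookie *c = cookie;
    size_t n;

    fprintf(stderr, "spy_write: buf=%p size=%zu\n", (const void *)buf, size);
    n = fwrite(buf, 1, size, c->inner);
    c->bytes_written += n;
    return (ssize_t)n;       /* bytes consumed; 0 signals an error */
}

/* Return a write-only stream that logs and forwards to inner. */
FILE *spy_wrap(struct spy_cookie *c, FILE *inner)
{
    cookie_io_functions_t io = {
        .read  = NULL,
        .write = spy_write,
        .seek  = NULL,
        .close = NULL,       /* closing the wrapper leaves inner open */
    };

    c->inner = inner;
    c->bytes_written = 0;
    return fopencookie(c, "w", io);
}
```

This only helps for streams you create yourself, so it complements rather than replaces the LD_PRELOAD approach.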
UPDATE:
Specifically, I am trying to print the address and the content of the buffer that is used by fread()/fwrite().
Easy to do. Details below ...
I think it could be done by adding "printk()" somewhere and rebuilding libc.so.6, but is there a more convenient approach (i.e. without rebuilding libc.so.6)?
No need to rebuild libc; LD_PRELOAD will handle it. Remember that printk is in the kernel. No need to go there [and it probably wouldn't work].
The address you want is what's passed to fread and not the buffer that's passed to read(2). The buffer passed to read(2) is within the FILE struct, so it wouldn't tell you much. Otherwise, you could just run the whole program under strace(1) [or write your own custom version that uses ptrace(2)].
You can intercept, trace, breakpoint on, etc. an arbitrary number of functions. You just need to create your own shared library (.so) and set LD_PRELOAD to it.
Here is a sample interceptor function [for fread]:
#define _GNU_SOURCE     // needed for RTLD_NEXT
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

// trapme -- put gdb breakpoint on this function
__attribute__((__noinline__)) void
trapme(void)
{
    // prevent the function call to this from being optimized away
    __asm__ __volatile__("" ::: "memory");
}

// dumpme -- dump out a buffer
void
dumpme(const void *buf,size_t xlen)
{
    // dump data in whatever format you'd like ...
}

// fread -- intercept fread calls
size_t
fread(void *ptr, size_t size, size_t nmemb, FILE *stream)
{
    static size_t
    (*fread_real)(void *ptr,size_t size,size_t nmemb,FILE *stream) = NULL;
    size_t xlen;

    printf("fread_fake: ENTER ptr=%p size=%zX nmemb=%zu stream=%p\n",
        ptr,size,nmemb,(void *) stream);

    // do trap on some suspicious activity ...
#if 0
    if (ptr == ...)
        trapme();
#endif

    // locate the real symbol in glibc
    if (fread_real == NULL)
        fread_real = dlsym(RTLD_NEXT,"fread");

    // abort if fread_real is still null
    if (fread_real == NULL)
        abort();

    xlen = fread_real(ptr,size,nmemb,stream);

    // dump out the data
    if (xlen > 0)
        dumpme(ptr,xlen * size);

    // do trap on some suspicious activity ...
#if 0
    if (xlen == 372)
        trapme();
#endif

    printf("fread_fake: EXIT xlen=%zu\n",xlen);

    return xlen;
}
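The dumpme stub is left for you to fill in. One possible body is a classic hex-plus-ASCII dump, sketched here as a separate helper. The name dump_hex, the extra FILE * parameter, and the row layout are my own choices for the illustration.

```c
#include <ctype.h>
#include <stdio.h>

/* dump_hex -- write len bytes of buf to out, 16 bytes per row:
 * an 8-digit hex offset, the bytes in hex, then a printable-ASCII column. */
void dump_hex(FILE *out, const void *buf, size_t len)
{
    const unsigned char *p = buf;

    for (size_t off = 0; off < len; off += 16) {
        size_t n = (len - off < 16) ? len - off : 16;

        fprintf(out, "%08zx ", off);
        for (size_t i = 0; i < 16; i++) {
            if (i < n)
                fprintf(out, " %02x", p[off + i]);
            else
                fputs("   ", out);      /* pad a short final row */
        }

        fputs("  |", out);
        for (size_t i = 0; i < n; i++)
            fputc(isprint(p[off + i]) ? p[off + i] : '.', out);
        fputs("|\n", out);
    }
}
```

Inside the fread interceptor, dumpme could then simply call dump_hex(stderr, buf, xlen).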
That's the basic mechanism. You can add whatever [nefarious] things you desire. You can get fancy and add some logic that traps if a buffer is in some [bad] range or has funny contents. That is, similar to a cond statement for a gdb breakpoint. So, you can use this to trigger and drop into gdb using much more complex tests than you could with gdb alone [to find really hard-to-find bugs].
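For example, a range test of that kind might look like the sketch below. The watch window variables, buffer_in_watch_range, check_buffer, and trap_here are hypothetical names; you would set the window from whatever address range you suspect is being clobbered.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical watch window [watch_lo, watch_hi); set it to the suspect range. */
static uintptr_t watch_lo;
static uintptr_t watch_hi;

/* Put a gdb breakpoint on this function. */
__attribute__((__noinline__)) void trap_here(void)
{
    /* Keep calls to this function from being optimized away. */
    __asm__ __volatile__("" ::: "memory");
}

/* Return 1 if [buf, buf+len) overlaps the watch window, else 0. */
static int buffer_in_watch_range(const void *buf, size_t len)
{
    uintptr_t start = (uintptr_t)buf;
    uintptr_t end = start + len;

    return len > 0 && start < watch_hi && end > watch_lo;
}

/* Call this from an interceptor with the caller's buffer and byte count. */
void check_buffer(const void *buf, size_t len)
{
    if (buffer_in_watch_range(buf, len))
        trap_here();
}
```

With a breakpoint on trap_here, gdb stops only when the condition holds, and the interceptor's frame is one level up the stack.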
You can also monitor fopen and remember the filename, etc.
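One simple way to remember filenames is a small fixed-size table keyed by the FILE pointer, sketched below. The function names, the table limits, and the lack of locking are all simplifying assumptions of mine; a multithreaded program would need a mutex around the table.

```c
#include <stdio.h>
#include <string.h>

#define MAX_TRACKED 64      /* arbitrary limits for this sketch */
#define MAX_NAME    256

struct tracked_stream {
    FILE *fp;
    char  name[MAX_NAME];
};

static struct tracked_stream tracked[MAX_TRACKED];

/* Record the path used to open fp; call from an fopen interceptor.
 * Returns 0 on success, -1 on bad arguments or a full table. */
int remember_stream(FILE *fp, const char *path)
{
    if (fp == NULL || path == NULL)
        return -1;

    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (tracked[i].fp == NULL) {
            size_t n = strlen(path);

            if (n >= MAX_NAME)
                n = MAX_NAME - 1;       /* truncate long paths */
            memcpy(tracked[i].name, path, n);
            tracked[i].name[n] = '\0';
            tracked[i].fp = fp;
            return 0;
        }
    }
    return -1;
}

/* Return the recorded name, or NULL if fp was never recorded. */
const char *stream_name(FILE *fp)
{
    for (size_t i = 0; fp != NULL && i < MAX_TRACKED; i++) {
        if (tracked[i].fp == fp)
            return tracked[i].name;
    }
    return NULL;
}

/* Free the slot for fp; call from an fclose interceptor. */
void forget_stream(FILE *fp)
{
    for (size_t i = 0; fp != NULL && i < MAX_TRACKED; i++) {
        if (tracked[i].fp == fp)
            tracked[i].fp = NULL;
    }
}
```

The fread and fwrite interceptors can then print stream_name(stream) next to the buffer address, which makes the log much easier to read.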
Getting this mechanism working isn't too difficult, but my suggestion would be to write the interceptor for fopen and get familiar with the dlsym trick first (vs. trying to debug it within a loop).
You can create as many of these functions as you need. Start simply (e.g. fopen, fread, fwrite). Then, add more as you find a "gap" in the coverage (e.g. eventually, you'll probably find that intercepting fseek gives you needed information).
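Since the question is really about fwrite, here is a sketch of that interceptor too, assuming glibc and GCC or Clang. One wrinkle: GCC can rewrite some fputs/fprintf calls into fwrite calls, so logging through stdio from inside an fwrite wrapper risks re-entering the wrapper. This version formats into a local buffer and logs with write(2) instead. The counter fwrite_call_count is illustrative.

```c
#define _GNU_SOURCE          /* needed for RTLD_NEXT on glibc */
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

typedef size_t (*fwrite_fn)(const void *, size_t, size_t, FILE *);

/* Number of intercepted calls; illustrative only. */
static unsigned long fwrite_call_count;

/* fwrite -- log the caller's buffer address and sizes, then forward */
size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream)
{
    static fwrite_fn real_fwrite;
    char msg[128];
    int len;

    if (real_fwrite == NULL) {
        real_fwrite = (fwrite_fn)dlsym(RTLD_NEXT, "fwrite");
        if (real_fwrite == NULL)
            abort();
    }

    fwrite_call_count++;

    /* Format into a local buffer and emit with write(2), so the log
     * itself never passes through stdio. */
    len = snprintf(msg, sizeof msg,
                   "fwrite: ptr=%p size=%zu nmemb=%zu stream=%p\n",
                   ptr, size, nmemb, (void *)stream);
    if (len > 0) {
        size_t out = (size_t)len < sizeof msg ? (size_t)len : sizeof msg - 1;

        if (write(STDERR_FILENO, msg, out) < 0) {
            /* ignore logging failures */
        }
    }

    return real_fwrite(ptr, size, nmemb, stream);
}
```

To print the buffer contents as well, dump size * nmemb bytes starting at ptr before forwarding the call, again using write(2) rather than stdio.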
UPDATE #2:
Here is a sample strace script with the options that I like to use:
strace -ttt -i -f -etrace=all -o \
/home/me/log/foobar.spysys \
-eread=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 \
-ewrite=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 \
-x foobar
The -eread and -ewrite lists can be extended to cover as many file descriptors as you need.