
Why does calling dlopen() sometimes break my application by corrupting class member variables?


I am trying to load a library with dlopen(). But the call to dlopen() sometimes (not always) corrupts my class member variables, and then the app crashes with a segmentation fault.

The code below is not the exact code (it is pseudocode), but it shows what happens:

#include <cstdio>
#include <dlfcn.h>

class MyClass {
public:
   int MyVar;
   void Print() { printf("Simply breakpoint\n"); }
   // Note: the handle returned by dlopen() is discarded in this sketch.
   void LoadLibrary() { dlopen("/usr/lib/x86_64-linux-gnu/libavcodec.so.58.54.100", RTLD_LAZY); }
   MyClass() {
      MyVar = 12345;
      printf("MyVar address %p\n", (void*)&MyVar);
      Print();
      LoadLibrary();
   }
};

int main()
{
   MyClass obj;
}

I debug it with gdb in the following way:

>gdb MyApp
>break Print
>run

When it stops at the Print breakpoint, I see the printed address of the variable MyVar.

MyVar address 0x7fff900bc2bc

I can also check its contents. Then I do:

>watch *0x7fff900bc2bc
Hardware watchpoint 2: *0x7fff900bc2bc
>cont

When it continues, it stops on an unexpected write to my variable MyVar:

Thread 1 "MyApp" hit Hardware watchpoint 2: *0x7fff900bc2bc

Old value = 12345
New value = 32767
memmove () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:356
356 ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: No such file or directory.
(gdb) backtrace
#0  memmove () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:356
#1  0x00007ffff7fde759 in _dl_map_object_deps (map=map@entry=0x7fff90145110, preloads=preloads@entry=0x0, 
    npreloads=npreloads@entry=0, trace_mode=trace_mode@entry=0, open_mode=open_mode@entry=-2147483648)
    at dl-deps.c:446
#2  0x00007ffff7fe4db0 in dl_open_worker (a=a@entry=0x7fffa6fd80f0) at dl-open.c:571
#3  0x00007ffff53dd928 in __GI__dl_catch_exception (exception=<optimized out>, operate=<optimized out>, 
    args=<optimized out>) at dl-error-skeleton.c:208
#4  0x00007ffff7fe460a in _dl_open (file=0x42d8ee0 "/usr/lib/x86_64-linux-gnu/libavcodec.so.58.54.100", 
    mode=-2147483646, caller_dlopen=<optimized out>, nsid=-2, argc=2, argv=0x7fffffffea88, env=0x54037d0)
    at dl-open.c:837
#5  0x00007ffff57bc34c in dlopen_doit (a=a@entry=0x7fffa6fd8310) at dlopen.c:66
#6  0x00007ffff53dd928 in __GI__dl_catch_exception (exception=exception@entry=0x7fffa6fd82b0, 
    operate=<optimized out>, args=<optimized out>) at dl-error-skeleton.c:208
#7  0x00007ffff53dd9f3 in __GI__dl_catch_error (objname=0x7fff900d8770, errstring=0x7fff900d8778, 
    mallocedp=0x7fff900d8768, operate=<optimized out>, args=<optimized out>) at dl-error-skeleton.c:227
#8  0x00007ffff57bcb59 in _dlerror_run (operate=operate@entry=0x7ffff57bc2f0 <dlopen_doit>, 
    args=args@entry=0x7fffa6fd8310) at dlerror.c:170
#9  0x00007ffff57bc3da in __dlopen (file=<optimized out>, mode=<optimized out>) at dlopen.c:87
#10 0x000000000209ec5b in MyClass::LoadLibrary() ()
.......

From the stack backtrace I see that MyVar is overwritten during the call to dlopen(). But why? What am I doing wrong? How do I resolve this?

Unfortunately I cannot show all the source code because it is huge and involves many different components, many threads, and many 3rd-party libraries. I cannot simply link my app dynamically with libavcodec, because libavcodec is already statically linked into a 3rd-party library, and unfortunately that copy is built without the features I need (no VAAPI support). Dynamic linking causes symbol conflicts. That is why I decided to try loading libavcodec manually with dlopen() and getting all the required function pointers with dlsym().
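
The manual-loading approach described above could look roughly like the sketch below. The helper names open_library and load_symbol are hypothetical (they are not part of the original code), and the sketch assumes a glibc system where <dlfcn.h> is available; older glibc versions also need -ldl at link time. Unlike the pseudocode, it checks the dlopen() result and reports dlerror().

```cpp
#include <dlfcn.h>

#include <cassert>
#include <stdexcept>

// Hypothetical helper: opens a shared library and throws if the dynamic
// loader reports an error. RTLD_LOCAL (the default) does not make the
// library's symbols available to libraries that are loaded later.
inline void* open_library(const char* path) {
    void* handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == nullptr) {
        const char* err = dlerror();
        throw std::runtime_error(err != nullptr ? err : "dlopen failed");
    }
    return handle;
}

// Hypothetical helper: looks up `name` in `handle` and casts the result to
// the function-pointer type Fn. The caller must supply the correct type.
template <typename Fn>
Fn load_symbol(void* handle, const char* name) {
    assert(handle != nullptr);
    dlerror();  // clear any stale error state before the lookup
    void* sym = dlsym(handle, name);
    const char* err = dlerror();
    if (err != nullptr) {
        throw std::runtime_error(err);
    }
    return reinterpret_cast<Fn>(sym);
}
```

For the case in this question, one would pass the libavcodec path to open_library and then fetch each required entry point, for example load_symbol<unsigned (*)()>(handle, "avcodec_version"). Checking the results this way does not by itself explain the overwritten MyVar, but it rules out silent load failures.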


Solution

  • But why? What am I doing wrong? How do I resolve this?

    You didn't say which version of GLIBC you are using (or which distribution).

    The code in GLIBC-2.27 dl-deps.c reads:

          struct link_map **l_initfini = (struct link_map **)
            malloc ((2 * nneeded + 1) * sizeof needed[0]);
          if (l_initfini == NULL)
            _dl_signal_error (ENOMEM, map->l_name, NULL,
                      N_("cannot allocate dependency list"));
          l_initfini[0] = l;
          memcpy (&l_initfini[1], needed, nneeded * sizeof needed[0]);
          memcpy (&l_initfini[nneeded + 1], l_initfini,
              nneeded * sizeof needed[0]);                      // line 446
    

    You also didn't say whether MyClass is heap or stack allocated.

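    One way that the GLIBC code could write over your variable is if you have already corrupted the heap earlier. This is especially likely if MyClass is in fact heap-allocated, which it appears to be: the address 0x7fff900bc2bc lies in the same region as the link_map pointer map=0x7fff90145110 in frame #1, and far from main-thread stack addresses such as argv=0x7fffffffea88.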

    The fact that this "write over" happens only some of the time is also symptomatic of heap corruption.

    As the very first step, I would run the program under Valgrind and make sure that no heap corruption (heap buffer overflow, free of unallocated memory, double free, etc.) is detected before LoadLibrary() runs.
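
The heap-versus-stack question raised above can also be checked from inside the process. The sketch below is Linux-specific and assumes /proc/self/maps is readable; the function name mapping_of is hypothetical. It returns the line of /proc/self/maps that contains a given address.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: returns the /proc/self/maps line whose address range
// contains `p`, or an empty string if no mapping contains it.
inline std::string mapping_of(const void* p) {
    const auto addr = reinterpret_cast<std::uintptr_t>(p);
    std::ifstream maps("/proc/self/maps");
    std::string line;
    while (std::getline(maps, line)) {
        // Each line starts with "start-end" in hexadecimal.
        std::uintptr_t start = 0;
        std::uintptr_t end = 0;
        const int fields = std::sscanf(line.c_str(),
                                       "%" SCNxPTR "-%" SCNxPTR,
                                       &start, &end);
        if (fields == 2 && addr >= start && addr < end) {
            return line;
        }
    }
    return {};
}
```

Printing mapping_of(&MyVar) in the constructor shows which region the object lives in. A line ending in [stack] is the main-thread stack and one ending in [heap] is the brk heap. A line with no name is an anonymous mmap region, which is typically where glibc places thread arenas and thread stacks.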