(Intel's forum would be the more natural place for this question, but I'm posting it here hoping for more activity than I have seen there so far.)
I'm unable to create a dynamic link library that uses the Intel Media SDK (Linux server) to manipulate H.264 video, and I noticed a problem in the design of the MFX library. The way I understand it, programs are supposed to link to a static library, like:
$ g++ .... -L/opt/intel/mediasdk/lib/lin_x64 -lmfx
However, this libmfx.a library appears to delegate all calls to a dlopen()ed dynamic library, /opt/intel/mediasdk/lib64/libmfxhw64.so. It is worth noting that the function names (and signatures) exposed by the static and dynamic libraries are identical, which is kind of confusing and dangerous.
While I don't understand the rationale behind this design, it would not be a problem by itself, except that some static/global initialization within the library apparently causes havoc when the (static) libmfx.a is included in a shared object. I.e.:
+------+     +-----------+
| main | <-- | mylib.so  |
+------+     |           |                   +---------------+
             | libmfx.a  |     (dlopen)      | libmfxhw64.so |
             |           | <---------------- |               |
             |+---------+|                   |+-------------+|
             ||MFXInit()||                   || MFXInit()   ||
             ||...      ||                   || ...         ||
             ||         ||                   ||             ||
             +===========+                   +===============+
The above library could be assembled like this:
$ g++ -shared -o mylib.so my1.o my2.o -lmfx
And then (dynamically) linked to main.o like so:
$ g++ -o main main.o mylib.so -ldl
(Note that the additional libdl is necessary to allow libmfx.a to dlopen() libmfxhw64.so.)
Unfortunately, upon the first MFXInit() call, the program crashes with a segmentation fault (at address 0x400). GDB backtrace:
#0 0x0000000000000400 in ?? ()
#1 0x00007ffff61fb4cd in MFXInit () from /opt/intel/mediasdk/lib64/libmfxhw64-p.so.1.13
#2 0x00007ffff7bd3a1f in MFX_DISP_HANDLE::LoadSelectedDLL(char const*, eMfxImplType, int, int) () from ./lib-a.so
#3 0x00007ffff7bd12b1 in MFXInit () from ./lib-a.so
#4 0x00007ffff7bd09c8 in test_mfx () at lib.c:12
#5 0x0000000000400744 in main (argc=1, argv=0x7fffffffe0d8) at main.c:8
(Observe that MFXInit() at stack frame #3 is the one in libmfx.a, whereas the one at #1 is in libmfxhw64.so.)
Note that there is no crash when mylib is created as a static library. Using breakpoints and the disassembler, I captured the following backtrace snapshot. In both cases frame #1 is at MFXInit+424, but they appear to hit different versions of MFXQueryVersion (absolute addresses are meaningless due to relocation):
#0 0x00007ffff6411980 in MFXQueryVersion () from /opt/intel/mediasdk/lib64/libmfxhw64-p.so.1.13
#1 0x00007ffff640c4cd in MFXInit () from /opt/intel/mediasdk/lib64/libmfxhw64-p.so.1.13
#2 0x000000000040484f in MFX_DISP_HANDLE::LoadSelectedDLL(char const*, eMfxImplType, int, int) ()
#3 0x00000000004020e1 in MFXInit ()
#4 0x0000000000401800 in test_mfx () at lib.c:12
#5 0x0000000000401794 in main (argc=1, argv=0x7fffffffe0e8) at main.c:8
Because both the static and the shared Intel libraries expose the same API functions, I can link directly against libmfxhw64.so, but I suppose that bypassing the static "dispatcher" comes without warranty(?)
Could someone explain Intel's idea behind this design? Specifically, why provide a static library that only delegates to an .so with an identical interface?
Also, it appears that the SEGV is caused by static/global data in either libmfx.a or libmfxhw64.so. Is there a way to force a specific execution order on dynamically loaded static/global sections? What is the best approach to debug these kinds of problems?
Tested with Intel Media SDK R2 (Ubuntu 12) and Intel Media SDK 2015R3-R5 (CentOS 7, 1.13/1.15) on an Intel Haswell i7-4790 @ 3.6 GHz.
If you have a working Intel MSDK setup, please compile my example code to confirm the issue.
(OK, since no one seems eager, I'll do the inelegant thing and post an answer to my own question).
After considerable research trying to break the unintentional circular linking, I discovered that the ld option --exclude-libs provides solace. Essentially, I was looking for a way to keep any libmfx.a symbols from being exported after they had been used to resolve the dependencies in lib.o while creating the DLL. This can be accomplished by creating the .so like this:
g++ -shared -o lib-a.so lib.o -L/opt/intel/mediasdk/lib/lin_x64 -lmfx -Wl,--exclude-libs=libmfx
Once the library is created like this, Bob's your uncle:
g++ -o main-so-a main.o lib-a.so -ldl
(Note that libdl is still needed because Intel's MFX (now inside lib-a.so) still uses dlopen to discover libmfxhw64.so.)
From the ld man page:
--exclude-libs lib,lib,...
Specifies a list of archive libraries from which symbols should not be
automatically exported. The library names may be delimited by commas or
colons. Specifying "--exclude-libs ALL" excludes symbols in all archive
libraries from automatic export. This option is available only for the
i386 PE targeted port of the linker and for ELF targeted ports. For i386
PE, symbols explicitly listed in a .def file are still exported,
regardless of this option. For ELF targeted ports, symbols affected
by this option will be treated as hidden.
So, essentially the trick is to make sure that the relevant ELF symbols are marked hidden. Normally this would be handled through #pragmas by the library developers (i.e. Intel), but due to their negligence this needs to be retrofitted in this case.
I suppose the same could have been accomplished with a --version-script map file, but that might have turned out to be more fragile since we want to fully encapsulate libmfx.a anyway.