c, .net, gcc, static-libraries, native-aot

Correct way to link lssl and lcrypto for a static lib


I'm trying to build a static .a lib in C that computes the SHA-256 hash of a string. I will access the lib from a C# NativeAOT program. Before adding the SHA-256 code, I used the following commands to create the C lib:

gcc -c -o out.o test.c
ar rcs libcrypto.a out.o

Then for the C# program;

dotnet publish -r linux-x64 -c Release -p:PublishAot=true

After adding the sha256 code to the C file, the .NET linker started to complain about undefined references to SHA256_Init etc.

I tried several ways of linking but none seemed to work;

  1. gcc -c -o out.o test.c -l -lssl -lcrypto -nostartfiles

Same error

  2. gcc -o out.o test.c -lssl -lcrypto -nostartfiles

This generates the warning /usr/bin/ld: warning: cannot find entry symbol _start; defaulting to 0000000000001070, and the .NET linker then fails with /usr/bin/ld.bfd: cannot use executable file 'libcrypto.a(out.o)' as input to a link

Here's the sample C code

#include <stdio.h>
#include <string.h>
#include "openssl/sha.h"

void sha256_test (char* string, char output[80])
{
    unsigned char hash[SHA256_DIGEST_LENGTH];
    SHA256_CTX sha256;
    SHA256_Init(&sha256);
    SHA256_Update(&sha256, string, strlen(string));
    SHA256_Final(hash, &sha256);
    int i = 0;
    for (i = 0; i < SHA256_DIGEST_LENGTH; i++)
    {
        sprintf(output + (i * 2), "%02x", hash[i]);
    }
    output[64] = 0;
}

int TestCrypto(char* name)
{
    char output[1000];
    sha256_test(name, output);
    printf("Encyption is: %s\n", output);
    return 1;
}

What would be the correct way to link it up?

Any help would be appreciated!


Solution

  • A tangential caution first.

    This:

    ar rcs libcrypto.a out.o
    

    is something you shouldn't do. You're hijacking the name, libcrypto, of an upstream system library to create your own little static library. The well-known shared library libcrypto.so is evidently already installed on your system and maybe the static library libcrypto.a is too. Having done this, if you now go on to attempt linkages like:

    $ gcc -o prog ... -lcrypto ...
    

    and talk about them, we have to wonder whether the libcrypto being linked against is the real libcrypto or your imposter, and if we're lucky figure out the answer from contextual clues. That's what I had to do in the rest of your post. If you went on to actually put your libcrypto.a in one of the directories where the linker searches for libraries by default, s**t could hit the fan.

    So let's pretend the library you want to make is called libmycrypt.a

    Then...

    What you want to achieve is to create a static library libmycrypt.a to (the contents of) which libssl and libcrypto have been linked, so that a program can be linked against your libmycrypt.a without additional dependencies on libssl and libcrypto.

    You seem to appreciate that you can't link other libraries (either shared or static) into a static library simply because a static library isn't linked. It's not produced by the linker. As in your:

    ar rcs libcrypto.a out.o
    

    It's produced by the plain old archiver ar. It's a bunch of object files. You can input it to a linkage for the linker to extract (copy out) and link the object files it finds it needs.
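    To make "a bunch of object files" concrete, here is a minimal sketch of walking the common ar container format: an 8-byte magic string followed by 60-byte member headers, each followed by that member's data. The function name list_ar_members is made up for illustration, and the sketch assumes the whole archive has already been read into memory. It does not decode GNU or BSD long-name members, and it counts the symbol index member that ar's s modifier adds as a member like any other.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define AR_MAGIC     "!<arch>\n"
#define AR_MAGIC_LEN 8
#define AR_HDR_LEN   60   /* name[16] mtime[12] uid[6] gid[6] mode[8] size[10] fmag[2] */

/* Hypothetical helper: walk an ar archive held in memory and, if `out` is
 * not NULL, print each member's raw name field and size to it.
 * Returns the number of members seen, or -1 if the data is malformed. */
int list_ar_members(const unsigned char *buf, size_t len, FILE *out)
{
    assert(buf != NULL);

    if (len < AR_MAGIC_LEN || memcmp(buf, AR_MAGIC, AR_MAGIC_LEN) != 0)
        return -1;

    size_t pos = AR_MAGIC_LEN;
    int count = 0;

    while (pos + AR_HDR_LEN <= len) {
        const unsigned char *hdr = buf + pos;

        /* Every member header ends with the two bytes "`\n". */
        if (hdr[58] != '`' || hdr[59] != '\n')
            return -1;

        /* The name field is space padded; trim the padding. */
        char name[17];
        memcpy(name, hdr, 16);
        name[16] = '\0';
        for (int i = 15; i >= 0 && name[i] == ' '; i--)
            name[i] = '\0';

        /* The size field is a decimal number, also space padded. */
        char size_field[11];
        memcpy(size_field, hdr + 48, 10);
        size_field[10] = '\0';
        size_t size = (size_t)strtoul(size_field, NULL, 10);

        if (size > len - pos - AR_HDR_LEN)
            return -1;   /* member data would run past the end of the buffer */

        if (out != NULL)
            fprintf(out, "%-16s %zu bytes\n", name, size);
        count++;

        /* Member data is padded to an even length. */
        pos += AR_HDR_LEN + size + (size & 1);
    }
    return count;
}
```

    Nothing in that format records linkage: there is a name, a size and the raw bytes of each object file, which is why there is nowhere for -lssl or -lcrypto to be "linked into" the archive.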

    You therefore attempted to create a sort of object file, out.o, compiled from your test.c, to which libssl and libcrypto have been linked, which you can archive into libmycrypt.a. You tried a couple of ways that don't work.

    Way #1

    $ gcc -c -o out.o test.c -l -lssl -lcrypto -nostartfiles
    

    (I presume the first -l is a typo. With no argument it has no effect)

    The gcc option -c means just compile, don't link. So all of the (meaningful) linkage options:

    -lssl -lcrypto -nostartfiles
    

    are ignored and no linkage is done. You just compiled an object file out.o as per:

    $ gcc -c -o out.o test.c
    

    Way #2

    $ gcc -o out.o test.c -lssl -lcrypto -nostartfiles
    

    In the absence of the option -c, or the option -shared ( = output a shared library), or the option -r ( = output a relocatable, partially linked object file), the linker will by default output a program. So this command attempts to link a program called (inappropriately) out.o that is dynamically linked with libssl and libcrypto (because dynamic linkage is the default), and that has no start files.

    The start files are the files belonging to the GCC toolchain that gcc by default links into a program to enable it to start up. They provide the _start symbol where execution of a program starts, and the code to do things thereafter, such as static initialization, that must be done before main is called. A program can't run in the absence of this stuff unless you have expertly avoided or replaced it with other stuff you supply to the linkage. That's why the linkage warned you:

    warning: cannot find entry symbol _start; defaulting to 0000000000001070
    

    And the .net linkage failed with:

    /usr/bin/ld.bfd: cannot use executable file 'libcrypto.a(out.o)' as input to a link
    

    because the archive member libcrypto.a(out.o) is an executable (program), albeit a broken one, not an object file, and you cannot statically (or dynamically) link a program into a program.

    The successful way

    We need to note at this point that your test.c code as-is makes references to symbols defined in libcrypto, but not to any in libssl. You call functions that are declared in openssl/sha.h but they are defined in libcrypto. As-is, you have no dependency on libssl. Perhaps you will add libssl-dependent code later.

    Here is the way to create out.o in such a way that all definitions it requires from libcrypto have been statically linked into it:

    $ cat test.c
    #include <stdio.h>
    #include <string.h>
    #include "openssl/sha.h"
    
    void sha256_test (const char* string, char output[80])
    {
        unsigned char hash[SHA256_DIGEST_LENGTH];
        SHA256_CTX sha256;
        SHA256_Init(&sha256);
        SHA256_Update(&sha256, string, strlen(string));
        SHA256_Final(hash, &sha256);
        int i = 0;
        for (i = 0; i < SHA256_DIGEST_LENGTH; i++)
        {
            sprintf(output + (i * 2), "%02x", hash[i]);
        }
        output[64] = 0;
    }
    
    void TestCrypto(const char* name)
    {
        char output[1000];
        sha256_test(name, output);
        printf("Encyption is: %s\n", output);
    }
    

    (I've changed the char* string and char* name parameters to const char*, and I've made TestCrypto a void function rather than returning 1 no matter what.)

    Compile that to test.o:

    $ gcc -Wall -Wextra -pedantic -c -o test.o test.c
    test.c: In function ‘sha256_test’:
    test.c:9:5: warning: ‘SHA256_Init’ is deprecated: Since OpenSSL 3.0 [-Wdeprecated-declarations]
        9 |     SHA256_Init(&sha256);
          |     ^~~~~~~~~~~
    In file included from test.c:3:
    /usr/include/openssl/sha.h:73:27: note: declared here
       73 | OSSL_DEPRECATEDIN_3_0 int SHA256_Init(SHA256_CTX *c);
          |                           ^~~~~~~~~~~
    test.c:10:5: warning: ‘SHA256_Update’ is deprecated: Since OpenSSL 3.0 [-Wdeprecated-declarations]
       10 |     SHA256_Update(&sha256, string, strlen(string));
          |     ^~~~~~~~~~~~~
    /usr/include/openssl/sha.h:74:27: note: declared here
       74 | OSSL_DEPRECATEDIN_3_0 int SHA256_Update(SHA256_CTX *c,
          |                           ^~~~~~~~~~~~~
    test.c:11:5: warning: ‘SHA256_Final’ is deprecated: Since OpenSSL 3.0 [-Wdeprecated-declarations]
       11 |     SHA256_Final(hash, &sha256);
          |     ^~~~~~~~~~~~
    /usr/include/openssl/sha.h:76:27: note: declared here
       76 | OSSL_DEPRECATEDIN_3_0 int SHA256_Final(unsigned char *md, SHA256_CTX *c);
      
    

    (Note the deprecation warnings. Refer to OPENSSL_API_COMPAT about suppressing them.)

    Then partially link as follows:

    $ gcc -r -o out.o test.o -lcrypto
    

    What does that do?

    The -r option, which gcc passes straight through to the linker, asks for a single relocatable object file in which symbol references have been statically resolved, as far as possible, to definitions provided by the input object files and input libraries. Only static versions of the libraries are searched for, so -lcrypto is resolved to libcrypto.a. No shared libraries are used, and the output file is a static object in which undefined symbols that could not be statically resolved may survive without causing the linkage to fail.

    So when we've done that we see:

    $ nm -D out.o
    nm: out.o: no symbols
    

    out.o has no dynamic dependencies. And:

    $ nm out.o | grep SHA256
    0000000000000430 T SHA256_Final
    00000000000001d0 T SHA256_Init
    0000000000000420 T SHA256_Transform
    0000000000000220 T SHA256_Update
    

    all the referenced symbols from openssl/sha.h are statically defined.

    out.o still contains undefined references:

    $ nm --undefined out.o
                     U getenv
                     U _GLOBAL_OFFSET_TABLE_
                     U memcpy
                     U printf
                     U sprintf
                     U __stack_chk_fail
                     U strlen
                     
    

    which are references into libc, the C runtime (apart from _GLOBAL_OFFSET_TABLE_, which the linker itself defines). They'll be resolved when a program is linked against the library, because libc is linked into programs by default.

    Now we can put out.o into a static library:

    $ ar rcs libmycrypt.a out.o
    

    although there is little point in putting this single object file into a static library, as opposed to just calling the object file mycrypt.o and linking it in programs that need it.

    Then we can have a header file for the library:

    $ cat mycrypt.h 
    #pragma once
    
    extern void sha256_test (const char* string, char output[80]);
    extern void TestCrypto(const char* name);
    

    (This assumes that sha256_test was meant to be a public function, rather than static).

    And we can link a program against the library:

    $ cat main.c
    #include <mycrypt.h>
    
    int main(void)
    {
        TestCrypto("There's no business like show business");
        return 0;
    }
    
    $ gcc -I. -c main.c
    $ gcc -o prog main.o -L. -lmycrypt
    

    with no -lcrypto dependency. And it runs:

    $ ./prog
    Encyption is: 0f08b3f4b994688abb72c68882ded7a7e0395b3203ca55a04256c5f3cd552fef
    

    So that's how you do it, but should you?

    I suggest not.

    Partial linking (gcc -r ...) is not intended for this purpose. It's intended to let builders who would otherwise face enormous monolithic linkages break them down into more manageable partially linked parts, which can ultimately be combined in a small linkage.

    Partial linking is not what people do when they build a library that makes external references into libssl and/or libcrypto, or any other well-known, publicly distributed upstream library. They just provide their library and with it provide the information to its users that their library introduces dependencies on libssl and libcrypto, which anybody can easily install if they haven't already.

    Builders of well-known libraries for packaged distribution through a package manager will provide for installation with their library a pkg-config file or a CMake-Packages package configuration file that a downstream build system can query programmatically to extract the compilation and linkage options it needs to use the library, including the additional dependencies that it imposes. But you're probably not going to package your library. So if you just clearly document the fact that libmycrypt.a requires appending -lcrypto to the linkage, that is your due diligence done. If you add any code to libmycrypt.a that makes it depend on libssl, then that's another dependency to document.

    It is fine for libraries to introduce dependencies on other libraries. If you're on a Linux/unix-like system it's likely that it has hundreds or thousands of installed libraries and the only one that doesn't carry dependencies on other libraries is libc. The point of libraries is that they build on other libraries without needing to contain contents of the ones they build on, duplicating the same contents explosively.