Tags: c#, arrays, c, struct, pinvoke

How should I pass an array of strings to a C library using P/Invoke?


I'm trying to P/Invoke from a C# application into a C library, sending over a struct that contains an array of strings. I do have control over the C library, and can change things if needed.

The data only flows one way, from C# to C: I don't need to observe modifications made to the struct on the C side. (I'm also passing it by value instead of by reference, though I might change that later; first I'm trying to solve the immediate problem.)

My C struct looks like this at the moment:

// C
struct MyArgs {
    int32_t someArg;
    char** filesToProcess;
    int32_t filesToProcessLength;
};

In C#, I've replicated the struct as such:

// C#
public struct MyArgs
{
    public int someArg;

    [MarshalAs(UnmanagedType.ByValArray, ArraySubType = UnmanagedType.LPStr)]
    public string[] filesToProcess;

    public int filesToProcessLength;
}

And then pass it to the library:

// C#
[DllImport("myLib.so", EntryPoint = "myFunction", CallingConvention = CallingConvention.Cdecl)]
internal static extern bool myFunction(MyArgs args);

var myArgs = new MyArgs {
    someArg = 10,
    filesToProcess = new string[] { "one", "two", "three" }
};
myArgs.filesToProcessLength = myArgs.filesToProcess.Length;

Console.WriteLine(myFunction(myArgs));

Where I'm trying to consume it:

// C
bool myFunction(struct MyArgs args) {
    printf("Files to Process: %i\n", args.filesToProcessLength);
    for (int i = 0; i < args.filesToProcessLength; i++) {
        char* str = args.filesToProcess[i];
        printf("\t%i. %s\n", i, str);
    }
    return true;
}

This basically crashes the app. I get an output that says Files to Process: 3 but then the app just stops. If I change the for-loop to not try to access the string, it counts through the loop - so it seems I get some sort of access violation.

If I change my code to accept an array as part of the function arguments, it works:

// C
bool myFunction(struct MyArgs args, char** filesToProcess, int32_t filesToProcessLength) {
    printf("Files to Process: %i\n", filesToProcessLength);
    for (int i = 0; i < filesToProcessLength; i++) {
        char* str = filesToProcess[i];
        printf("\t%i. %s\n", i, str);
    }
    return true;
}


// C#
[DllImport("myLib.so", EntryPoint = "myFunction", CallingConvention = CallingConvention.Cdecl)]
internal static extern bool myFunction(MyArgs args, [MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.LPStr)] string[] filesToProcess, int filesToProcessLength);

My initial thought was that ByValArray inside a struct might be marshaled as a pointer to a string array (so essentially a char***?). But even if I change the field type to char*** in the struct and do a char** strArray = *args.filesToProcess, I get the same crash.

As I am mostly a C# developer with some C knowledge, I am a bit at a loss here. What would be the best way to P/Invoke a collection of strings into a C library within a struct? As said, I can change the C library however I want, just prefer to keep it in a struct instead of adding a function argument.

If it matters, this is on Linux, using gcc 9.3.0, and this is just plain C, not C++.

Updates:

  • sizeof(args) is 24
  • Taking the addresses:
    • &args = ...560
    • &args.someArg = ...560
    • &args.filesToProcess = ...568
    • &args.filesToProcessLength = ...576
  • So args.filesToProcess is a single pointer to something - will try to dig to see what it's pointing to
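The numbers above can be reproduced on the C side alone. This is a minimal sketch, assuming a typical 64-bit Linux build; `print_my_args_layout` is a hypothetical helper name that does not come from the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct MyArgs {
    int32_t someArg;
    char** filesToProcess;
    int32_t filesToProcessLength;
};

/* Hypothetical helper: prints the size of struct MyArgs and its field offsets. */
void print_my_args_layout(void) {
    /* The C standard guarantees that the first member sits at offset 0. */
    assert(offsetof(struct MyArgs, someArg) == 0);

    printf("sizeof(struct MyArgs)  = %zu\n", sizeof(struct MyArgs));
    printf("someArg at              %zu\n", offsetof(struct MyArgs, someArg));
    printf("filesToProcess at       %zu\n", offsetof(struct MyArgs, filesToProcess));
    printf("filesToProcessLength at %zu\n", offsetof(struct MyArgs, filesToProcessLength));
}
```

With 8-byte pointers the compiler pads `someArg` to 8 bytes so the pointer is aligned, and pads the tail so the whole struct is a multiple of 8. That gives 4 + 4 + 8 + 4 + 4 = 24 bytes and matches the ...560 / ...568 / ...576 addresses, so the field really is a single pointer-sized slot.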

Update 2: Looking at a memory dump taken with this code, it seems that the C# side isn't sending the array the way I want it. I assume ByValArray is the problem here.

  0000  6f 6e 65 00 00 00 00 00 00 00 00 00 00 00 00 00  one.............
  0010  50 44 fc 00 00 00 00 00 61 00 00 00 00 00 00 00  PD......a.......
  0020  53 00 79 00 73 00 74 00 65 00 6d 00 2e 00 53 00  S.y.s.t.e.m...S.
  0030  65 00 63 00 75 00 72 00 69 00 74 00 79 00 2e 00  e.c.u.r.i.t.y...
  0040  43 00 72 00 79 00 70 00 74 00 6f 00 67 00 72 00  C.r.y.p.t.o.g.r.
  0050  61 00 70 00 68 00 79 00 2e 00 4f 00 70 00 65 00  a.p.h.y...O.p.e.
  0060  6e 00 53 00 73 00 6c 00 00 00 98 1f dc 7f 00 00  n.S.s.l.........

So I'm getting the first array element, but after that it's just random garbage (it changes with every run) - so the C side is tentatively OK, but the C# side isn't.
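One possible reading of this dump, which is not confirmed anywhere in the question, is that the field holds a plain char* to "one" rather than a char**. Under that assumption, the sketch below shows what `args.filesToProcess[0]` would then evaluate to; `first_bytes_as_address` is a hypothetical helper written only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads the first bytes of a buffer the way the CPU would
   if the buffer were (wrongly) treated as an array of pointers, element 0. */
uintptr_t first_bytes_as_address(const char *buffer, size_t buffer_size) {
    assert(buffer != NULL);
    uintptr_t address = 0;
    /* Copy at most one pointer's worth of bytes; never read past the buffer. */
    size_t count = buffer_size < sizeof address ? buffer_size : sizeof address;
    memcpy(&address, buffer, count);
    return address;
}
```

Fed the first row of the dump (6f 6e 65 00 followed by zeros), this returns 0x656e6f on a little-endian machine. Handing that value to printf as a %s argument would very likely fault, which would be consistent with the crash right after the "Files to Process: 3" line.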

Update 3: I've experimented a bit more and changed the C# side from a string array to an IntPtr obtained from Marshal.UnsafeAddrOfPinnedArrayElement(filesToProcess, 0). On the C side, I now see the contents of the C# array, though of course as raw managed strings (what looks like an object header and a length in front of each one, and UTF-16 instead of the encoding I need). At least it shows that this is indeed a marshaling problem on the C# side.

  0000  90 0f 53 f7 27 7f 00 00 03 00 00 00 6f 00 6e 00  ..S.'.......o.n.
  0010  65 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  e...............
  0020  90 0f 53 f7 27 7f 00 00 03 00 00 00 74 00 77 00  ..S.'.......t.w.
  0030  6f 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  o...............
  0040  90 0f 53 f7 27 7f 00 00 05 00 00 00 74 00 68 00  ..S.'.......t.h.
  0050  72 00 65 00 65 00 00 00 00 00 00 00 00 00 00 00  r.e.e...........
  0060  90 0f 53 f7 27 7f 00 00 27 00 00 00 20 00 4d 00  ..S.'...'... .M.

I get the key issue: if I wanted to pass an array by value, the struct size would be dynamic per call, and that's probably a problem. But passing a ByValArray doesn't seem to be correct either. I would probably need to use a fixed-size array or an IntPtr to an array, or forgo the struct field and pass the array as a function argument.
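For the fixed-size option, the C side could look roughly like the sketch below. `MAX_FILES`, `MyArgsFixed` and `printFilesFixed` are hypothetical names. On the C# side this would presumably pair with a ByValArray field that sets SizeConst to the same value, with the managed array padded to that length; that pairing is an assumption and is not tested here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_FILES 8  /* hypothetical cap; would have to match SizeConst in C# */

struct MyArgsFixed {
    int32_t someArg;
    char* filesToProcess[MAX_FILES];  /* pointers stored inline in the struct */
    int32_t filesToProcessLength;
};

/* Hypothetical variant of myFunction; returns the number of entries printed. */
int32_t printFilesFixed(const struct MyArgsFixed* args) {
    assert(args != NULL);
    int32_t count = args->filesToProcessLength;
    if (count < 0) count = 0;
    if (count > MAX_FILES) count = MAX_FILES;  /* never read past the inline array */
    for (int32_t i = 0; i < count; i++) {
        const char* str = args->filesToProcess[i];
        printf("\t%i. %s\n", i, str != NULL ? str : "(null)");
    }
    return count;
}
```

The trade-off is that every call pays for `MAX_FILES` pointer slots and longer lists cannot be passed at all, which is why a pointer to a separately allocated array is usually the more flexible choice.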

But as always, if someone has a better plan, I'm all ears :)


Solution

  • You can't use a variable-size array in a struct. You have to marshal the whole thing manually, or use function arguments, which is much easier, especially when the data only flows from C# to C.

    If you want to use a struct for some reason, then you can do it like this:

    C side (I'm using Windows, you may have to adapt):

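    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    struct MyArgs {
        int32_t someArg;
        char** filesToProcess;
        int32_t filesToProcessLength;
    };
    
    // I pass the struct by reference, not by value, but this is not relevant
    // I also use __stdcall, which is quite standard on Windows
    // extern "C" is only valid when compiling as C++, so it is guarded here
    #ifdef __cplusplus
    extern "C" {
    #endif
        __declspec(dllexport) bool __stdcall myFunction(struct MyArgs* pargs) {
            printf("Files to Process: %i\n", pargs->filesToProcessLength);
            for (int i = 0; i < pargs->filesToProcessLength; i++) {
                // each element is a pointer to a NUL-terminated ANSI string
                char* str = pargs->filesToProcess[i];
                printf("\t%i. %s\n", i, str);
            }
            return true;
        }
    #ifdef __cplusplus
    }
    #endif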
    

    C# side:

    static void Main(string[] args)
    {
        var files = new List<string>();
        files.Add("hello");
        files.Add("world!");
    
        var elementSize = IntPtr.Size;
        var my = new MyArgs();
        my.filesToProcessLength = files.Count;
    
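        // allocate the array of pointers (one slot per file)
        my.filesToProcess = Marshal.AllocCoTaskMem(files.Count * elementSize);
        try
        {
            // AllocCoTaskMem does not zero the memory; clear every slot first so the
            // finally block never frees an uninitialized pointer if a later step throws
            for (var i = 0; i < files.Count; i++)
            {
                Marshal.WriteIntPtr(my.filesToProcess + elementSize * i, IntPtr.Zero);
            }
    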
            for (var i = 0; i < files.Count; i++)
            {
                // allocate each file
                // I use Ansi as you do although Unicode would be better (at least on Windows)
                var filePtr = Marshal.StringToCoTaskMemAnsi(files[i]);
    
                // write the file pointer to the array
                Marshal.WriteIntPtr(my.filesToProcess + elementSize * i, filePtr);
            }
    
            myFunction(ref my);
        }
        finally
        {
            // free each file pointer
            for (var i = 0; i < files.Count; i++)
            {
                var filePtr = Marshal.ReadIntPtr(my.filesToProcess + elementSize * i);
                Marshal.FreeCoTaskMem(filePtr);
            }
            // free the array
            Marshal.FreeCoTaskMem(my.filesToProcess);
        }
    }
    
    [StructLayout(LayoutKind.Sequential)]
    struct MyArgs
    {
        public int someArg;
        public IntPtr filesToProcess;
        public int filesToProcessLength;
    };
    
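    // stdcall is the default calling convention on Windows
    // the native bool is 1 byte, while P/Invoke assumes a 4-byte BOOL by default
    [DllImport("MyProject.dll")]
    [return: MarshalAs(UnmanagedType.I1)]
    static extern bool myFunction(ref MyArgs args);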
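Since the question targets Linux and gcc, here is a plain-C sketch of the same idea without the Windows-specific keywords. It also builds the struct in C, mirroring what the C# code above does by hand: one block holding the pointers plus one allocation per string. That makes it possible to exercise the library without any C# involved. `my_args_init` and `my_args_free` are hypothetical helper names, not part of the library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct MyArgs {
    int32_t someArg;
    char** filesToProcess;
    int32_t filesToProcessLength;
};

/* Plain-C entry point; gcc exports it by default, no __declspec or __stdcall. */
bool myFunction(struct MyArgs* pargs) {
    assert(pargs != NULL);
    printf("Files to Process: %i\n", pargs->filesToProcessLength);
    for (int i = 0; i < pargs->filesToProcessLength; i++) {
        printf("\t%i. %s\n", i, pargs->filesToProcess[i]);
    }
    return true;
}

/* Hypothetical helper: builds the layout that the C# code assembles manually. */
bool my_args_init(struct MyArgs* args, const char* const* files, int32_t count) {
    assert(args != NULL);
    args->someArg = 0;
    args->filesToProcess = NULL;
    args->filesToProcessLength = 0;
    if (count < 0) return false;
    /* calloc zeroes the slots; allocate at least one so count == 0 succeeds */
    args->filesToProcess = calloc(count > 0 ? (size_t)count : 1, sizeof(char*));
    if (args->filesToProcess == NULL) return false;
    for (int32_t i = 0; i < count; i++) {
        size_t size = strlen(files[i]) + 1;
        char* copy = malloc(size);
        if (copy == NULL) return false;  /* caller still calls my_args_free */
        memcpy(copy, files[i], size);
        args->filesToProcess[i] = copy;
        args->filesToProcessLength = i + 1;  /* counts only initialized slots */
    }
    return true;
}

/* Hypothetical helper: frees each string, then the pointer array itself. */
void my_args_free(struct MyArgs* args) {
    assert(args != NULL);
    for (int32_t i = 0; i < args->filesToProcessLength; i++) {
        free(args->filesToProcess[i]);
    }
    free(args->filesToProcess);
    args->filesToProcess = NULL;
    args->filesToProcessLength = 0;
}
```

A caller would run `my_args_init`, pass the struct's address to `myFunction`, and finish with `my_args_free`. That is the same allocate, call, free order as the try/finally block in the C# code.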