c# .net com com-interop

Pass a buffer of chars from .NET to COM and get it back updated


I have the following COM method to be called from C#, which returns a string in the provided buffer pchText (which is not necessarily zero terminated) and the number of characters copied in pcch:

HRESULT Next([in, out] long* pcch, [out, size_is(*pcch)] OLECHAR* pchText);

How do I define the C# signature for interop?

So far, I tried this:

void Next(ref int pcch,
    [MarshalAs(UnmanagedType.LPWStr, SizeParamIndex = 0)]
    System.Text.StringBuilder pchText);

It appears to work, but I am not sure if SizeParamIndex has any effect on StringBuilder.


Solution

  • This is certainly a difficult function to call correctly. Your declaration is roughly okay; you just need to apply the [PreserveSig] attribute and change the return type to int so you can detect an S_FALSE return value, which indicates that there is no next element.
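
    A minimal sketch of the adjusted declaration, assuming the method sits on an IUnknown-derived interface. The interface name IEnumText and the all-zero GUID are placeholders that do not come from the question; they must be replaced with the real values, and any other methods of the real interface must be declared in vtable order.

    ```csharp
    using System.Runtime.InteropServices;
    using System.Text;

    // Hypothetical interface name and placeholder IID: substitute the real ones.
    [ComImport]
    [Guid("00000000-0000-0000-0000-000000000000")]
    [InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
    internal interface IEnumText
    {
        // [PreserveSig] keeps the raw HRESULT as the return value instead of
        // converting failures to exceptions, so S_OK (0) and S_FALSE (1)
        // can be told apart by the caller.
        [PreserveSig]
        int Next(
            ref int pcch,
            [MarshalAs(UnmanagedType.LPWStr, SizeParamIndex = 0)] StringBuilder pchText);
    }
    ```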

    The difficulty is in having to guess up front how large a StringBuilder to pass. The native code gets a raw pointer into the GC heap, pointing to the builder's buffer, so accidents are pretty fatal. You must pick a suitable Capacity for the builder and pass that value as the initial pcch argument.

    The marshaller does pay attention to the SizeParamIndex after the function returns. It will only copy as many characters as pcch indicates. If the native code for any reason writes more than can fit in the buffer, then the program will instantly abort with an ExecutionEngineException, since that indicates that the GC heap was corrupted.

    Do beware that if you guess at a Capacity that is too low, you can't necessarily discover this. You might just get a truncated string when the function copies only as many characters as fit and doesn't return an error code. The best way to find out whether that's a problem is to test it by intentionally passing a small builder. Pay attention to the return value.
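
    The calling pattern described above might be sketched as follows. The NextFn delegate and the NextCaller class are hypothetical names introduced only for illustration; the delegate stands in for the [PreserveSig] form of Next so the helper does not depend on a particular interface declaration. The possiblyTruncated flag is a heuristic assumption (the buffer came back completely full), not something the COM method reports, and the result is trimmed to the reported count explicitly rather than relying on the marshaller to do so.

    ```csharp
    using System;
    using System.Runtime.InteropServices;
    using System.Text;

    internal static class NextCaller
    {
        // Hypothetical delegate with the same shape as the [PreserveSig] Next method.
        internal delegate int NextFn(ref int pcch, StringBuilder pchText);

        private const int S_FALSE = 1;

        // Returns null when there is no next element (S_FALSE) and throws
        // for failure HRESULTs (negative values).
        internal static string ReadNext(NextFn next, int capacity, out bool possiblyTruncated)
        {
            var buffer = new StringBuilder(capacity);
            int room = buffer.Capacity;   // number of characters the callee may write
            int cch = room;               // in: buffer size, out: characters copied
            int hr = next(ref cch, buffer);

            possiblyTruncated = false;
            if (hr < 0) Marshal.ThrowExceptionForHR(hr);
            if (hr == S_FALSE) return null;

            // Keep only the characters the callee reported, never more than the builder holds.
            int count = Math.Max(0, Math.Min(cch, buffer.Length));
            possiblyTruncated = cch >= room;   // heuristic: a full buffer may mean lost data
            return buffer.ToString(0, count);
        }
    }
    ```

    With an interface instance named enumerator, a call would look like NextCaller.ReadNext(enumerator.Next, 256, out bool truncated); when truncated comes back true, retrying with a larger capacity is only possible if the enumerator can be reset or re-created.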

    One quirk is worth pointing out: the function signature hints at a hack that was common in the early days of COM, returning binary data instead of text through the OLECHAR*. The fact that the string isn't guaranteed to be zero-terminated is a strong hint that this is the case. This will not come to a good end in .NET: the data will get corrupted when the string is normalized, and your program will crash when the data happens to match one of the UTF-16 surrogate characters. If that's the case, then you need a short[] instead of a StringBuilder.
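
    If the buffer really carries binary data, the short[] variant could look roughly like this. As before, the interface name IEnumRaw, the placeholder GUID, and the RawBuffer helper are assumptions made for illustration. The sketch relies on the caller allocating the array and passing its length in pcch, so no SizeParamIndex is specified.

    ```csharp
    using System;
    using System.Runtime.InteropServices;

    // Hypothetical interface name and placeholder IID: substitute the real ones.
    [ComImport]
    [Guid("00000000-0000-0000-0000-000000000000")]
    [InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
    internal interface IEnumRaw
    {
        [PreserveSig]
        int Next(
            ref int pcch,
            // The callee fills a caller-allocated array of 16-bit units;
            // no string conversion is involved.
            [In, Out, MarshalAs(UnmanagedType.LPArray)] short[] pchText);
    }

    internal static class RawBuffer
    {
        // Copies the first 'cch' 16-bit units into a byte array, preserving memory order.
        internal static byte[] ToBytes(short[] units, int cch)
        {
            int count = Math.Max(0, Math.Min(cch, units.Length));
            var bytes = new byte[count * sizeof(short)];
            Buffer.BlockCopy(units, 0, bytes, 0, bytes.Length);
            return bytes;
        }
    }
    ```

    A caller would allocate the array first (for example new short[256]), pass its Length as the initial pcch, and after a successful call hand the array and the returned count to RawBuffer.ToBytes.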