For C#, is there a downside to using 'string' instead of 'StringBuilder' when calling Win32 functions such as GetWindowText?


Consider these two definitions for GetWindowText. One uses a string for the buffer, the other uses a StringBuilder instead:

[DllImport("user32.dll", CharSet = CharSet.Auto, SetLastError = true)]
public static extern int GetWindowText(IntPtr hWnd, StringBuilder lpString, int nMaxCount);

[DllImport("user32.dll", CharSet = CharSet.Auto, SetLastError = true)]
public static extern int GetWindowText(IntPtr hWnd, string lpString, int nMaxCount);

Here's how you call them:

var windowTextLength = GetWindowTextLength(hWnd);

// You can use either of these as they both work
var buffer = new string('\0', windowTextLength);
//var buffer = new StringBuilder(windowTextLength);

// Add 1 to windowTextLength for the trailing null character
var readSize = GetWindowText(hWnd, buffer, windowTextLength + 1);

Console.WriteLine($"The title is '{buffer}'");

They both seem to work correctly whether I pass in a string, or a StringBuilder. However, all the examples I've seen use the StringBuilder variant. Even PInvoke.net lists that one.

My guess is that the thinking goes 'In C# strings are immutable, therefore use StringBuilder'. But here we're reaching down to the Win32 API and writing to the memory directly, and that memory buffer is, for all intents and purposes, already allocated (i.e. reserved for, and currently used by, the string) because the string was assigned a value at its definition. So that restriction doesn't actually apply, and string works just fine. But I'm wondering if that assumption is wrong.

I don't think so because if you test this by increasing the buffer by say 10, and change the character you're initializing it with to say 'A', then pass in that larger buffer size to GetWindowText, the string you get back is the actual title, right-padded with the ten extra 'A's that weren't overwritten, showing it did update that memory location of the earlier characters.

So provided you pre-initialize the strings, can't you do this? Could those strings ever 'move out from under you' while using them because the CLR is assuming they're immutable? That's what I'm trying to figure out.
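The observable result of the padding experiment above can be mimicked in purely managed code, with no native call involved. This is only a sketch: the title "Notepad", the 12-character buffer, and the helper name TrimAtNull are made up for illustration, and the buffer contents assume the native function writes the title plus one terminator and leaves the remaining fill characters alone.

```csharp
using System;

static class PaddedBufferDemo
{
    // Hypothetical helper: returns the text before the first '\0',
    // or the whole string if it contains no '\0'.
    public static string TrimAtNull(string buffer)
    {
        int terminator = buffer.IndexOf('\0');
        return terminator < 0 ? buffer : buffer.Substring(0, terminator);
    }

    static void Main()
    {
        // Assumed state of a 12-char buffer pre-filled with 'A' after a
        // native write of "Notepad" followed by a null terminator.
        string buffer = "Notepad\0" + new string('A', 4);

        // Length comes from the string's length field, not from the '\0'.
        Console.WriteLine(buffer.Length);      // 12

        // The caller has to cut the string at the terminator by hand.
        Console.WriteLine(TrimAtNull(buffer)); // Notepad
    }
}
```

The point of the sketch is that managed code never consults the embedded terminator, so any leftover fill characters stay part of the string unless they are trimmed explicitly.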


Solution

  • If you pass a string to a function using P/Invoke, the CLR will assume the function will only read the string. For efficiency, when the string is marshaled as UTF-16 (which is what CharSet.Auto selects on Windows), the string is pinned in memory for the duration of the call and a pointer to the first character is passed to the function. No character data needs to be copied this way.

    Of course, the function can do whatever it wants to the data in the string, including modifying it.

    This means the function will overwrite the first few characters without issue, but buffer.Length will remain unchanged and you'll end up with the existing data at the end of the string still present in the string. .NET strings store their length in a field. They are also null-terminated, but the null terminator is only used as a convenience for interoperability with C code and has no effect in managed code.

    Such a string is inconvenient to use: unless you sized it in advance to match exactly where the null terminator ends up once written, .NET's length field will be out of sync with the underlying data.

    It's also better that the length stays fixed, since changing the length of a string would certainly corrupt the CLR heap (the GC wouldn't be able to walk the objects). Strings and arrays are the only two object types that don't have a fixed size.

    On the other hand, if you pass a StringBuilder through P/Invoke, you're explicitly telling the marshaler that the function is expected to write to the instance. When the call returns, the builder's length is updated based on the null terminator, so ToString() gives you exactly the text that was written and everything is in sync.

    Better use the right tool for the job. :)
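For reference, the StringBuilder approach described above could be wrapped roughly as follows. This is a minimal sketch, not a definitive implementation: the class name WindowTitle and the method name Get are hypothetical, the GetWindowTextLength and GetForegroundWindow declarations are assumptions added so the example is self-contained, and the code only does something useful on Windows.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

static class WindowTitle
{
    [DllImport("user32.dll", CharSet = CharSet.Auto, SetLastError = true)]
    private static extern int GetWindowTextLength(IntPtr hWnd);

    [DllImport("user32.dll", CharSet = CharSet.Auto, SetLastError = true)]
    private static extern int GetWindowText(IntPtr hWnd, StringBuilder lpString, int nMaxCount);

    [DllImport("user32.dll")]
    private static extern IntPtr GetForegroundWindow();

    // Hypothetical wrapper: returns the window's title, or an empty string
    // if the window has no title or the length query fails.
    public static string Get(IntPtr hWnd)
    {
        int length = GetWindowTextLength(hWnd);
        if (length <= 0)
        {
            return string.Empty;
        }

        // Reserve one extra character for the trailing null terminator.
        var buffer = new StringBuilder(length + 1);
        GetWindowText(hWnd, buffer, buffer.Capacity);

        // The builder's length already reflects the text that was written.
        return buffer.ToString();
    }

    static void Main()
    {
        if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
        {
            Console.WriteLine("This sketch only runs on Windows.");
            return;
        }

        Console.WriteLine($"The title is '{Get(GetForegroundWindow())}'");
    }
}
```

Passing buffer.Capacity as nMaxCount keeps the size given to the native function tied to the buffer that was actually allocated, rather than to a separately computed number.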