
Understanding how ldstr gets string literal


I want to analyse the IL code of a simple C# method:

public static string test()
{
    return "hello";
}

When I call GetILAsByteArray I'm getting the following bytes:

    [0] 0x00    byte
    [1] 0x72    byte
    [2] 0x01    byte
    [3] 0x00    byte
    [4] 0x00    byte
    [5] 0x70    byte
    [6] 0x0a    byte
    [7] 0x2b    byte
    [8] 0x00    byte
    [9] 0x06    byte
    [10] 0x2a   byte

The second opcode is ldstr.

As I understand it, ldstr loads a string from the metadata and pushes it onto the stack (per Microsoft's description of ldstr).

But how do I know which data is loaded from the metadata? Does the following 0x01 tell me to take the entry at index 1 from the metadata? Or is ldstr followed by an int32? How should I interpret these bytes?


Solution

  • When you write the string literal "hello" in your code, the compiler writes it into the #US stream of the PE (exe/dll) file.

    The #US (user strings) stream holds the string literals as 16-bit Unicode (UTF-16) strings, and ldstr references these strings directly.

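    Take your example, 72 01 00 00 70. The byte 0x72 is the ldstr opcode, and it is followed by a 4-byte little-endian metadata token, here 0x70000001. The high byte 0x70 marks the token as a user-string token, and the low three bytes are the offset into the #US stream. So in this case your string is located at offset 0x01 in the #US stream.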

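    The #US stream starts with a null byte, and each following entry begins with a compressed unsigned integer (a single byte for sizes up to 0x7F) that gives the size in bytes of the entry that follows. That size covers the UTF-16 bytes plus one trailing flag byte, so "hello" is stored as 0B, then 10 bytes of text, then a final 00.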

    More information can be found in ECMA-335 (section II.24.2.4, "#US and #Blob heaps").
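
To make the lookup concrete, here is a minimal sketch of both steps: splitting the ldstr operand into its token type and offset, and reading the entry at that offset. The names LdstrSketch, ReadLdstrToken and ReadUserString are made up for this illustration, and the usHeap array is a hand-built stand-in for what the #US stream of your assembly would presumably contain for "hello"; it is not read from a real file.

```csharp
using System;
using System.Text;

public static class LdstrSketch
{
    // Reads the 4-byte little-endian metadata token that follows ldstr (0x72).
    public static int ReadLdstrToken(byte[] il, int opcodeIndex)
    {
        if (il[opcodeIndex] != 0x72)
            throw new ArgumentException("No ldstr opcode at this index.");
        return il[opcodeIndex + 1]
             | (il[opcodeIndex + 2] << 8)
             | (il[opcodeIndex + 3] << 16)
             | (il[opcodeIndex + 4] << 24);
    }

    // Reads one entry from a raw #US heap, starting at the given offset.
    public static string ReadUserString(byte[] usHeap, int offset)
    {
        int length;
        byte first = usHeap[offset];
        if ((first & 0x80) == 0)            // 0xxxxxxx: 1-byte size
        {
            length = first;
            offset += 1;
        }
        else if ((first & 0xC0) == 0x80)    // 10xxxxxx: 2-byte size
        {
            length = ((first & 0x3F) << 8) | usHeap[offset + 1];
            offset += 2;
        }
        else                                // 110xxxxx: 4-byte size
        {
            length = ((first & 0x1F) << 24) | (usHeap[offset + 1] << 16)
                   | (usHeap[offset + 2] << 8) | usHeap[offset + 3];
            offset += 4;
        }
        if (length == 0) return string.Empty;
        // The size counts the UTF-16 bytes plus one trailing flag byte.
        return Encoding.Unicode.GetString(usHeap, offset, length - 1);
    }

    public static void Main()
    {
        // The IL bytes from the question.
        byte[] il = { 0x00, 0x72, 0x01, 0x00, 0x00, 0x70, 0x0a, 0x2b, 0x00, 0x06, 0x2a };
        int token = ReadLdstrToken(il, 1);
        int type = (token >> 24) & 0xFF;    // 0x70 = user-string token
        int offset = token & 0x00FFFFFF;    // offset into #US

        // Assumed heap content: null byte, size 0x0B, "hello" in UTF-16LE, flag byte.
        byte[] usHeap =
        {
            0x00, 0x0B,
            0x68, 0x00, 0x65, 0x00, 0x6C, 0x00, 0x6C, 0x00, 0x6F, 0x00,
            0x00
        };

        Console.WriteLine($"token=0x{token:X8} type=0x{type:X2} offset={offset}");
        Console.WriteLine(ReadUserString(usHeap, offset));
    }
}
```

If you only need the string at runtime, you do not have to parse the heap yourself: passing the token to Module.ResolveString on the method's module (for example method.Module.ResolveString(token)) should return the same literal.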