rust, memory, compiler-construction

How to allocate memory when declaring String in Rust


I have a question about how memory is allocated on the stack and the heap when a String is declared.

I think the memory layout looks like one of the two scenarios below.

let s = String::from("hello world");
(A)                                                (B)            
+======================+----- static memory -------+======================+ 
|          ...         |                           | "hello world"(12)    |
|                      |                           |          ...         |
+======================+------ stack memory -------+======================+
| s(24)                |                           | s(24)                |
| data addr            |                           | data addr            |
| length               |                           | length               |
| capacity             |                           | capacity             |
|          ...         |                           |          ...         |
|                      |                           |                      |
|          ...         |                           |                      |
| "hello world"(12)    |                           |          ...         |
+======================+------- heap memory -------+======================+

Which scenario is correct?

I understand that string literals are stored in the static area at compile time. However, the Rust documentation states that String is essentially a wrapper around Vec<u8>, and a Vec stores all of its elements on the heap, so I'm unsure whether scenario A is correct.


Solution

  • Neither is correct.

    First off, "hello world" is only 11 bytes. Rust does not null-terminate strings.
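The byte count is easy to check. The following is a minimal sketch; the helper name `literal_len` is hypothetical and exists only for illustration.

```rust
/// Hypothetical helper: byte length of the literal from the question.
fn literal_len() -> usize {
    // `len()` on a `str` counts bytes, not characters.
    "hello world".len()
}

/// Hypothetical helper: the final byte of the literal.
fn last_byte() -> Option<u8> {
    "hello world".as_bytes().last().copied()
}

fn main() {
    // 11 bytes: there is no trailing NUL to account for.
    println!("len = {}", literal_len());
    // The last byte is the letter 'd', not 0.
    println!("last byte = {:?}", last_byte());
}
```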

    Essentially, "hello world" is a str stored in the binary's static memory. When String::from runs, those bytes are copied from static memory into a newly allocated heap buffer. The String struct itself lives on the stack and stores a pointer to that heap buffer, the length of the string, and the capacity of the allocation.

    pseudo-code:

    static_memory = {
      contents: str = "hello world"
      lit: &str = (&contents, 11)
    }
    
    def String_from_strref(text: &str) -> String:
      (pointer, length) = text
      heap_pointer = malloc(length)          # allocate at least `length` bytes on the heap
      memcpy(heap_pointer, pointer, length)  # copy the bytes from static memory to the heap
      capacity = length                      # size of the allocation; always >= length
      
      return (heap_pointer, length, capacity)
    
    let s = String_from_strref(lit)
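The same steps can be observed in real Rust. This is a rough sketch rather than how the standard library is implemented internally; `inspect` is a hypothetical helper, and the printed addresses will vary from run to run.

```rust
/// Hypothetical helper: builds a String from `lit` and reports
/// (buffers are distinct, length, capacity).
fn inspect(lit: &'static str) -> (bool, usize, usize) {
    let s = String::from(lit);
    // The String's buffer is a fresh heap allocation, so its address
    // differs from the address of the literal in static memory.
    let distinct = s.as_ptr() != lit.as_ptr();
    (distinct, s.len(), s.capacity())
}

fn main() {
    let lit: &'static str = "hello world";
    let s = String::from(lit);

    println!("literal bytes at {:p}", lit.as_ptr());
    println!("String bytes at  {:p}", s.as_ptr());
    println!("len = {}, capacity = {}", s.len(), s.capacity());

    let (distinct, len, capacity) = inspect(lit);
    println!("distinct = {distinct}, len = {len}, capacity = {capacity}");
}
```

The contents compare equal even though the two buffers live in different regions of memory.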
    

    And after that executes, memory will look like:

    === stack memory ===+=== heap memory ===+===== static memory =====
     s(24)              |                   |          "hello world"
       pointer ----------> "hello world..." |          ^
       length = 11      |                   | lit(16)  |
       capacity >= 11   |                   |   pointer
                        |                   |   length = 11
    

    Note: the order of the fields shown here is for illustration only; Rust does not guarantee the field layout of these types. The sizes 24 and 16 assume a 64-bit target.
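The sizes in the diagram can be checked with `std::mem::size_of`. This sketch assumes a current Rust toolchain, where a String occupies three pointer-sized words and a &str occupies two; the helper names are hypothetical.

```rust
use std::mem::size_of;

/// Hypothetical helper: size of the String handle (pointer, length, capacity).
fn string_handle_size() -> usize {
    size_of::<String>()
}

/// Hypothetical helper: size of a &str fat pointer (pointer, length).
fn str_ref_size() -> usize {
    size_of::<&str>()
}

fn main() {
    // Only the handle is measured here, never the heap buffer it points to.
    println!("String: {} bytes", string_handle_size());
    println!("&str:   {} bytes", str_ref_size());
    println!("usize:  {} bytes", size_of::<usize>());
}
```

Because only the handle is measured, the result does not change with the length of the text.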