
golang doing unexpected heap memory allocation


While benchmarking, I noticed a surprising heap memory allocation. After reducing the repro, I ended up with the following:

// --- Repro file ---
func memAllocRepro(values []int) *[]int {
    for {
        break
    }

    return &values
}

// --- Benchmark file ---
func BenchmarkMemAlloc(b *testing.B) {

    values := []int{1, 2, 3, 4}

    for i := 0; i < b.N; i++ {
        memAllocRepro(values)
    }
}

And here is the benchmark output:

BenchmarkMemAlloc-4     50000000            40.2 ns/op        32 B/op          1 allocs/op
PASS
ok      memalloc_debugging  2.113s
Success: Benchmarks passed.

Now the funny thing is, if I remove the for loop, or if I return the slice directly instead of a pointer to the slice, the heap allocation disappears:

// --- Repro file ---
func noAlloc1(values []int) *[]int {
    return &values // No alloc!
}

func noAlloc2(values []int) []int {
    for {
        break
    }

    return values // No alloc!
}

// --- Benchmark file ---
func BenchmarkNoAlloc(b *testing.B) {
    values := []int{1, 2, 3, 4}

    for i := 0; i < b.N; i++ {
        noAlloc1(values)
        noAlloc2(values)
    }
}

Benchmark result:

BenchmarkNoAlloc-4      300000000            4.20 ns/op        0 B/op          0 allocs/op
PASS
ok      memalloc_debugging  1.756s
Success: Benchmarks passed.

I found that very confusing, and confirmed with Delve that the disassembly does have an allocation at the start of the memAllocRepro function:

(dlv) disassemble
TEXT main.memAllocRepro(SB) memalloc_debugging/main.go
        main.go:10      0x44ce10        65488b0c2528000000      mov rcx, qword ptr gs:[0x28]
        main.go:10      0x44ce19        488b8900000000          mov rcx, qword ptr [rcx]
        main.go:10      0x44ce20        483b6110                cmp rsp, qword ptr [rcx+0x10]
        main.go:10      0x44ce24        7662                    jbe 0x44ce88
        main.go:10      0x44ce26        4883ec18                sub rsp, 0x18
        main.go:10      0x44ce2a        48896c2410              mov qword ptr [rsp+0x10], rbp
        main.go:10      0x44ce2f        488d6c2410              lea rbp, ptr [rsp+0x10]
        main.go:10      0x44ce34        488d0525880000          lea rax, ptr [rip+0x8825]
        main.go:10      0x44ce3b        48890424                mov qword ptr [rsp], rax
=>      main.go:10      0x44ce3f*       e8bcebfbff              call 0x40ba00 runtime.newobject

I must say though, once I hit that point, I couldn't easily dig further. I'm pretty sure it would be possible to know at least which type is allocated by looking at the structure pointed to by the RAX register, but I wasn't very successful doing so. It's been a long time since I've read disassembly like this.

(dlv) regs
   Rip = 0x000000000044ce3f
   Rsp = 0x000000c042039f30
   Rax = 0x0000000000455660
   (...)

All that being said, I have two questions:

* Can anyone tell me why there is a heap allocation there, and whether it is "expected"?
* How could I have gone further in my debugging session? Dumping memory to hex shows a different address layout, and go tool objdump outputs disassembly, which mangles the content at that address.

Full function dump with go tool objdump:

TEXT main.memAllocRepro(SB) memalloc_debugging/main.go
  main.go:10        0x44ce10        65488b0c2528000000  MOVQ GS:0x28, CX            
  main.go:10        0x44ce19        488b8900000000      MOVQ 0(CX), CX              
  main.go:10        0x44ce20        483b6110        CMPQ 0x10(CX), SP           
  main.go:10        0x44ce24        7662            JBE 0x44ce88                
  main.go:10        0x44ce26        4883ec18        SUBQ $0x18, SP              
  main.go:10        0x44ce2a        48896c2410      MOVQ BP, 0x10(SP)           
  main.go:10        0x44ce2f        488d6c2410      LEAQ 0x10(SP), BP           
  main.go:10        0x44ce34        488d0525880000      LEAQ runtime.types+34656(SB), AX    
  main.go:10        0x44ce3b        48890424        MOVQ AX, 0(SP)              
  main.go:10        0x44ce3f        e8bcebfbff      CALL runtime.newobject(SB)      
  main.go:10        0x44ce44        488b7c2408      MOVQ 0x8(SP), DI            
  main.go:10        0x44ce49        488b442428      MOVQ 0x28(SP), AX           
  main.go:10        0x44ce4e        48894708        MOVQ AX, 0x8(DI)            
  main.go:10        0x44ce52        488b442430      MOVQ 0x30(SP), AX           
  main.go:10        0x44ce57        48894710        MOVQ AX, 0x10(DI)           
  main.go:10        0x44ce5b        8b052ff60600        MOVL runtime.writeBarrier(SB), AX   
  main.go:10        0x44ce61        85c0            TESTL AX, AX                
  main.go:10        0x44ce63        7517            JNE 0x44ce7c                
  main.go:10        0x44ce65        488b442420      MOVQ 0x20(SP), AX           
  main.go:10        0x44ce6a        488907          MOVQ AX, 0(DI)              
  main.go:16        0x44ce6d        48897c2438      MOVQ DI, 0x38(SP)           
  main.go:16        0x44ce72        488b6c2410      MOVQ 0x10(SP), BP           
  main.go:16        0x44ce77        4883c418        ADDQ $0x18, SP              
  main.go:16        0x44ce7b        c3          RET                 
  main.go:16        0x44ce7c        488b442420      MOVQ 0x20(SP), AX           
  main.go:10        0x44ce81        e86aaaffff      CALL runtime.gcWriteBarrier(SB)     
  main.go:10        0x44ce86        ebe5            JMP 0x44ce6d                
  main.go:10        0x44ce88        e85385ffff      CALL runtime.morestack_noctxt(SB)   
  main.go:10        0x44ce8d        eb81            JMP main.memAllocRepro(SB)      
  :-1           0x44ce8f        cc          INT $0x3

Disassembly of the memory pointed to by the RAX register:

(dlv) disassemble -a 0x0000000000455660 0x0000000000455860
        .:0     0x455660        1800                    sbb byte ptr [rax], al
        .:0     0x455662        0000                    add byte ptr [rax], al
        .:0     0x455664        0000                    add byte ptr [rax], al
        .:0     0x455666        0000                    add byte ptr [rax], al
        .:0     0x455668        0800                    or byte ptr [rax], al
        .:0     0x45566a        0000                    add byte ptr [rax], al
        .:0     0x45566c        0000                    add byte ptr [rax], al
        .:0     0x45566e        0000                    add byte ptr [rax], al
        .:0     0x455670        8e66f9                  mov fs, word ptr [rsi-0x7]
        .:0     0x455673        1b02                    sbb eax, dword ptr [rdx]
        .:0     0x455675        0808                    or byte ptr [rax], cl
        .:0     0x455677        17                      ?
        .:0     0x455678        60                      ?
        .:0     0x455679        0d4a000000              or eax, 0x4a
        .:0     0x45567e        0000                    add byte ptr [rax], al
        .:0     0x455680        c01f47                  rcr byte ptr [rdi], 0x47
        .:0     0x455683        0000                    add byte ptr [rax], al
        .:0     0x455685        0000                    add byte ptr [rax], al
        .:0     0x455687        0000                    add byte ptr [rax], al
        .:0     0x455689        0c00                    or al, 0x0
        .:0     0x45568b        004062                  add byte ptr [rax+0x62], al
        .:0     0x45568e        0000                    add byte ptr [rax], al
        .:0     0x455690        c0684500                shr byte ptr [rax+0x45], 0x0

Solution

  • Escape analysis determines whether any references to a value escape the function in which the value is declared.

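    In Go, arguments are passed by value, typically on the stack; the stack is reclaimed at the end of the function. However, returning the reference &values from the memAllocRepro function gives the values parameter declared in memAllocRepro a lifetime beyond the end of the function, so the values variable is moved to the heap. The for loop matters only indirectly: it prevents memAllocRepro from being inlined, so the function is analyzed on its own and &values is seen to escape through the return value.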

    memAllocRepro: &values: Alloc

    ./escape.go:3:6: cannot inline memAllocRepro: unhandled op FOR
    ./escape.go:7:9: &values escapes to heap
    ./escape.go:7:9:    from ~r1 (return) at ./escape.go:7:2
    ./escape.go:3:37: moved to heap: values
    

    The noAlloc1 function is inlined into its caller (main in this example). After inlining, values is effectively a local variable of main, and &values does not escape from main, so no heap allocation is needed.

    noAlloc1: &values: No Alloc

    ./escape.go:10:6: can inline noAlloc1 as: func([]int)*[]int{return &values}
    ./escape.go:23:10: inlining call to noAlloc1 func([]int)*[]int{return &values}
    

    The noAlloc2 function returns its values argument by value, not by address. The result is returned on the stack. No reference to values is taken in noAlloc2, so nothing escapes.

    noAlloc2: values: No Alloc


    package main
    
    func memAllocRepro(values []int) *[]int {
        for {
            break
        }
        return &values
    }
    
    func noAlloc1(values []int) *[]int {
        return &values
    }
    
    func noAlloc2(values []int) []int {
        for {
            break
        }
        return values
    }
    
    func main() {
        memAllocRepro(nil)
        noAlloc1(nil)
        noAlloc2(nil)
    }
    

    Output:

    $ go build -a -gcflags='-m -m' escape.go
    # command-line-arguments
    ./escape.go:3:6: cannot inline memAllocRepro: unhandled op FOR
    ./escape.go:10:6: can inline noAlloc1 as: func([]int) *[]int { return &values }
    ./escape.go:14:6: cannot inline noAlloc2: unhandled op FOR
    ./escape.go:21:6: cannot inline main: non-leaf function
    ./escape.go:23:10: inlining call to noAlloc1 func([]int) *[]int { return &values }
    ./escape.go:7:9: &values escapes to heap
    ./escape.go:7:9:    from ~r1 (return) at ./escape.go:7:2
    ./escape.go:3:37: moved to heap: values
    ./escape.go:11:9: &values escapes to heap
    ./escape.go:11:9:   from ~r1 (return) at ./escape.go:11:2
    ./escape.go:10:32: moved to heap: values
    ./escape.go:14:31: leaking param: values to result ~r1 level=0
    ./escape.go:14:31:  from ~r1 (return) at ./escape.go:18:2
    ./escape.go:23:10: main &values does not escape
    $
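
The effect of inlining on the allocation can also be checked at run time. The following is a minimal sketch, not the original repro: since inlining rules differ between compiler versions (newer releases may be able to inline the for loop version as well), it uses the //go:noinline directive to force the non-inlined case. The names pointerNoInline, pointerInlinable and measure are hypothetical and chosen for illustration; testing.AllocsPerRun is the standard library helper for counting allocations per call.

```go
package main

import (
	"fmt"
	"testing"
)

// pointerNoInline stands in for memAllocRepro. The directive below forces
// the compiler to keep it as a real call, so escape analysis sees &values
// leaving the function through the return value.
//
//go:noinline
func pointerNoInline(values []int) *[]int {
	return &values
}

// pointerInlinable has the same body but may be inlined into its caller,
// in which case the compiler can decide that &values does not escape.
func pointerInlinable(values []int) *[]int {
	return &values
}

// measure reports the average number of heap allocations per call for
// each variant.
func measure() (noInline, inlinable float64) {
	values := []int{1, 2, 3, 4}

	noInline = testing.AllocsPerRun(1000, func() {
		p := pointerNoInline(values)
		_ = len(*p) // use the result without making it escape further
	})
	inlinable = testing.AllocsPerRun(1000, func() {
		p := pointerInlinable(values)
		_ = len(*p)
	})
	return noInline, inlinable
}

func main() {
	noInline, inlinable := measure()
	fmt.Printf("noinline:  %.0f allocs/call\n", noInline)
	fmt.Printf("inlinable: %.0f allocs/call\n", inlinable)
}
```

The non-inlined variant should report one allocation per call. The inlinable variant is expected to report zero when the compiler inlines it, but that depends on the compiler version and flags, so it is worth confirming with -gcflags=-m on the toolchain in use.
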