
How to benchmark channel / mutex memory consumption / allocation?


I am trying to compare passing a single value through a channel with guarding it by a mutex. Channel example:

func BenchmarkNewDummy(b *testing.B) {
    one := make(chan string, 1)
    two := make(chan string, 1)
    var wg sync.WaitGroup
    wg.Add(2)
    go doWork(&wg, one)
    go doWork(&wg, two)
    wg.Wait()
    fmt.Println(<-one)
    fmt.Println(<-two)
}

func doWork(wg *sync.WaitGroup, one chan string) {
    defer wg.Done()
    one <- "hell0"
}

Command:

go test -bench=. -benchmem -run BenchmarkNewDummy -cpuprofile cpuCh.out -memprofile memCh.prof 

The output doesn't provide any useful information:

goos: darwin
goarch: amd64
BenchmarkNewDummy-8     hell0
hell0
hell0
hell0
hell0
hell0
hell0
hell0
hell0
hell0
2000000000               0.00 ns/op            0 B/op          0 allocs/op
PASS
ok        0.508s

With a mutex the situation is almost the same:

func BenchmarkNewDummy(b *testing.B) {
    one := ""
    two := ""
    var wg sync.WaitGroup
    wg.Add(2)
    var mu sync.Mutex
    go func() {
        mu.Lock()
        defer mu.Unlock()
        defer wg.Done()
        one = "hello"
    }()
    go func() {
        mu.Lock()
        defer mu.Unlock()
        defer wg.Done()
        two = "hello"
    }()
    wg.Wait()
    fmt.Println(one)
    fmt.Println(two)
}

Output:

goos: darwin
goarch: 
BenchmarkNewDummy-8     hello
hello
hello
hello
hello
hello
hello
hello
hello
hello
2000000000               0.00 ns/op            0 B/op          0 allocs/op
PASS
ok        0.521s

The memory graphs look almost the same, with a somewhat larger allocation for the mutex version, but they are not informative either.

Is there any way to compare channel and mutex memory consumption?


Solution

  • You're doing the benchmarking wrong. Quoting from the documentation of the testing package:

    A sample benchmark function looks like this:

    func BenchmarkHello(b *testing.B) {
        for i := 0; i < b.N; i++ {
            fmt.Sprintf("hello")
        }
    }
    

    The benchmark function must run the target code b.N times. During benchmark execution, b.N is adjusted until the benchmark function lasts long enough to be timed reliably.

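    Your benchmark never loops over b.N, so its body runs only once per call no matter how large b.N gets. That is why the report shows 2000000000 iterations at 0.00 ns/op and 0 allocs/op. Also, don't call fmt.PrintXX() in benchmarked code: it distorts the results.

    Benchmark these functions instead: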
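    To see what ignoring b.N does to the numbers, here is a minimal sketch that runs two benchmark functions through testing.Benchmark from an ordinary program. The names noLoop, withLoop, nsPerOp and sink are made up for this illustration. The version without a loop typically reports 0 ns/op, because the framework keeps raising b.N while the work stays constant.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps the compiler from discarding the Sprint results.
var sink string

// noLoop ignores b.N: it does the work once per call, however large b.N is.
func noLoop(b *testing.B) {
	sink = fmt.Sprint("hello")
}

// withLoop runs the target code b.N times, as the testing package expects.
func withLoop(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink = fmt.Sprint("hello")
	}
}

// nsPerOp runs f through testing.Benchmark and returns the reported ns/op.
func nsPerOp(f func(*testing.B)) int64 {
	return testing.Benchmark(f).NsPerOp()
}

func main() {
	fmt.Println("no loop:  ", nsPerOp(noLoop), "ns/op")
	fmt.Println("with loop:", nsPerOp(withLoop), "ns/op")
}
```

    The exact figure for the looped version depends on the machine. What matters is that only the looped version yields a per-operation cost that means anything.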

    func newDummy() {
        one := make(chan string, 1)
        two := make(chan string, 1)
        var wg sync.WaitGroup
        wg.Add(2)
        go doWork(&wg, one)
        go doWork(&wg, two)
        wg.Wait()
        <-one
        <-two
    }
    
    func doWork(wg *sync.WaitGroup, one chan string) {
        defer wg.Done()
        one <- "hell0"
    }
    
    func newDummy2() {
        one, two := "", ""
        var wg sync.WaitGroup
        wg.Add(2)
        var mu sync.Mutex
        go func() {
            mu.Lock()
            defer mu.Unlock()
            defer wg.Done()
            one = "hello"
        }()
        go func() {
            mu.Lock()
            defer mu.Unlock()
            defer wg.Done()
            two = "hello"
        }()
        wg.Wait()
        _, _ = one, two
    }
    

    Like this:

    func BenchmarkNewDummy(b *testing.B) {
        for i := 0; i < b.N; i++ {
            newDummy()
        }
    }
    
    func BenchmarkNewDummy2(b *testing.B) {
        for i := 0; i < b.N; i++ {
            newDummy2()
        }
    }
    

    Benchmarking it with:

    go test -bench . -benchmem
    

    I get an output like this:

    BenchmarkNewDummy-4    605662      1976 ns/op     240 B/op      5 allocs/op
    BenchmarkNewDummy2-4   927031      1627 ns/op      56 B/op      4 allocs/op
    

    From the results, newDummy() performs 5 allocations per call on average, totaling 240 bytes. newDummy2() performs 4 allocations, totaling 56 bytes.
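    If you only need the allocation counts and would rather not parse benchmark output, testing.AllocsPerRun may be enough. The sketch below reuses the two functions from above. The helper allocsPerCall and the run count of 1000 are arbitrary choices for this illustration, and the counts you see can differ between Go versions and platforms.

```go
package main

import (
	"fmt"
	"sync"
	"testing"
)

// doWork sends one value on ch and signals completion.
func doWork(wg *sync.WaitGroup, ch chan string) {
	defer wg.Done()
	ch <- "hello"
}

// newDummy passes two values through buffered channels.
func newDummy() {
	one := make(chan string, 1)
	two := make(chan string, 1)
	var wg sync.WaitGroup
	wg.Add(2)
	go doWork(&wg, one)
	go doWork(&wg, two)
	wg.Wait()
	<-one
	<-two
}

// newDummy2 writes two variables under a mutex.
func newDummy2() {
	one, two := "", ""
	var wg sync.WaitGroup
	wg.Add(2)
	var mu sync.Mutex
	go func() {
		mu.Lock()
		defer mu.Unlock()
		defer wg.Done()
		one = "hello"
	}()
	go func() {
		mu.Lock()
		defer mu.Unlock()
		defer wg.Done()
		two = "hello"
	}()
	wg.Wait()
	_, _ = one, two
}

// allocsPerCall reports the average number of heap allocations per call of f.
func allocsPerCall(f func()) float64 {
	return testing.AllocsPerRun(1000, f)
}

func main() {
	fmt.Printf("channel: %.0f allocs/call\n", allocsPerCall(newDummy))
	fmt.Printf("mutex:   %.0f allocs/call\n", allocsPerCall(newDummy2))
}
```

    AllocsPerRun reports only the number of allocations, not their size. For bytes per operation, the -benchmem output shown above is still the simpler route.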