Search code examples
pointersgoslicebenchmarkingmemset

A faster or slower way to clear truncated pointers?


In a truncate implementation I've read recently, the author uses the following way to clear the truncated items:

var nilItems    = make(items, 16)

func (s *items) truncate(index int) {
    var toClear items
    *s, toClear = (*s)[:index], (*s)[index:]
    for len(toClear) > 0 {
        toClear = toClear[copy(toClear, nilItems):]
    }
}

When I need to clear unwanted items, I would just iterate over the slice and set items to nil one by one.

I have set up a simple benchmark and it seems that the for loop way is faster.

I wonder what's the benefit of clearing with copy in bulk.


Solution

  • As mentioned by @MartinGallagher, your loop is recognized and optimized by the compiler, while the copy() version does "too much stuff" and is not optimized.

    If you change your examples to fill with a non-nil pointer value, you'll see the loop version falls behind. Also don't allocate (make()) inside the benchmark loop, do that outside, and use b.ResetTimer() to exclude that time.

    You also have a very small slice, if you increase its size, the difference will be more noticable:

    var x = new(int)
    
    func BenchmarkSetNilOneByOne(b *testing.B) {
        nums := make([]*int, 12800)
        b.ResetTimer()
        for i := 0; i < b.N; i++ {
            for i := range nums {
                nums[i] = x
            }
        }
    }
    
    func BenchmarkSetNilInBulk(b *testing.B) {
        nils := make([]*int, 128)
        for i := range nils {
            nils[i] = x
        }
    
        orig := make([]*int, 12800)
        var nums []*int
        b.ResetTimer()
        for i := 0; i < b.N; i++ {
            nums = orig
            for len(nums) > 0 {
                nums = nums[copy(nums, nils):]
            }
        }
    }
    

    Benchmark results:

    BenchmarkSetNilOneByOne-4          96571         10626 ns/op
    BenchmarkSetNilInBulk-4           266690          4023 ns/op
    

    Also note that your "bulk" version also assigns slice headers to nums several times. There is a faster way to fill the slice: you do not need an additional "nils" slice, just start filling your slice, and you may copy the already filled part to the unfilled part. This also doesn't require to change / reassign to the nums slice header. See Is there analog of memset in go?