Search code examples
goslicememset

Is there analog of memset in go?


In C++ I can initialize an array with some value using memset:

const int MAX = 1000000;
int is_prime[MAX]

memset(is_prime, 1, sizeof(is_prime))

What memset does, crudely can be described as filling the array with some value, but doing this really really fast.

In go I can do is_prime := make([]int, 1000000), but this will create a slice with all 0, in the similar manner I can use new([1000000]int), but nothing will allow me to create an array/slice with all 1 or any other non-zero element.

Of course I can use a loop to populate it with the value later, but the main purpose of memset is that it is way way faster than the loop.

So do Go programmers have a memset analog (fast way of initializing array to some non-zero value)?


Solution

  • The simplest solution with a loop would look like this:

    func memsetLoop(a []int, v int) {
        for i := range a {
            a[i] = v
        }
    }
    

    There is no memset support in the standard library, but we can make use of the built-in copy() which is highly optimized.

    With repeated copy()

    We can set the first element manually, and start copying the already set part to the unset part using copy(); where the already set part gets bigger and bigger every time (doubles), so the number of iterations is log(n):

    func memsetRepeat(a []int, v int) {
        if len(a) == 0 {
            return
        }
        a[0] = v
        for bp := 1; bp < len(a); bp *= 2 {
            copy(a[bp:], a[:bp])
        }
    }
    

    This solution was inspired by the implementation of bytes.Repeat(). If you just want to create a new []byte filled with the same values, you can use the bytes.Repeat() function. You can't use that for an existing slice or slices other than []byte, for that you can use the presented memsetRepeat().

    In case of small slices memsetRepeat() may be slower than memsetLoop() (but in case of small slices it doesn't really matter, it will run in an instant).

    Due to using the fast copy(), memsetRepeat() will be much faster if the number of elements grows.

    Benchmarking these 2 solutions:

    var a = make([]int, 1000) // Size will vary
    
    func BenchmarkLoop(b *testing.B) {
        for i := 0; i < b.N; i++ {
            memsetLoop(a, 10)
        }
    }
    
    func BenchmarkRepeat(b *testing.B) {
        for i := 0; i < b.N; i++ {
            memsetRepeat(a, 11)
        }
    }
    

    Benchmark results

    100 elements: ~1.15 times faster

    BenchmarkLoop   20000000                81.6 ns/op
    BenchmarkRepeat 20000000                71.0 ns/op
    

    1,000 elements: ~2.5 times faster

    BenchmarkLoop    2000000               706 ns/op
    BenchmarkRepeat  5000000               279 ns/op
    

    10,000 elements: ~2 times faster

    BenchmarkLoop     200000              7029 ns/op
    BenchmarkRepeat   500000              3544 ns/op
    

    100,000 elements: ~1.5 times faster

    BenchmarkLoop      20000             70671 ns/op
    BenchmarkRepeat    30000             45213 ns/op
    

    The highest performance gain is around 3800-4000 elements where it is ~3.2 times faster.