What is the point in setting a slice's capacity?
A slice is really just a fancy way to manage an underlying array. It automatically tracks the length and reallocates new space as needed.
As you append to a slice, the runtime grows its capacity (roughly doubling it) whenever the length would exceed the current capacity, and it has to copy all of the existing elements into the new array to do that. If you know how big the slice will be before you start, you can avoid those copy operations and memory allocations by grabbing all the space up front.
When you make a slice providing a capacity, you set the initial capacity, not any kind of limit.
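For a concrete feel, here is a minimal sketch (the count of 10 is arbitrary) that prints how the capacity jumps as append outgrows it, and how a preallocated slice never has to grow:

package main

import "fmt"

func main() {
	// Without preallocation: append reallocates and copies whenever
	// the length would exceed the current capacity.
	grown := make([]int, 0)
	for i := 0; i < 10; i++ {
		grown = append(grown, i)
		fmt.Println(len(grown), cap(grown)) // watch the capacity jump
	}

	// With preallocation: a single allocation; append never has to
	// copy because the capacity is already there.
	prealloc := make([]int, 0, 10)
	for i := 0; i < 10; i++ {
		prealloc = append(prealloc, i)
	}
	fmt.Println(len(prealloc), cap(prealloc)) // 10 10
}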
See this blog post on slices for some interesting internal details.
A slice is a wonderful abstraction of a simple array. You get all sorts of nice features, but deep down at its core lies an array. Therefore, if/when you specify a capacity of 3, an array of length 3 is allocated in memory behind the scenes, and you can append up to that capacity without triggering a reallocation. The capacity argument is optional in the call to make, but note that a slice will always have a capacity whether or not you choose to specify one. If you specify a length (which always exists as well), the slice will be indexable up to that length. The rest of the capacity is hidden away behind the scenes so that append does not have to allocate an entirely new array every time it is used.
Here is an example to better explain the mechanics.

s := make([]int, 1, 3)

The underlying array will be allocated with 3 copies of the zero value of int (which is 0):

[0,0,0]

However, the length is set to 1, so the slice itself will only print [0], and if you try to index the second or third value, it will panic, because the slice's mechanics do not allow it. If you s = append(s, 1), you will find that the slice really was created with zero values up to its length, and you will end up with [0,1]. At this point, you can append once more before the entire underlying array is filled; one more append after that forces it to allocate a new array with roughly double the capacity and copy all the values over, which is a rather expensive operation.
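Here is that walkthrough as a minimal, runnable sketch (the printed values follow the description above; the exact grown capacity is an implementation detail):

package main

import "fmt"

func main() {
	s := make([]int, 1, 3)
	fmt.Println(s, len(s), cap(s)) // [0] 1 3

	// s[1] or s[2] would panic here: they are beyond the length.

	s = append(s, 1)
	fmt.Println(s, len(s), cap(s)) // [0 1] 2 3

	s = append(s, 2)
	fmt.Println(s, len(s), cap(s)) // [0 1 2] 3 3

	// The capacity is now exhausted, so the next append allocates a
	// new, larger underlying array and copies the values over.
	s = append(s, 3)
	fmt.Println(s, len(s), cap(s)) // [0 1 2 3] 4 6 (new capacity may vary)
}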
Therefore, the short answer to your question is that preallocating the capacity can vastly improve the efficiency of your code, especially if the slice is either going to end up very large or contains complex structs (or both), since the zero value of a struct is effectively the zero values of every single one of its fields. This is not because it avoids allocating those values, which it has to do anyway, but because append would otherwise have to allocate a new array and copy all of those values over each time it needed to resize the underlying array.
Short playground example: https://play.golang.org/p/LGAYVlw-jr
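To make the struct case concrete, here is a minimal sketch; the record type and the count n are made up for illustration:

package main

import "fmt"

// record is a hypothetical struct whose zero value is the zero
// value of each of its fields.
type record struct {
	id   int
	name string
	tags []string
}

func main() {
	const n = 10000

	// Preallocating the capacity means the backing array of n records
	// is allocated once, instead of being reallocated and copied as
	// append outgrows successively larger backing arrays.
	records := make([]record, 0, n)
	for i := 0; i < n; i++ {
		records = append(records, record{id: i})
	}
	fmt.Println(len(records), cap(records)) // 10000 10000
}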
As others have already said, using the cap parameter can avoid unnecessary allocations. To give a sense of the performance difference, imagine you have a []float64 of random values and want a new slice that filters out values that are not above, say, 0.5.
Naive approach - no len or cap param
func filter(input []float64) []float64 {
	ret := make([]float64, 0)
	for _, el := range input {
		if el > .5 {
			ret = append(ret, el)
		}
	}
	return ret
}
Better approach - using cap param
func filterCap(input []float64) []float64 {
	ret := make([]float64, 0, len(input))
	for _, el := range input {
		if el > .5 {
			ret = append(ret, el)
		}
	}
	return ret
}
Benchmarks (n=10)
filter 131 ns/op 56 B/op 3 allocs/op
filterCap 56 ns/op 80 B/op 1 allocs/op
Using cap made the program 2x+ faster and reduced the number of allocations from 3 to 1. Now what happens at scale?
Benchmarks (n=1,000,000)
filter 9630341 ns/op 23004421 B/op 37 allocs/op
filterCap 6906778 ns/op 8003584 B/op 1 allocs/op
The speed difference is still significant (~1.4x) thanks to 36 fewer allocations and the copying that goes with them. However, the bigger difference is in memory allocation (~4x less).
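The exact numbers will vary by machine and Go version, but the shape of the comparison can be reproduced with benchmarks along these lines; this is a sketch, and the randomFloats helper and benchmark names are assumptions rather than the original code (run with go test -bench=. -benchmem):

// filter_test.go
package main

import (
	"math/rand"
	"testing"
)

// randomFloats is a hypothetical helper that builds the input slice.
func randomFloats(n int) []float64 {
	out := make([]float64, n)
	for i := range out {
		out[i] = rand.Float64()
	}
	return out
}

func BenchmarkFilter(b *testing.B) {
	input := randomFloats(10)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		filter(input)
	}
}

func BenchmarkFilterCap(b *testing.B) {
	input := randomFloats(10)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		filterCap(input)
	}
}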
Even better - calibrating the cap
You may have noticed in the first benchmark that cap makes the overall memory allocation worse (80 B vs 56 B). This is because you allocate 10 slots but only need, on average, 5 of them. This is why you don't want to set cap unnecessarily high. Given what you know about your program, you may be able to calibrate the capacity. In this case, we can estimate that our filtered slice will need 50% as many slots as the original slice.
func filterCalibratedCap(input []float64) []float64 {
	ret := make([]float64, 0, len(input)/2)
	for _, el := range input {
		if el > .5 {
			ret = append(ret, el)
		}
	}
	return ret
}
Unsurprisingly, this calibrated cap allocates 50% as much memory as its predecessor, so that's an ~8x improvement on the naive implementation at 1m elements.
Another option - using direct access instead of append
If you are looking to shave even more time off a program like this, initialize with the len parameter (and ignore the cap parameter), access the new slice directly instead of using append, and then throw away all the slots you don't need.
func filterLen(input []float64) []float64 {
	ret := make([]float64, len(input))
	var counter int
	for _, el := range input {
		if el > .5 {
			ret[counter] = el
			counter++
		}
	}
	return ret[:counter]
}
This is ~10% faster than filterCap at scale. However, in addition to being more complicated, this pattern does not provide the same safety as cap if you try to calibrate the memory requirement:
- With cap calibration, if you underestimate the total capacity required, then the program will automatically allocate more when it needs it.
- With this approach, if you underestimate the total len required, the program will fail (see the sketch below). In this example, if you initialize as ret := make([]float64, len(input)/2), and it turns out that len(output) > len(input)/2, then at some point the program will try to access a non-existent slot and panic.
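As a minimal sketch of that failure mode (the input here is contrived so that more than half the values pass the filter):

package main

import "fmt"

// filterLenBad is a hypothetical variant that underestimates the
// required length by only allocating len(input)/2 slots.
func filterLenBad(input []float64) []float64 {
	ret := make([]float64, len(input)/2)
	var counter int
	for _, el := range input {
		if el > .5 {
			ret[counter] = el // panics once counter reaches len(ret)
			counter++
		}
	}
	return ret[:counter]
}

func main() {
	// All four values are above 0.5, but only two slots were allocated,
	// so the third write is out of range.
	input := []float64{0.9, 0.8, 0.7, 0.6}
	fmt.Println(filterLenBad(input)) // panic: index out of range
}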