What is the point in setting a slice's capacity?
A slice is really just a fancy way to manage an underlying array. It automatically tracks the length and reallocates new space as needed.
As you append to a slice, the runtime grows its capacity (roughly doubling it) whenever the length would exceed the current capacity, and it has to copy all of the existing elements into the new array to do that. If you know how big the slice will be before you start, you can avoid those copy operations and memory allocations by grabbing all the space up front.
When you make a slice providing a capacity, you set the initial capacity, not any kind of limit.
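For a concrete feel, here is a minimal sketch (the count of 10 is arbitrary) that prints how the capacity jumps as append outgrows it, and how a preallocated slice never has to grow:

package main

import "fmt"

func main() {
	// Without preallocation: append reallocates and copies whenever
	// the length would exceed the current capacity.
	grown := make([]int, 0)
	for i := 0; i < 10; i++ {
		grown = append(grown, i)
		fmt.Println(len(grown), cap(grown)) // watch the capacity jump
	}

	// With preallocation: a single allocation; append never has to
	// copy because the capacity is already there.
	prealloc := make([]int, 0, 10)
	for i := 0; i < 10; i++ {
		prealloc = append(prealloc, i)
	}
	fmt.Println(len(prealloc), cap(prealloc)) // 10 10
}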
See this blog post on slices for some interesting internal details.
A slice is a wonderful abstraction of a simple array. You get all sorts of nice features, but deep down at its core lies an array. Therefore, if/when you specify a capacity of 3, an array of length 3 is allocated in memory behind the scenes, and you can append up to that capacity without triggering a reallocation. The capacity argument is optional in the call to make, but note that a slice will always have a capacity whether or not you choose to specify one. If you specify a length (which always exists as well), the slice will be indexable up to that length. The rest of the capacity is hidden away behind the scenes so that append does not have to allocate an entirely new array every time it is used.
Here is an example to better explain the mechanics.

s := make([]int, 1, 3)

The underlying array will be allocated with 3 copies of the zero value of int (which is 0):

[0,0,0]

However, the length is set to 1, so the slice itself will only print [0], and if you try to index the second or third value, it will panic, because the slice's mechanics do not allow it. If you s = append(s, 1), you will find that the slice really was created with zero values up to its length, and you will end up with [0,1]. At this point, you can append once more before the entire underlying array is filled; one more append after that forces it to allocate a new array with roughly double the capacity and copy all the values over, which is a rather expensive operation.
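Here is that walkthrough as a minimal, runnable sketch (the printed values follow the description above; the exact grown capacity is an implementation detail):

package main

import "fmt"

func main() {
	s := make([]int, 1, 3)
	fmt.Println(s, len(s), cap(s)) // [0] 1 3

	// s[1] or s[2] would panic here: they are beyond the length.

	s = append(s, 1)
	fmt.Println(s, len(s), cap(s)) // [0 1] 2 3

	s = append(s, 2)
	fmt.Println(s, len(s), cap(s)) // [0 1 2] 3 3

	// The capacity is now exhausted, so the next append allocates a
	// new, larger underlying array and copies the values over.
	s = append(s, 3)
	fmt.Println(s, len(s), cap(s)) // [0 1 2 3] 4 6 (new capacity may vary)
}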
Therefore, the short answer to your question is that preallocating the capacity can vastly improve the efficiency of your code, especially if the slice is either going to end up very large or contains complex structs (or both), since the zero value of a struct is effectively the zero values of every single one of its fields. This is not because it avoids allocating those values, which it has to do anyway, but because append would otherwise have to allocate a new array and copy all of those values over each time it needed to resize the underlying array.
Short playground example: https://play.golang.org/p/LGAYVlw-jr
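To make the struct case concrete, here is a minimal sketch; the record type and the count n are made up for illustration:

package main

import "fmt"

// record is a hypothetical struct whose zero value is the zero
// value of each of its fields.
type record struct {
	id   int
	name string
	tags []string
}

func main() {
	const n = 10000

	// Preallocating the capacity means the backing array of n records
	// is allocated once, instead of being reallocated and copied as
	// append outgrows successively larger backing arrays.
	records := make([]record, 0, n)
	for i := 0; i < n; i++ {
		records = append(records, record{id: i})
	}
	fmt.Println(len(records), cap(records)) // 10000 10000
}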
As others have already said, using the cap parameter can avoid unnecessary allocations. To give a sense of the performance difference, imagine you have a []float64 of random values and want a new slice that filters out values that are not above, say, 0.5.
Naive approach - no len or cap param
func filter(input []float64) []float64 {
	ret := make([]float64, 0)
	for _, el := range input {
		if el > .5 {
			ret = append(ret, el)
		}
	}
	return ret
}
Better approach - using cap param
func filterCap(input []float64) []float64 {
	ret := make([]float64, 0, len(input))
	for _, el := range input {
		if el > .5 {
			ret = append(ret, el)
		}
	}
	return ret
}
Benchmarks (n=10)
filter 131 ns/op 56 B/op 3 allocs/op
filterCap 56 ns/op 80 B/op 1 allocs/op
Using cap made the program 2x+ faster and reduced the number of allocations from 3 to 1. Now what happens at scale?
Benchmarks (n=1,000,000)
filter 9630341 ns/op 23004421 B/op 37 allocs/op
filterCap 6906778 ns/op 8003584 B/op 1 allocs/op
The speed difference is still significant (~1.4x) thanks to 36 fewer allocations and the copying that goes with them. However, the bigger difference is in memory allocation (~4x less).
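The exact numbers will vary by machine and Go version, but the shape of the comparison can be reproduced with benchmarks along these lines; this is a sketch, and the randomFloats helper and benchmark names are assumptions rather than the original code (run with go test -bench=. -benchmem):

// filter_test.go
package main

import (
	"math/rand"
	"testing"
)

// randomFloats is a hypothetical helper that builds the input slice.
func randomFloats(n int) []float64 {
	out := make([]float64, n)
	for i := range out {
		out[i] = rand.Float64()
	}
	return out
}

func BenchmarkFilter(b *testing.B) {
	input := randomFloats(10)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		filter(input)
	}
}

func BenchmarkFilterCap(b *testing.B) {
	input := randomFloats(10)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		filterCap(input)
	}
}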
Even better - calibrating the cap
You may have noticed in the first benchmark that cap makes the overall memory allocation worse (80 B vs 56 B). This is because you allocate 10 slots but only need, on average, 5 of them. This is why you don't want to set cap unnecessarily high. Given what you know about your program, you may be able to calibrate the capacity. In this case, we can estimate that our filtered slice will need 50% as many slots as the original slice.
func filterCalibratedCap(input []float64) []float64 {
	ret := make([]float64, 0, len(input)/2)
	for _, el := range input {
		if el > .5 {
			ret = append(ret, el)
		}
	}
	return ret
}
Unsurprisingly, this calibrated cap allocates 50% as much memory as its predecessor, so that's an ~8x improvement on the naive implementation at 1m elements.
Another option - using direct access instead of append
If you are looking to shave even more time off a program like this, initialize with the len parameter (and ignore the cap parameter), access the new slice directly instead of using append, and then throw away all the slots you don't need.
func filterLen(input []float64) []float64 {
	ret := make([]float64, len(input))
	var counter int
	for _, el := range input {
		if el > .5 {
			ret[counter] = el
			counter++
		}
	}
	return ret[:counter]
}
This is ~10% faster than filterCap at scale. However, in addition to being more complicated, this pattern does not provide the same safety as cap if you try to calibrate the memory requirement:
- With cap calibration, if you underestimate the total capacity required, then the program will automatically allocate more when it needs it.
- With this approach, if you underestimate the total len required, the program will fail (see the sketch below). In this example, if you initialize as ret := make([]float64, len(input)/2), and it turns out that len(output) > len(input)/2, then at some point the program will try to access a non-existent slot and panic.
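As a minimal sketch of that failure mode (the input here is contrived so that more than half the values pass the filter):

package main

import "fmt"

// filterLenBad is a hypothetical variant that underestimates the
// required length by only allocating len(input)/2 slots.
func filterLenBad(input []float64) []float64 {
	ret := make([]float64, len(input)/2)
	var counter int
	for _, el := range input {
		if el > .5 {
			ret[counter] = el // panics once counter reaches len(ret)
			counter++
		}
	}
	return ret[:counter]
}

func main() {
	// All four values are above 0.5, but only two slots were allocated,
	// so the third write is out of range.
	input := []float64{0.9, 0.8, 0.7, 0.6}
	fmt.Println(filterLenBad(input)) // panic: index out of range
}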