How to efficiently concatenate strings in Go
New Way:
From Go 1.10 there is a strings.Builder type; please take a look at this answer for more detail.
Old Way:
Use the bytes package. It has a Buffer type which implements io.Writer.
package main

import (
	"bytes"
	"fmt"
)

func main() {
	var buffer bytes.Buffer
	for i := 0; i < 1000; i++ {
		buffer.WriteString("a")
	}
	fmt.Println(buffer.String())
}
This does it in O(n) time overall (amortized), instead of the O(n²) copying that repeated use of the + operator incurs.
In Go 1.10+ there is strings.Builder. From its documentation:
A Builder is used to efficiently build a string using Write methods. It minimizes memory copying. The zero value is ready to use.
Example
It's almost the same as with bytes.Buffer.
package main

import (
	"fmt"
	"strings"
)

func main() {
	// ZERO-VALUE:
	//
	// It's ready to use from the get-go.
	// You don't need to initialize it.
	var sb strings.Builder
	for i := 0; i < 1000; i++ {
		sb.WriteString("a")
	}
	fmt.Println(sb.String())
}
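When the final size can be computed up front, Grow can reserve the capacity once so the builder does not have to reallocate while writing. The following is a minimal sketch of that idea; the helper name joinAll is hypothetical and not part of the standard library.

```go
package main

import (
	"fmt"
	"strings"
)

// joinAll is a hypothetical helper: it concatenates parts with a
// strings.Builder, reserving the total length up front via Grow.
func joinAll(parts []string) string {
	total := 0
	for _, p := range parts {
		total += len(p)
	}

	var sb strings.Builder
	sb.Grow(total) // one allocation sized for the whole result
	for _, p := range parts {
		sb.WriteString(p)
	}
	return sb.String()
}

func main() {
	fmt.Println(joinAll([]string{"go", "lang"}))
}
```

Whether Grow pays off depends on the workload; for a handful of short strings the difference is likely negligible.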
Supported Interfaces
The methods of strings.Builder are implemented with the existing interfaces in mind, so that you can switch to the new Builder type easily in your code.
- Grow(int) -> bytes.Buffer#Grow
- Len() int -> bytes.Buffer#Len
- Reset() -> bytes.Buffer#Reset
- String() string -> fmt.Stringer
- Write([]byte) (int, error) -> io.Writer
- WriteByte(byte) error -> io.ByteWriter
- WriteRune(rune) (int, error) -> bufio.Writer#WriteRune, bytes.Buffer#WriteRune
- WriteString(string) (int, error) -> io.StringWriter (exported since Go 1.12)
Differences from bytes.Buffer
- It can only grow or reset.
- It has a built-in copyCheck mechanism that prevents accidentally copying it:
  func (b *Builder) copyCheck() { ... }
- In bytes.Buffer, one can access the underlying bytes with (*Buffer).Bytes(), which lets callers modify them. strings.Builder prevents this problem.
  - Sometimes this is not a problem, though, and is desired instead.
  - For example: for the peeking behavior when the bytes are passed to an io.Reader, etc.
- bytes.Buffer.Reset() rewinds and reuses the underlying buffer, whereas strings.Builder.Reset() does not; it detaches the buffer.
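To illustrate the point about the underlying bytes, here is a short sketch showing that the slice returned by (*Buffer).Bytes() aliases the buffer's contents, so a write through it changes what String() returns. The function name mutateThroughBytes is made up for this example.

```go
package main

import (
	"bytes"
	"fmt"
)

// mutateThroughBytes is an illustrative helper: it modifies the buffer's
// contents through the slice returned by Bytes and returns the result.
func mutateThroughBytes() string {
	var buf bytes.Buffer
	buf.WriteString("hello")

	raw := buf.Bytes() // aliases the buffer's unread contents
	raw[0] = 'j'       // changes the buffer itself

	return buf.String()
}

func main() {
	fmt.Println(mutateThroughBytes()) // prints "jello"
}
```

strings.Builder has no equivalent accessor, so the bytes behind a string it has handed out cannot be changed this way.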
Note
- Do not copy a strings.Builder value, as it caches the underlying data.
- If you want to share a strings.Builder value, use a pointer to it.
Check out its source code for more details, here.
If you know the total length of the string in advance, so that you can preallocate it, then the most efficient way to concatenate strings may be the builtin function copy. If you don't know the total length beforehand, do not use copy; read the other answers instead.
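As a rough sketch of this approach outside a benchmark: compute the total length, allocate a byte slice of exactly that size, and copy each piece into place. The helper name concatCopy is hypothetical.

```go
package main

import "fmt"

// concatCopy is a hypothetical helper: it concatenates parts by
// preallocating the exact total length and filling it with copy.
func concatCopy(parts []string) string {
	total := 0
	for _, p := range parts {
		total += len(p)
	}

	buf := make([]byte, total)
	offset := 0
	for _, p := range parts {
		// copy accepts a string source when the destination is []byte
		// and returns the number of bytes copied.
		offset += copy(buf[offset:], p)
	}
	return string(buf)
}

func main() {
	fmt.Println(concatCopy([]string{"foo", "bar", "baz"}))
}
```

Note that the final string(buf) conversion copies the bytes once more, which the benchmark below performs only in its verification step, outside the timed loop.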
In my tests, that approach is ~3x faster than using bytes.Buffer and much, much faster (~12,000x) than using the + operator. It also uses less memory.
I've created a test case to prove this and here are the results:
BenchmarkConcat      1000000    64497 ns/op    502018 B/op    0 allocs/op
BenchmarkBuffer    100000000     15.5 ns/op         2 B/op    0 allocs/op
BenchmarkCopy      500000000     5.39 ns/op         0 B/op    0 allocs/op
Below is the code used for testing:
package main

import (
	"bytes"
	"strings"
	"testing"
)

// BenchmarkConcat appends one byte per iteration with the + operator.
func BenchmarkConcat(b *testing.B) {
	var str string
	for n := 0; n < b.N; n++ {
		str += "x"
	}
	b.StopTimer()

	if s := strings.Repeat("x", b.N); str != s {
		b.Errorf("unexpected result; got=%s, want=%s", str, s)
	}
}

// BenchmarkBuffer appends one byte per iteration to a bytes.Buffer.
func BenchmarkBuffer(b *testing.B) {
	var buffer bytes.Buffer
	for n := 0; n < b.N; n++ {
		buffer.WriteString("x")
	}
	b.StopTimer()

	if s := strings.Repeat("x", b.N); buffer.String() != s {
		b.Errorf("unexpected result; got=%s, want=%s", buffer.String(), s)
	}
}

// BenchmarkCopy preallocates b.N bytes and fills them with copy.
func BenchmarkCopy(b *testing.B) {
	bs := make([]byte, b.N)
	bl := 0

	b.ResetTimer()
	for n := 0; n < b.N; n++ {
		bl += copy(bs[bl:], "x")
	}
	b.StopTimer()

	if s := strings.Repeat("x", b.N); string(bs) != s {
		b.Errorf("unexpected result; got=%s, want=%s", string(bs), s)
	}
}

// BenchmarkStringBuilder appends one byte per iteration to a
// strings.Builder (requires Go 1.10+).
func BenchmarkStringBuilder(b *testing.B) {
	var strBuilder strings.Builder

	b.ResetTimer()
	for n := 0; n < b.N; n++ {
		strBuilder.WriteString("x")
	}
	b.StopTimer()

	if s := strings.Repeat("x", b.N); strBuilder.String() != s {
		b.Errorf("unexpected result; got=%s, want=%s", strBuilder.String(), s)
	}
}