Parallel For-Loop
One common way to add controlled parallelism to a loop like this is to spawn a number of worker goroutines that will read tasks from a channel. The runtime.NumCPU
function may help in deciding how many workers it makes sense to spawn (make sure you set GOMAXPROCS
appropriately to take advantage of those CPUs though). You then simply write the jobs to the channel and they will be handled by the workers.
In this case where the job is to initialise elements of the population slice, so using a channel of *Individual
pointers might make sense. Something like this:
ch := make(chan *Individual)
for i := 0; i < nworkers; i++ {
go initIndividuals(individualSize, ch)
}
population := make([]Individual, populationSize)
for i := 0; i < len(population); i++ {
ch <- &population[i]
}
close(ch)
The worker goroutine would look something like this:
func initIndividuals(size int, ch <-chan *Individual) {
for individual := range ch {
// Or alternatively inline the createIndividual() code here if it is the only call
*individual = createIndividual(size)
}
}
Since tasks are not portioned out ahead of time, it doesn't matter if createIndividual
takes a variable amount of time: each worker will only take on a new task when the last is complete, and will exit when there are no tasks left (since the channel is closed at that point).
But how do we know when the job has completed? The sync.WaitGroup
type can help here. The code to spawn the worker goroutines can be modified like so:
ch := make(chan *Individual)
var wg sync.WaitGroup
wg.Add(nworkers)
for i := 0; i < nworkers; i++ {
go initIndividuals(individualSize, ch, &wg)
}
The initIndividuals
function is also modified to take the additional parameter, and add defer wg.Done()
as the first statement. Now a call to wg.Wait()
will block until all the worker goroutines have completed. You can then return the fully constructed population
slice.
So basically the goroutine should not return a value but push it down a channel. If you want to wait for all goroutines to finish you can just count to the number of goroutines, or use a WaitGroup. In this example it's an overkill because the size is known, but it's good practice anyway. Here's a modified example:
package main
import (
"math/rand"
"sync"
)
type Individual struct {
gene []bool
fitness int
}
func createPopulation(populationSize int, individualSize int) []Individual {
// we create a slice with a capacity of populationSize but 0 size
// so we'll avoid extra unneeded allocations
population := make([]Individual, 0, populationSize)
// we create a buffered channel so writing to it won't block while we wait for the waitgroup to finish
ch := make(chan Individual, populationSize)
// we create a waitgroup - basically block until N tasks say they are done
wg := sync.WaitGroup{}
for i := 0; i < populationSize; i++ {
//we add 1 to the wait group - each worker will decrease it back
wg.Add(1)
//now we spawn a goroutine
go createIndividual(individualSize, ch, &wg)
}
// now we wait for everyone to finish - again, not a must.
// you can just receive from the channel N times, and use a timeout or something for safety
wg.Wait()
// we need to close the channel or the following loop will get stuck
close(ch)
// we iterate over the closed channel and receive all data from it
for individual := range ch {
population = append(population, individual)
}
return population
}
func createIndividual(size int, ch chan Individual, wg *sync.WaitGroup) {
var individual = Individual{make([]bool, size), 0}
for i := 0; i < len(individual.gene); i++ {
if rand.Intn(2)%2 == 1 {
individual.gene[i] = true
} else {
individual.gene[i] = false
}
}
// push the population object down the channel
ch <- individual
// let the wait group know we finished
wg.Done()
}
For your specific problem you don't need to use channels at all.
However unless your createIndividual
spends some time doing calculations, the context switches between the goroutines is always gonna be much slower when run in parallel.
type Individual struct {
gene []bool
fitness int
}
func createPopulation(populationSize int, individualSize int) (population []*Individual) {
var wg sync.WaitGroup
population = make([]*Individual, populationSize)
wg.Add(populationSize)
for i := 0; i < populationSize; i++ {
go func(i int) {
population[i] = createIndividual(individualSize)
wg.Done()
}(i)
}
wg.Wait()
return
}
func createIndividual(size int) *Individual {
individual := &Individual{make([]bool, size), 0}
for i := 0; i < size; i++ {
individual.gene[i] = rand.Intn(2)%2 == 1
}
return individual
}
func main() {
numcpu := flag.Int("cpu", runtime.NumCPU(), "")
flag.Parse()
runtime.GOMAXPROCS(*numcpu)
pop := createPopulation(1e2, 21e3)
fmt.Println(len(pop))
}
Output:
┌─ oneofone@Oa [/tmp]
└──➜ go build blah.go; xtime ./blah -cpu 1
100
0.13u 0.00s 0.13r 4556kB ./blah -cpu 1
┌─ oneofone@Oa [/tmp]
└──➜ go build blah.go; xtime ./blah -cpu 4
100
2.10u 0.12s 0.60r 4724kB ./blah -cpu 4