I've implemented a genetic algorithm which uses a reproduction method based on the one described in Regularized Evolution for Image Classifier Architecture Search.
Minimal pseudo-code describing the reproduction method:
total_steps = 1000
step = 0
1. while step < total_steps:
1a. winner, loser = tournament_selection(population)
1b. offspring = mutate(winner)
1c. replace loser with offspring
1d. step++
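For concreteness, the sequential loop above can be sketched as runnable Python on a toy one-max problem (the genome representation, fitness function, and all helper names here are my own placeholders, not taken from the paper's released code):

```python
import random

# Toy problem: maximize the number of 1-bits in a fixed-length bitstring.
GENOME_LEN = 20

def compute_fitness(genome):
    return sum(genome)

def random_genome():
    return [random.randint(0, 1) for _ in range(GENOME_LEN)]

def mutate(genome):
    # Flip a single random bit of a copy of the parent.
    child = list(genome)
    i = random.randrange(len(child))
    child[i] ^= 1
    return child

def tournament_selection(population, k=5):
    # Sample k individuals; the fittest is the winner, the least fit the loser.
    sample = random.sample(population, k)
    sample.sort(key=compute_fitness)
    return sample[-1], sample[0]

def evolve(pop_size=50, total_steps=1000):
    population = [random_genome() for _ in range(pop_size)]
    for _ in range(total_steps):
        winner, loser = tournament_selection(population)   # 1a
        offspring = mutate(winner)                         # 1b
        population.remove(loser)                           # 1c: replace loser
        population.append(offspring)                       #     with offspring
    return max(population, key=compute_fitness)

best = evolve()
```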
The authors of [1] mention that this loop is parallelized by distributing it over multiple workers. They also state that the full implementation is given in the released source code, but the linked code does not contain the relevant parallelization part.
I fail to understand how this specific reproduction method can be parallelized: the parent selection step (1a) depends on the current state of the population.
Some notes:
You can imagine a parallelization scheme as follows (with n workers):
num_steps = 1000
num_cycles = num_steps / n
cycle = 0
while cycle < num_cycles:
    children = []
    for i in range(n):
        # conceptually dispatched to worker i
        children.append(start_worker(current_population))
    for i in range(n):
        current_population.remove(oldest)
    current_population.extend(children)
    cycle += 1

start_worker(population):
    winner, _ = tournament_selection(population)
    offspring = mutate(winner)
    offspring.fitness = compute_fitness(offspring)
    return offspring
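A runnable sketch of this cycle-based scheme, again on a toy one-max problem and using a thread pool to run the n tournaments concurrently (all names are illustrative; in practice you would likely use a process pool or distributed workers, since compute_fitness() is the expensive part):

```python
import random
from concurrent.futures import ThreadPoolExecutor

GENOME_LEN = 20

def compute_fitness(genome):
    return sum(genome)

def mutate(genome):
    # Flip a single random bit of a copy of the parent.
    child = list(genome)
    i = random.randrange(len(child))
    child[i] ^= 1
    return child

def tournament_selection(population, k=5):
    # Sample k individuals and return the fittest.
    return max(random.sample(population, k), key=compute_fitness)

def start_worker(population):
    # Each worker runs one tournament on a snapshot of the population
    # and returns a mutated copy of the winner.
    winner = tournament_selection(population)
    return mutate(winner)

def parallel_evolve(pop_size=50, num_steps=1000, n_workers=4):
    # Population list is kept in insertion (age) order: index 0 is oldest.
    population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
                  for _ in range(pop_size)]
    num_cycles = num_steps // n_workers
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for _ in range(num_cycles):
            # n tournaments run concurrently against the same snapshot.
            children = list(pool.map(start_worker, [population] * n_workers))
            # Remove the n oldest individuals, append the n children.
            population = population[n_workers:] + children
    return max(population, key=compute_fitness)
```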
This way, each cycle runs n tournaments and generates n children. The parallelization should be efficient, since most of the computational cost typically resides in the compute_fitness() method.
Using this method, you would reduce the selection pressure, since multiple children are generated from the same snapshot of the population. This effect could be balanced by increasing the tournament size.
It could be interesting to compare the optimization performance on a toy problem with and without this parallelization scheme.