In evolutionary-/genetic algorithms there are multiple recombination methods. Most of them suffer from a bias associated with the length of the chromosome (also called positional bias).
Uniform crossover and shuffle crossover can solve this problem. However, I don't understand the difference between the two, if in case of uniform crossover p(c)=0.5
Explanation
p(c)=0.5
every gene is a possible
crossover point. As both methods involve a complete randomization, I see no reason why the results should be different.
I wanted to know exactly therefore I wrote a little script to test both mechanisms. Here's some R-code, if you like to try it yourself
parent1 <- rep(0, 10000) # 10.000 genes in the chromosome - change at will
parent2 <- rep(1, length(parent1))
# Uniform crossover
offspring1_unif <- rep(-1, length(parent1)) # initialize offspring1_unif
offspring2_unif <- rep(-1, length(parent1)) # initialize offspring2_unif
for(i in 1:length(parent1)) {
if (runif(1) < 0.5) {
offspring1_unif[i] <- parent1[i]
offspring2_unif[i] <- parent2[i]
} else {
offspring1_unif[i] <- parent2[i]
offspring2_unif[i] <- parent1[i]
}
}
# Shuffle crossover
## Shuffle
shuffler <- seq(1, length(parent1))
shuffler <- sample(shuffler, length(parent1))
## perform the crossover
crossover_point <- sample(1:length(parent1), 1)
offspring1_shuffle <- rep(-1, length(parent1)) # initialize offspring1_shuffle
offspring2_shuffle <- rep(-1, length(parent1)) # initialize offspring2_shuffle
for(i in 1:length(shuffler)) {
if (i < crossover_point) {
offspring1_shuffle[shuffler[i]] <- parent1[shuffler[i]]
offspring2_shuffle[shuffler[i]] <- parent2[shuffler[i]]
} else {
offspring1_shuffle[shuffler[i]] <- parent2[shuffler[i]]
offspring2_shuffle[shuffler[i]] <- parent1[shuffler[i]]
}
}
mean(offspring1_unif) # 0.493
mean(offspring1_shuffle) # 0.3295
mean(offspring2_unif) # 0.507
mean(offspring2_shuffle) # 0.6705
sd(offspring1_unif) # 0.499976
sd(offspring1_shuffle) # 0.4700552
sd(offspring2_unif) # 0.499976
sd(offspring2_shuffle) # 0.4700552
The difference is what distribution the number of swaps has in both of the methods.
uniform cross-over:
we pick a gene to be swapped with a probability p independent from the other swaps, i.e. a Bernoulli experiment and we perform this Bernoulli experiment, for the whole chromosome, i.e. let's say for n genes, so the number of swaps will follow a Binomial distribution.
shuffle-cross-over:
we randomly shuffle the chromosome first(this is done mainly to avoid positional bias, i.e. to decouple the probability swaps from the position of the genes in the chromosome -- we do take care of this bias in the uniform case too. What's different from the uniform case, is that we pick only one cross-over point, and all elements on one side of this cross-over point will be swapped, so we swap any number of genes with an equal probability of 1/2. Therefore we avoid also the so called distributional bias, i.e. probability of number of swaps being different.