c genetic-algorithm keyboard-layout roulette-wheel-selection

Efficient Implementation of Fitness-Proportionate "Roulette" Selection

I am currently writing a keyboard layout optimization algorithm in C (such as the one designed by Peter Klausler) and I want to implement a fitness-proportionate selection as described here (PDF Link):

With roulette selection you select members of the population based on a roullete wheel model. Make a pie chart, where the area of a member’s slice to the whole circle is the ratio of the members ﬁtness to the total population. As you can see if a point on the circumfrence of the circle is picked at random those population members with higher ﬁtness will have a higher probability of being picked. This ensures natural selection takes place.

The problem is, I don't see how to implement it efficiently. I've thought of two methods: one is unreliable, and the other is slow.

First, the slow one:

For a keyboard pool of length N, create an array of length N where each element of the array actually contains two elements, a minimum and a maximum value. Each keyboard has a corresponding minimum and maximum value, and the range is based on the fitness of the keyboard. For example, if keyboard zero has a fitness of 10, keyboard one has a fitness of 20, and keyboard two has a fitness of 25, it would look like this: Code:

array[0][0] = 0; // minimum
array[0][1] = 9; // maximum
array[1][0] = 10;
array[1][1] = 30;
array[2][0] = 31;
array[2][1] = 55;

(In this case a lower fitness is better, since it means less effort is required.)

Then generate a random number. For whichever range that number falls into, the corresponding keyboard is "killed" and replaced with the offspring of a different keyboard. Repeat this as many times as desired.

The problem with this is that it is very slow. It takes O(N^2) operations to finish.

Next the fast one:

First figure out what the lowest and highest fitnesses for the keyboards are. Then generate a random number between (lowest fitness) and (highest fitness) and kill all keyboards with a fitness higher than the generated number. This is efficient, but it's not guaranteed to only kill half the keyboards. It also has somewhat different mechanics from a "roulette wheel" selection, so it may not even be applicable.

So the question is, what is an efficient implementation?

There is a somewhat efficient algorithm on page 36 of this book (Link), but the problem is, it's only efficient if you do the roulette selection only one or a few times. Is there any efficient way to do many roulette selections in parallel?

Solution

For one thing, it sounds like you are talking about unfitness scores if you want to "kill off" your selection (which is likely to be a keyboard with high score).

I see no need to maintain two arrays. I think the simplest way is to maintain a single array of scores, which you then iterate through to make a choice:

/* These will need to be populated at the outset */
int scores[100];
int totalScore;

for (gen = 0; gen < nGenerations; ++gen) {
    /* Perform a selection and update */
    int r = rand() % totalScore;        /* HACK: using % introduces bias */
    int t = 0;
    for (i = 0; i < 100; ++i) {
        t += scores[i];
        if (r < t) {
            /* Bingo! */
            totalScore -= scores[i];
            keyboards[i] = generate_new_keyboard_somehow();
            scores[i] = score_keyboard(keyboards[i]);
            totalScore += scores[i];    /* Now totalScore is correct again */
        }
    }
}

Each selection/update takes O(n) time for n keyboards.