My genetic algorithm implements a class called gene that stores a 100 (trigit?) ternary string (originally had it in binary but tried ternary). I provide a method that evaluates the fitness of the gene (sum of all the numbers, so 222...222 would be optimal). I originally had fitness be equal to how well the gene matched up with a given sequence but changed it. I also provide a method to "mutate" the gene. With a certain probability, each entry in the gene can be mutated to one of 3 possible codons.
I start with an initial population. I find the fittest one, kill the others, replicate the fittest one ("perfect" mitosis, mutation occurs in the next step only) , and then mutate all the genes. I run this for many "epochs".
The maximum fitness is 200, but my algorithm seems to center around 90-110 as the highest fitness of each epoch. If I randomly generate starting genes, my highest fitness is somewhere around 120, but it swings back to 90-110 very quickly (within 1-2 epochs). Other GA implementations don't have this problem. What am I doing wrong? Java source code is pasted at the bottom.
import java.util.Random;
public class Gene {
int[] gene=new int[100];
public Gene() {
Random rand=new Random();
for(int i=0; i<100; i++){
gene[i]=rand.nextInt(3);
}
}
public int fitness(){
int sum=0;
for(int i=0; i<gene.length; i++){
sum=sum+gene[i];
}
return sum;
}
public void mutate(){
Random randP=new Random();
Random randM=new Random();
for(int i=0; i<gene.length; i++){
if(randP.nextInt(1000)<1){gene[i]=randM.nextInt(3);}
}
}
public int compareTo(Gene g){
return this.fitness()-g.fitness();
}
public String toString(){
return ""+this.fitness();
}
public void minFit(){
for(int i=0; i<gene.length; i++){
gene[i]=0;
}
}
public void maxFit(){
for(int i=0; i<gene.length; i++){
gene[i]=2;
}
}
}
public class Biome {
public static void main(String[] args) {
Gene[] species=new Gene[1000];
for(int i=0; i<1000; i++){
species[i]=new Gene();
species[i].maxFit();
}
int epoch=0;
System.out.println("start");
while(epoch<1000){
System.out.println(fittestGene(species));
sort(species);
for(int i=0; i<999; i++){species[i]=species[999];}
for(int i=0; i<1000; i++){species[i].mutate();}
epoch++;
}
}
public static Gene fittestGene(Gene[] g){
int maxIndex=0;
for(int i=0; i<g.length; i++){
if(g[i].fitness()>g[maxIndex].fitness()){maxIndex=i;}
}
return g[maxIndex];
}
public static void swap(Gene[] g, int a, int b){
Gene temp=g[a];
g[a]=g[b];
g[b]=temp;
}
public static void sort(Gene[] g){
for(int i=0; i<g.length; i++){
int minIndex=i;
for(int j=i+1; j<g.length; j++){
if(g[j].compareTo(g[minIndex])>0){swap(g, j, minIndex);}
}
}
}
}
This is where the mistake seems to be:
for(int i=0; i<999; i++){species[i]=species[999];}
The array holds references to Gene
objects. So after that line all the references in the array, point to the same Gene
object.
You'll only have 1 Gene
and it's fitness value will consist of the sum of 100 integers evenly distributed between 0 and 2. So that's where the 90-110 value is coming from.
To properly copy the gene, you need to define a copy constructor for Gene
and use that:
public class Gene {
...
public Gene(Gene other) {
this.gene = Arrays.copyOf(other.gene, other.gene.length);
}
}
Then change the original line to:
for(int i=0; i<999; i++){species[i]=new Gene(species[999]);}