Search code examples
javaalgorithmgenetic-algorithm

My Java Genetic Algorithm implementation seems to center around a mid-tier fitness value instead of a high value


My genetic algorithm implements a class called gene that stores a 100 (trigit?) ternary string (originally had it in binary but tried ternary). I provide a method that evaluates the fitness of the gene (sum of all the numbers, so 222...222 would be optimal). I originally had fitness be equal to how well the gene matched up with a given sequence but changed it. I also provide a method to "mutate" the gene. With a certain probability, each entry in the gene can be mutated to one of 3 possible codons.

I start with an initial population. I find the fittest one, kill the others, replicate the fittest one ("perfect" mitosis, mutation occurs in the next step only) , and then mutate all the genes. I run this for many "epochs".

The maximum fitness is 200, but my algorithm seems to center around 90-110 as the highest fitness of each epoch. If I randomly generate starting genes, my highest fitness is somewhere around 120, but it swings back to 90-110 very quickly (within 1-2 epochs). Other GA implementations don't have this problem. What am I doing wrong? Java source code is pasted at the bottom.

    import java.util.Random;

    public class Gene {

int[] gene=new int[100];

public Gene() {

Random rand=new Random();
 for(int i=0; i<100; i++){

    gene[i]=rand.nextInt(3);
}
}

public int fitness(){

    int sum=0;
    for(int i=0; i<gene.length; i++){
        sum=sum+gene[i];
    }
    return sum;
}
public void mutate(){

    Random randP=new Random();
    Random randM=new Random();
    for(int i=0; i<gene.length; i++){
        if(randP.nextInt(1000)<1){gene[i]=randM.nextInt(3);}
    }


}
public int compareTo(Gene g){
    return this.fitness()-g.fitness();
}
public String toString(){
    return ""+this.fitness();
}
public void minFit(){
    for(int i=0; i<gene.length; i++){
        gene[i]=0;
    }
}
public void maxFit(){
    for(int i=0; i<gene.length; i++){
        gene[i]=2;
    }
}

    }



    public class Biome {



        public static void main(String[] args) {
        Gene[] species=new Gene[1000];
        for(int i=0; i<1000; i++){
    species[i]=new Gene();
    species[i].maxFit();
}
int epoch=0;
System.out.println("start");
while(epoch<1000){
    System.out.println(fittestGene(species));
    sort(species);
    for(int i=0; i<999; i++){species[i]=species[999];}
    for(int i=0; i<1000; i++){species[i].mutate();}
    epoch++;


}



}
public static Gene fittestGene(Gene[] g){
    int maxIndex=0;
    for(int i=0; i<g.length; i++){
        if(g[i].fitness()>g[maxIndex].fitness()){maxIndex=i;}
    }
    return g[maxIndex];
}
public static void swap(Gene[] g, int a, int b){
    Gene temp=g[a];
    g[a]=g[b];
    g[b]=temp;
}
public static void sort(Gene[] g){
    for(int i=0; i<g.length; i++){
        int minIndex=i;
        for(int j=i+1; j<g.length; j++){
         if(g[j].compareTo(g[minIndex])>0){swap(g, j, minIndex);}
        }
    }
}

    }

Solution

  • This is where the mistake seems to be:

    for(int i=0; i<999; i++){species[i]=species[999];}
    

    The array holds references to Gene objects. So after that line all the references in the array, point to the same Gene object.

    You'll only have 1 Gene and it's fitness value will consist of the sum of 100 integers evenly distributed between 0 and 2. So that's where the 90-110 value is coming from.

    To properly copy the gene, you need to define a copy constructor for Gene and use that:

    public class Gene {
        ...
        public Gene(Gene other) {
            this.gene = Arrays.copyOf(other.gene, other.gene.length);
        }
    }
    

    Then change the original line to:

    for(int i=0; i<999; i++){species[i]=new Gene(species[999]);}