python machine-learning genetic-algorithm

Trying to make a genetic algorithm


I've been studying genetic algorithms lately and decided to write my own in Python. The work I've done so far is below.

These are some helper functions that I wrote for use in my driver function:

Note: I believe these functions are fine and can be used as they are.

# Generates Random Population
def generate_random_population(npop, limits=list(zip(np.zeros(5),np.ones(5))), ngenes=5):
  
  def new_ind():
    return [random.uniform(limits[i][0], limits[i][1]) for i in range(ngenes)]

  return np.array([new_ind() for n in range(npop)])


# Function to evaluate all individuals and give them a score
# fopt1 only has a minimum (unimodal) at x = (0,0, ..., 0) in which fopt1 = 0.
def fopt1(ind):
  
    x0 = [ind[len(ind)-1]]
    xlast = [ind[0]]
    working_array = np.concatenate((x0,ind,xlast))
    res = 0

    for j in range(1, len(ind)+1):
        res += (2*working_array[j-1] + (working_array[j]**2)*working_array[j+1] - working_array[j+1])**2

    return res

# Receives a population of individuals and an evaluation function (usually called a fitness function) and returns an ordered list of tuples
def eval_pop(pop, f):
  # Returns a list of tuples in descending order of the goodness of f. Shape of tuples are (individual, score), e.g., ([2.3,0.004,1,8.2,6], 0.361).
  
    list = []
    
    for i in pop:
        j = (pop, f(pop))
        list.append(j)
    

    return list


# To produce the next generation of individuals, first select the pairs that will interbreed to have offspring
def couples_selection(ordered_pop, n_elitism):
    if len(ordered_pop) < 10:
        print("Error: population's size should be higher than 9")
        return
  
    len_a = int(len(ordered_pop)/10)
    len_b = len_a * 3
    len_c = len_a * 4

    a = np.ones(len_a) * 0.5 / len_a
    b = np.ones(len_b) * 0.3 / len_b
    c = np.ones(len_c) * 0.15 / len_c
    d = np.ones(len(ordered_pop) - len_a*8)
    d = d * 0.05 / len(d)

    prob = np.concatenate((a,b,c,d))
    indices = range(len(ordered_pop))
    selected_indices = [choice(indices, 2, p=prob) for i in range(len(ordered_pop) - n_elitism)]
    couples = [[ordered_pop[i1], ordered_pop[i2]] for [i1,i2] in selected_indices]
    return np.array(couples)

def mutate(ind, limits):
    # print("Mutating individual ", ind)
    factor = 1 + (0.2 * choice([-1,1], 1))
    gene_index = choice(range(len(ind)), 1)[0]
    mutated_val = ind.item(gene_index) * factor

    if mutated_val < limits[gene_index][0]:
        mutated_val = limits[gene_index][0]
    elif mutated_val > limits[gene_index][1]:
        mutated_val = limits[gene_index][1]

    ind[gene_index] = mutated_val

    return

def crossover(couple):
    ancestor1 = couple[0]
    ancestor2 = couple[1]

    c1, c2 = ancestor1.copy(), ancestor2.copy()
    
    pt = randint(1, len(ancestor1)-2)
    # perform crossover
    c1 = ancestor1[:pt] + ancestor2[pt:]
    c2 = ancestor2[:pt] + ancestor1[pt:]

    return [c1, c2]
  

def get_offspring(couples, mutp, limits):

    children = [crossover(couple) for couple in couples]
    mutation_roulette = [choice([True, False], 1, p=[mutp, 1-mutp]) for _ in children]
    children_roulette = list(zip(children, mutation_roulette))

    for child in children_roulette:
        if child[1][0]:
            mutate(child[0], limits) 
            # print("Mutated: ",child[0])

    return np.array([child[0] for child in children_roulette])

Problem:

When I run the driver function below with this call:

runGA(100, 5, list(zip(np.ones(5)*-2,np.ones(5)*2)), fopt13, 4, 0.4, 25)

def runGA(npop, ngenes, limits, fitness, nelitism, mutp, ngenerations):
    pop = generate_random_population(npop, limits, ngenes)
    sorted_pop_with_score = eval_pop(pop, fitness)
    new_pop = np.array([p[0] for p in sorted_pop_with_score])

    for g in range(ngenerations):

    # TO DO: Complete your GA!
    
        couples = couples_selection(new_pop, nelitism)
        popp = get_offspring(couples,mutp, limits)
        eval_pop_result = eval_pop(pop,fitness)
    
    
    # END OF TO DO
    
        print("Winner after generation", g, ":", eval_pop_result[0])

    print("Absolute winner:")
    return sorted_pop_with_score[0]

I'm getting this error in the crossover function:

ValueError                                Traceback (most recent call last)
<ipython-input-20-375adbb7b149> in <module>
----> 1 runGA(100, 5, list(zip(np.ones(5)*-2,np.ones(5)*2)), fopt13, 4, 0.4, 25)

<ipython-input-12-6619b9c7d476> in runGA(npop, ngenes, limits, fitness, nelitism, mutp, ngenerations)
      8     # TO DO: Complete your GA!
      9         couples = couples_selection(new_pop, nelitism)
---> 10         popp = get_offspring(couples,mutp, limits)
     11         eval_pop_result = eval_pop(pop,fitness)
     12 

<ipython-input-16-5e8ace236573> in get_offspring(couples, mutp, limits)
     34 def get_offspring(couples, mutp, limits):
     35 
---> 36     children = [crossover(couple) for couple in couples]
     37     mutation_roulette = [choice([True, False], 1, p=[mutp, 1-mutp]) for _ in children]
     38     children_roulette = list(zip(children, mutation_roulette))

<ipython-input-16-5e8ace236573> in <listcomp>(.0)
     34 def get_offspring(couples, mutp, limits):
     35 
---> 36     children = [crossover(couple) for couple in couples]
     37     mutation_roulette = [choice([True, False], 1, p=[mutp, 1-mutp]) for _ in children]
     38     children_roulette = list(zip(children, mutation_roulette))

<ipython-input-16-5e8ace236573> in crossover(couple)
     25     print(len(ancestor1))
     26     print(len(ancestor2))
---> 27     c1 = ancestor1[:pt] + ancestor2[pt:]
     28     c2 = ancestor2[:pt] + ancestor1[pt:]
     29 

ValueError: operands could not be broadcast together with shapes (39,5) (61,5) 

Note:

I also tried np.concatenate, but it gives the following error at the same step: TypeError: only integer scalar arrays can be converted to a scalar index

Any help would be highly appreciated!


Solution

  • My comments turned into an answer:

    For each generation, you need to run couples_selection() on the current population, run get_offspring() on the couples it returns, and then run eval_pop() on the population returned by get_offspring(). The winner of that generation is the individual with the best score in the list returned by eval_pop(). eval_pop() is documented as returning its list in descending order of goodness, but it never sorts; if it did, index [0] of the returned list would be the individual with the best score, i.e. the winner. Since fopt1 has its optimum at a minimum of 0, the best score is presumably the lowest one.
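
    The shapes in the traceback, (39,5) and (61,5), add up to the population size of 100, which suggests crossover() was handed whole populations rather than single individuals. That is consistent with eval_pop() storing (pop, f(pop)) on every iteration instead of (i, f(i)). The following is a minimal sketch, not the asker's final code: it scores each individual, sorts best-first under the assumption that a lower score is better, and joins the crossover slices with np.concatenate, because + on NumPy arrays adds element-wise instead of concatenating. The function sphere and the rng argument are hypothetical names introduced only for this illustration.

```python
import numpy as np

def eval_pop(pop, f):
    # Score each individual (not the whole population) and sort best-first.
    # Assumption: lower score is better, as for a function minimized at 0.
    scored = [(ind, f(ind)) for ind in pop]
    return sorted(scored, key=lambda pair: pair[1])

def crossover(couple, rng):
    # Single-point crossover. The slices are joined with np.concatenate,
    # which takes ONE sequence of arrays as its first argument.
    a, b = couple
    pt = int(rng.integers(1, len(a)))  # cut point in 1 .. len(a)-1
    c1 = np.concatenate((a[:pt], b[pt:]))
    c2 = np.concatenate((b[:pt], a[pt:]))
    return c1, c2

def sphere(ind):
    # Hypothetical stand-in fitness: sum of squares, minimum 0 at the origin.
    return float(np.sum(np.asarray(ind) ** 2))

rng = np.random.default_rng(0)
pop = rng.uniform(-2, 2, size=(10, 5))      # 10 individuals, 5 genes each
ranked = eval_pop(pop, sphere)              # list of (individual, score)
c1, c2 = crossover((ranked[0][0], ranked[1][0]), rng)
print(ranked[0][1], c1.shape, c2.shape)
```

    The TypeError mentioned in the question typically appears when the two slices are passed to np.concatenate as separate arguments, so that the second one is interpreted as the axis; wrapping them in a single tuple, as above, avoids that.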

    Also, if you return sorted_pop_with_score[0] as the absolute winner, you need to append each generation's winner to a list, run eval_pop() on that list once all generations are complete, and assign the result of that final eval_pop() to sorted_pop_with_score.
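
    One way the generation loop and the winner bookkeeping could fit together is sketched below. It is a self-contained illustration under stated assumptions, not a drop-in replacement: next_generation() is a simplified, hypothetical stand-in for couples_selection() plus get_offspring(), sphere is a hypothetical fitness function rather than the question's fopt1, and a lower score is assumed to be better.

```python
import numpy as np

def sphere(ind):
    # Hypothetical stand-in fitness (lower is better); not the question's fopt1.
    return float(np.sum(ind ** 2))

def rank(pop, f):
    # Best-first list of (individual, score) tuples.
    return sorted(((ind, f(ind)) for ind in pop), key=lambda t: t[1])

def next_generation(ranked, n_elite, mutp, limits, rng):
    # Simplified stand-in for couples_selection() + get_offspring():
    # keep the elites unchanged, then breed the rest from the better half.
    parents = np.array([ind for ind, _ in ranked])
    lo, hi = limits[:, 0], limits[:, 1]
    children = [p.copy() for p in parents[:n_elite]]
    half = max(2, len(parents) // 2)
    while len(children) < len(parents):
        i, j = rng.choice(half, size=2, replace=False)
        pt = int(rng.integers(1, parents.shape[1]))
        child = np.concatenate((parents[i][:pt], parents[j][pt:]))
        if rng.random() < mutp:
            # Scale one random gene by 0.8 or 1.2, as mutate() does.
            g = int(rng.integers(parents.shape[1]))
            child[g] *= 1 + 0.2 * rng.choice([-1, 1])
        children.append(np.clip(child, lo, hi))
    return np.array(children)

def run_ga(npop, ngenes, limits, fitness, n_elite, mutp, ngenerations, seed=0):
    rng = np.random.default_rng(seed)
    limits = np.asarray(limits, dtype=float)
    pop = rng.uniform(limits[:, 0], limits[:, 1], size=(npop, ngenes))
    ranked = rank(pop, fitness)
    winners = []
    for g in range(ngenerations):
        # Breed from the current ranking, then re-evaluate the NEW population.
        pop = next_generation(ranked, n_elite, mutp, limits, rng)
        ranked = rank(pop, fitness)
        winners.append(ranked[0])
        print("Winner after generation", g, ":", ranked[0])
    # Absolute winner: the best of the per-generation winners.
    return min(winners, key=lambda t: t[1])

best_ind, best_score = run_ga(100, 5, [(-2, 2)] * 5, sphere, 4, 0.4, 25)
```

    Because the elites are carried over unchanged in this sketch, the best score cannot get worse from one generation to the next, so the absolute winner coincides with the last generation's winner. Keeping the winners list still matters if elitism is set to 0.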