python pandas matplotlib genetic-algorithm scipy-optimize

Plotting the Convergence Results of scipy.optimize.differential_evolution


I have two dataframes (df_1, df_2), some variables (A,B,C), a function (fun) and a global, genetic optimiser that finds the maximum value of fun for a given range of A,B,C.

import pandas as pd
from scipy.optimize import differential_evolution

df_1 = pd.DataFrame({'O' : [1,2,3], 'M' : [2,8,3]})

df_2 = pd.DataFrame({'O' : [1,1,1, 2,2,2, 3,3,3],
                     'M' : [9,2,4, 6,7,8, 5,3,4],
                     'X' : [2,4,6, 4,8,7, 3,1,9],
                     'Y' : [3,6,1, 4,6,5, 1,0,7],
                     'Z' : [2,4,8, 3,5,4, 7,5,1]})

# Index
df_1 = df_1.set_index('O')
df_1_M = df_1.M
df_1_M = df_1_M.sort_index()

# Fun
def fun(z, *params):
    A,B,C = z
        
    # Score
    df_2['S'] = df_2['X']*A + df_2['Y']*B + df_2['Z']*C
    
    # Top score
    df_Sort = df_2.sort_values(['S', 'X', 'M'], ascending=[False, True, True])
    df_O    = df_Sort.set_index('O')
    M_Top   = df_O[~df_O.index.duplicated(keep='first')].M
    M_Top   = M_Top.sort_index()
        
    # Compare the top scoring row for each O to df_1
    df_1_R = df_1_M.reindex(M_Top.index) # Nan
    T_N_T  = M_Top == df_1_R

    # Record the results for the given values of A,B,C
    df_Res = pd.DataFrame({'it_is':T_N_T}) # is this row of df_1 the same as this row of M_Top?
        
    # p_hat =         TP / (TP + FP)
    p_hat = df_Res.sum() / len(df_Res.index)
    
    print(z)
        
    return -p_hat.iloc[0]

# Bounds
min_ = 0
max_ = 1
ran_ge = (min_, max_)
bounds = [ran_ge,ran_ge,ran_ge]

# Params
params = (df_1, df_2)

# DE
DE = differential_evolution(fun, bounds, args=params)

It prints out [A B C] on each iteration, for example the last three rows are:

[0.04003901 0.50504249 0.56332845]
[0.040039   0.5050425  0.56332845]
[0.040039   0.50504249 0.56332846]

To see how it is converging, how can I plot A,B,C against iteration please?

I tried to store A,B,C in:

df_P = pd.DataFrame({0})

while adding to fun:

df_P.append(z)

but I got:

RuntimeError: The map-like callable must be of the form f(func, iterable), returning a sequence of numbers the same length as 'iterable'

Solution

  • I am not sure this is the best way, but it works. It relies on the fact that a Python list is mutable and is passed to a function by reference: if the function appends to the list, the change is visible in the rest of the program, even though the function never returns the list.

    import matplotlib.pyplot as plt

    # Params
    results = []  # this list will hold our results
    params = (df_1, df_2, results)  # add it to the args passed to the function
    
    # Now, inside the function, append the value to track to the list (it arrives as params[2]).
    # Instead of the mean, I used the distance to the origin here (treating the 3 values as a 3D vector).
    
    p_hat = df_Res.sum() / len(df_Res.index)
    
    distance_to_zeros = sum(e**2 for e in z) ** 0.5  # ** 0.5 is the square root
    results.append(distance_to_zeros)
    # You can also append a copy of z directly, e.g. results.append(z.copy()).
    
    # Then after DE call
    DE = differential_evolution(fun, bounds, args=params)
    
    x = range(0, len(results))
    
    plt.scatter(x, results, alpha=0.5)
    plt.show()
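
One caveat about the approach above: `differential_evolution` evaluates `fun` many times per iteration (once for each member of the population), so a list filled inside `fun` is indexed by function evaluation, not by iteration. If you want one point per iteration, the `callback` argument may be a better fit, since it is called once after each iteration with the current best vector. The following is only a minimal sketch under some assumptions: `objective` is a stand-in bowl-shaped function rather than the `fun` from the question, and `record`, `history` and `plot_history` are names made up for illustration.

```python
import numpy as np
from scipy.optimize import differential_evolution


def objective(z):
    # Stand-in objective (an assumption, not the question's fun):
    # a smooth bowl with its minimum at (0.2, 0.5, 0.8).
    target = np.array([0.2, 0.5, 0.8])
    return float(np.sum((z - target) ** 2))


history = []  # best [A, B, C] after each iteration


def record(xk, convergence=None):
    # Called by differential_evolution once per iteration with the
    # current best parameter vector; store a copy of it.
    history.append(np.array(xk, copy=True))


bounds = [(0, 1)] * 3
result = differential_evolution(objective, bounds, callback=record)

history = np.array(history)  # shape: (number of iterations, 3)


def plot_history(history):
    # Imported here so the recording part runs without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for column, label in enumerate("ABC"):
        ax.plot(history[:, column], label=label)
    ax.set_xlabel("iteration")
    ax.set_ylabel("best parameter value")
    ax.legend()
    return fig
```

Calling `plot_history(history)` followed by `plt.show()` should draw one line each for A, B and C against the iteration number. To use it with the question's code, pass `fun`, `args=params` and `callback=record` to `differential_evolution` instead of the stand-in objective.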