Tags: python, optimization, mystic

How to properly generate constraints for array equations in mystic?


I have an optimization problem where I want to minimize the quantity -sum(p*log(p))*dx (entropy) subject to the constraints sum(p)*dx == 1 and sum(p*x)*dx == 0.2, where p and x are arrays and dx is a scalar. I tried to implement this with mystic as follows:

from pylab import *
from mystic.symbolic import generate_constraint, generate_solvers, simplify
from mystic.solvers import diffev2

x = linspace(0, 5, 100)
dx = x[1]-x[0]

def objective(p):
    return -sum(p*log(p))*dx

bounds = [(0,1)]*len(x)

equations = '''
sum(p)*dx == 1
sum(p*x)*dx == 0.2
'''

eqns = simplify(equations, variables=["p"], locals={"dx":dx, "x":x})

cf = generate_constraint(generate_solvers(eqns))

res = diffev2(objective, x0=[1/len(x)]*len(x), bounds=bounds, constraints=cf)

print(sum(res)*dx)
print(sum(res*x)*dx)

But it obviously does not work well, because it returns:

Optimization terminated successfully.
         Current function value: 0.145182
         Iterations: 264
         Function evaluations: 26500
0.030782323656083955
0.07399192217757057

Thus the constraints are violated. How should I proceed to solve my problem properly?


Solution

  • I'm the mystic author. The primary issue is that you've assumed mystic.symbolic can understand vector notation, and it can't.

    >>> from mystic.symbolic import generate_constraint, generate_solvers, simplify
    >>> from mystic.solvers import diffev2
    >>> from numpy import *
    >>> x = linspace(0, 5, 100)
    >>> dx = x[1]-x[0]
    >>> def objective(p):
    ...     return -sum(p*log(p))*dx
    ... 
    >>> bounds = [(0,1)]*len(x)
    >>> equations = '''
    ... sum(p)*dx == 1
    ... sum(p*x)*dx == 0.2
    ... '''
    >>> eqns = simplify(equations, variables=["p"], locals={"dx":dx, "x":x})
    >>> print(eqns)
    p == 0.0158400000000000
    p == 19.8000000000000
    >>>
    

    This is obviously not what you want. So, if you want to work with symbolic constraints, you have to write them out element-by-element.

    >>> equations = 'dx * (' + ' + '.join('p%s' % i for i,j in enumerate(x)) + ') == 1\n'
    >>> equations += 'dx * (' + ' + '.join('p%s * x%s' % (i,i) for i,j in enumerate(x)) + ') == .2\n'
    >>> locals = {'x%s' % i:j for i,j in enumerate(x)}
    >>> locals['dx'] = dx
    >>> eqns = simplify(equations, variables=['p%s' % i for i,j in enumerate(x)], locals=locals)
    >>> eqns.split('\n',1)[0]
    'p10 == -0.1*p1 - 1.1*p11 - 1.2*p12 - 1.3*p13 - 1.4*p14 - 1.5*p15 - 1.6*p16 - 1.7*p17 - 1.8*p18 - 1.9*p19 - 0.2*p2 - 2.0*p20 - 2.1*p21 - 2.2*p22 - 2.3*p23 - 2.4*p24 - 2.5*p25 - 2.6*p26 - 2.7*p27 - 2.8*p28 - 2.9*p29 - 0.3*p3 - 3.0*p30 - 3.1*p31 - 3.2*p32 - 3.3*p33 - 3.4*p34 - 3.5*p35 - 3.6*p36 - 3.7*p37 - 3.8*p38 - 3.9*p39 - 0.4*p4 - 4.0*p40 - 4.1*p41 - 4.2*p42 - 4.3*p43 - 4.4*p44 - 4.5*p45 - 4.6*p46 - 4.7*p47 - 4.8*p48 - 4.9*p49 - 0.5*p5 - 5.0*p50 - 5.1*p51 - 5.2*p52 - 5.3*p53 - 5.4*p54 - 5.5*p55 - 5.6*p56 - 5.7*p57 - 5.8*p58 - 5.9*p59 - 0.6*p6 - 6.0*p60 - 6.1*p61 - 6.2*p62 - 6.3*p63 - 6.4*p64 - 6.5*p65 - 6.6*p66 - 6.7*p67 - 6.8*p68 - 6.9*p69 - 0.7*p7 - 7.0*p70 - 7.1*p71 - 7.2*p72 - 7.3*p73 - 7.4*p74 - 7.5*p75 - 7.6*p76 - 7.7*p77 - 7.8*p78 - 7.9*p79 - 0.8*p8 - 8.0*p80 - 8.1*p81 - 8.2*p82 - 8.3*p83 - 8.4*p84 - 8.5*p85 - 8.6*p86 - 8.7*p87 - 8.8*p88 - 8.9*p89 - 0.9*p9 - 9.0*p90 - 9.1*p91 - 9.2*p92 - 9.3*p93 - 9.4*p94 - 9.5*p95 - 9.6*p96 - 9.7*p97 - 9.8*p98 - 9.9*p99 + 7.8408'
    

    This is starting to look a bit more like what you want.

    >>> from mystic.constraints import and_
    >>> from mystic.monitors import VerboseMonitor
    >>> cf = generate_constraint(generate_solvers(eqns, variables=['p%s' % i for i,j in enumerate(x)]), join=and_)
    >>> mon = VerboseMonitor(1)
    >>> res = diffev2(objective, x0=[1/len(x)]*len(x), bounds=bounds, constraints=cf, itermon=mon, npop=400)
    Generation 0 has ChiSquare: inf
    Generation 1 has ChiSquare: inf
    Generation 2 has ChiSquare: inf
    Generation 3 has ChiSquare: inf
    Generation 4 has ChiSquare: inf
    Generation 5 has ChiSquare: inf
    Generation 6 has ChiSquare: inf
    Generation 7 has ChiSquare: inf
    Generation 8 has ChiSquare: inf
    Generation 9 has ChiSquare: inf
    Generation 10 has ChiSquare: inf
    Generation 11 has ChiSquare: inf
    Generation 12 has ChiSquare: inf
    Generation 13 has ChiSquare: inf
    Generation 14 has ChiSquare: inf
    Generation 15 has ChiSquare: inf
    Generation 16 has ChiSquare: inf
    Generation 17 has ChiSquare: inf
    Generation 18 has ChiSquare: inf
    Generation 19 has ChiSquare: inf
    Generation 20 has ChiSquare: inf
    Generation 21 has ChiSquare: inf
    Generation 22 has ChiSquare: inf
    Generation 23 has ChiSquare: inf
    Generation 24 has ChiSquare: inf
    Generation 25 has ChiSquare: inf
    Generation 26 has ChiSquare: inf
    Generation 27 has ChiSquare: inf
    Generation 28 has ChiSquare: inf
    Generation 29 has ChiSquare: inf
    Generation 30 has ChiSquare: inf
    STOP("VTRChangeOverGeneration with {'ftol': 0.005, 'gtol': 1e-06, 'generations': 30, 'target': 0.0}")
    Optimization terminated successfully.
             Current function value: inf
             Iterations: 30
             Function evaluations: 12400
    

    You probably want a larger population than this, and since there's a lot of inf in the results, you probably want to go back to the constraints and add tight bounds. The latter can be accomplished by passing tight=True to the solver, but it's slower than adding the bounds explicitly when you first define the equations, like this:

    >>> from mystic.symbolic import symbolic_bounds
    >>> equations_ = symbolic_bounds(*zip(*bounds))
    

    You'll then need to combine and simplify equations and equations_ to serve as the basis for your constraints function.
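
    Roughly, that combination might look like the following sketch (not output from a real session; it assumes symbolic_bounds accepts a variables keyword so the bound inequalities use the same p0...p99 names, and that the two newline-separated strings can simply be concatenated):

    >>> # assumption: symbolic_bounds accepts variables='p' so the bounds are written for p0..p99
    >>> equations_ = symbolic_bounds(*zip(*bounds), variables='p')
    >>> # concatenate the equality constraints with the bound inequalities and simplify them together
    >>> all_eqns = simplify(equations + equations_, variables=['p%s' % i for i,j in enumerate(x)], locals=locals)
    >>> # build one constraints function from the combined system
    >>> cf = generate_constraint(generate_solvers(all_eqns, variables=['p%s' % i for i,j in enumerate(x)]), join=and_)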

    As an alternative, you can use a penalty... which (once you have the simplified eqns) builds like this:

    >>> from mystic.symbolic import generate_penalty, generate_conditions
    >>> from mystic.coupler import and_ as _and
    >>> pf = generate_penalty(generate_conditions(eqns, variables=['p%s' % i for i,j in enumerate(x)]), join=_and)
    >>> from mystic.solvers import fmin
    >>> mon = VerboseMonitor(1000)
    >>> res = fmin(objective, x0=[1/len(x)]*len(x), bounds=bounds, penalty=pf, itermon=mon, maxiter=10000)
    Generation 0 has ChiSquare: 36179.905048
    Generation 1000 has ChiSquare: 34612.689290
    Generation 2000 has ChiSquare: 32100.152737
    Generation 3000 has ChiSquare: 30305.589205
    Generation 4000 has ChiSquare: 29627.956352
    Generation 5000 has ChiSquare: 29230.517427
    Generation 6000 has ChiSquare: 28933.796220
    Generation 7000 has ChiSquare: 28804.810192
    Generation 8000 has ChiSquare: 28664.350695
    Generation 9000 has ChiSquare: 28572.124583
    Generation 10000 has ChiSquare: 28547.949725
    Warning: Maximum number of iterations has been exceeded
    STOP("EvaluationLimits with {'evaluations': 20000, 'generations': 10000}")
    

    Penalties are soft constraints, and it looks like the above still has some constraint violations, but it is progressing toward a good solution.
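
    To check how close the penalty-based result is to satisfying the constraints, you can reuse the same checks from the question:

    >>> print(sum(res)*dx)    # ideally 1
    >>> print(sum(res*x)*dx)  # ideally 0.2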

    Anyway, the above should give you enough options to get past where you were stuck, and to find a solution.