machine-learning, particle-swarm

Particle Swarm Optimisation: Converges to local optima too quickly in high-dimensional space


In a portfolio optimisation problem, I have a high-dimensional (n = 500) search space with lower and upper bounds of [0, 5,000,000]. With PSO I am finding that the solution converges quickly to a local optimum, and I have narrowed the problem down to a couple of areas:

  1. Velocity: Particle velocity rapidly decays to extremely small step sizes [0-10] relative to the bounds [0, 5,000,000]. One workaround I have found is to replace the velocity update with a fixed binary step size [e.g. 250,000] via a sigmoid function, but this is clearly just a patch. Any recommendations on how to keep the velocity from collapsing? (One heuristic is sketched just after this list.)
  2. Initial feasible solutions: When initialising 1,000 particles, I might find that only 5% are feasible under my constraints. I thought I could improve coverage of the search space by re-running the initialisation until every particle starts in a feasible region, but this actually results in worse performance: the particles just stay stuck close to their initialisation vectors.
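
A minimal sketch of one way to keep velocities alive, assuming a standard global-best PSO with positions, velocities, and best arrays of shape (n_particles, n_dims); the parameter values, the stall floor `v_floor`, and the size of the random kick are illustrative assumptions rather than established defaults:

```python
import numpy as np

rng = np.random.default_rng(0)

n_particles, n_dims = 1_000, 500
lower, upper = 0.0, 5_000_000.0
span = upper - lower

# Assumed parameter values: inertia kept fairly high, acceleration around 2.
w, c1, c2 = 0.7, 2.0, 2.0
# Assumed stall floor: components moving slower than 0.01% of the range per step.
v_floor = 1e-4 * span

def update_velocity(v, x, pbest, gbest):
    """One global-best PSO velocity update, with a random kick for stalled components."""
    r1 = rng.random((n_particles, n_dims))
    r2 = rng.random((n_particles, n_dims))
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    # Re-energise near-zero components so step sizes cannot collapse to [0-10]
    # while the bounds span 5,000,000.
    stalled = np.abs(v) < v_floor
    kick = rng.uniform(-0.05 * span, 0.05 * span, size=v.shape)
    return np.where(stalled, kick, v)
```

The kick is a blunt instrument, but it illustrates the general idea: instead of forcing a fixed step size through a sigmoid, detect stalled components and re-inject exploration energy directly.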

With respect to my parameters, w = c1 = c2 = 0.5. Is this likely to be the source of both problems?

I am open to any advice on this, as in theory PSO should be a good approach to portfolio optimisation, but in practice I am not seeing it.


Solution

  • Consider changing the parameters. Using w = 0.5 'stabilizes' the particles and thus prevents escape from local optima, because the swarm converges before it has explored. Furthermore, I would suggest setting c1 and c2 to values larger than 1 (2 is the commonly suggested value), and perhaps making the coefficient on the global-best term (usually c2) slightly smaller than the other, to prevent overcrowding on one solution.

    Anyway, have you tried running the PSO with a larger number of particles? People usually use 100-200 particles to solve 2-10 dimensional problems, so I don't think 1,000 particles in a 500-dimensional space will cut it. I would also suggest using a more advanced initialization method instead of a normal or uniform distribution (e.g. a chaotic map, a Sobol sequence, or Latin Hypercube sampling); a sketch of the latter follows.
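
    A minimal sketch of such low-discrepancy initialization using SciPy's `scipy.stats.qmc` module, scaled to the question's bounds; `is_feasible` and its budget constraint are hypothetical placeholders for the asker's real portfolio constraints:

    ```python
    import numpy as np
    from scipy.stats import qmc

    n_particles, n_dims = 1_000, 500
    lower, upper = 0.0, 5_000_000.0

    # Latin Hypercube points cover [0, 1]^d more evenly than i.i.d. uniform draws;
    # qmc.Sobol(d=n_dims, scramble=True) is a drop-in alternative.
    sampler = qmc.LatinHypercube(d=n_dims, seed=0)
    unit = sampler.random(n=n_particles)                 # points in the unit hypercube
    positions = qmc.scale(unit, [lower] * n_dims, [upper] * n_dims)

    def is_feasible(x):
        # Hypothetical placeholder: substitute the real portfolio constraints here.
        return x.sum() <= 5_000_000

    share = np.mean([is_feasible(p) for p in positions])
    print(f"{share:.1%} of particles start feasible")
    ```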