Search code examples
pythongenetic-algorithmgenetic-programming

genetic programming for trading: How to represent chromosomes


I am working on a genetic algorithm in python that can be used for trading. The principe is simple if you are familiar with evolutionary algorithms: The genes represent trading strategies: To be more specific, each gene is a tree of this form: gene description

this can be interpreted as a boolean value like this:

If:

  • the average of the 50 last stock values is less the actual price

  • and the min of the 6 last stock values is less the actual price. then answer *True**, else False

If the answer is True, then send a BUY signal, and SELL signal otherwise.

This is an example of how I represent a tree like this in python:

class BinaryRule:
def __init__(self, child1, child2):
    self.child1 = child1
    self.child2 = child2


class LessThan(BinaryRule):
    name = '<'

    def eval(self):
        return self.child1.eval() < self.child2.eval()

# Here there is the definition of the other classes

# and then I create the tree
   tree = rules.LessThan(     
                    rules.Max( rules.Float(1) ),                 
                      rules.SMA( rules.Float(15) ),
                      )
   print tree.eval() # True or false

The problem is that I can't think of good technique for the crossover and the mutation operators. Any ideas?


Solution

  • Your operators should be safely interchangeable - should all accept the same input-types and return the same output-type (most likely float).

    Not necessary but a good idea (by way of neural networks): use differentiable operator-functions - ie instead of returning YES or NO, return "degree-of-yes-ness". This provides more graduated feedback on improvements to your expression. It can also be useful to give each operator a setpoint control value to tweak its behavior.

    A mutation operator can change an operator's type or setpoint; a crossover operation usually swaps sub-trees.

    A fairly standard binary tree representation is in the form of a list like [head, left, right, left-left, left-right, right-left, right-right, left-left-left, ...]. The nice thing is that, given the index of any node, you can immediately calculate the parent, left-child, and right-child indices; a downside is that it can waste a lot of space if you have a sparse tree structure. Also, properly copying subtrees should be done via a recursive function, not a simple slice-and-splice as @Slater shows above.