
Layers for predicting financial data using Tensorflow/tflearn


I'd like to predict the interest rate from a set of relevant factors, such as stock indices and money supply figures. There may be up to 200 factors.

For example, the training data looks like this, where X contains the factors and y is the interest rate I want to predict:

     factor1      factor2     factor3          factor176  factor177    factor178
X= [[ 2.1428      6.1557      5.4101     ...,  5.86        6.0735      6.191 ]
    [ 2.168       6.1533      5.2315     ...,  5.8185      6.0591      6.189 ]
    [ 2.125       4.7965      3.9443     ...,  5.7845      5.9873      6.1283]...]

y= [[ 3.5593]
    [ 3.014 ]
    [ 2.7125]...]

I want to use TensorFlow/tflearn to train this model, but I don't know which method to choose for regression. I have tried LinearRegression from tflearn before, but the results were not great.

For now, I am just using code I found online:

import tflearn

net = tflearn.input_data([None, 178])
net = tflearn.fully_connected(net, 64, activation='linear',
                              weight_decay=0.0005)
net = tflearn.fully_connected(net, 1, activation='linear')
net = tflearn.regression(
    net,
    optimizer=tflearn.optimizers.AdaGrad(learning_rate=0.01,
                                         initial_accumulator_value=0.01),
    loss='mean_square', learning_rate=0.05)
model = tflearn.DNN(net, tensorboard_verbose=0, checkpoint_path='tmp/')
model.fit(X, y, show_metric=True,
          batch_size=1, n_epoch=100)

Roughly 50% of the predictions fall within ±10% of the true value. I have also tried using a 7-day window, but the results are still poor. What additional layers could I use to improve this network?


Solution

  • First of all, this network makes no sense as written. If the hidden units have no nonlinear activation, the network is equivalent to linear regression, because a composition of linear layers is itself a single linear map.

    So the first step is to change

    net = tflearn.fully_connected(net, 64, activation='linear',
                                    weight_decay=0.0005)
    

    to

    net = tflearn.fully_connected(net, 64, activation='relu',
                                    weight_decay=0.0005)
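
The equivalence to linear regression can be checked numerically. The following is a minimal sketch in plain NumPy rather than tflearn; the random weights and the 178/64/1 layer sizes are assumptions chosen only to mirror the question's network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes mirroring the question: 178 inputs, 64 hidden units, 1 output.
n_features, n_hidden = 178, 64
X = rng.normal(size=(5, n_features))

W1 = rng.normal(size=(n_features, n_hidden))
b1 = rng.normal(size=n_hidden)
W2 = rng.normal(size=(n_hidden, 1))
b2 = rng.normal(size=1)

# Two stacked layers with a 'linear' (identity) activation in between.
two_layer_out = (X @ W1 + b1) @ W2 + b2

# The same function written as one linear layer.
W_single = W1 @ W2        # shape (178, 1)
b_single = b1 @ W2 + b2   # shape (1,)
one_layer_out = X @ W_single + b_single

linear_layers_collapse = np.allclose(two_layer_out, one_layer_out)

# With a ReLU between the layers, the network is no longer that linear map.
relu_out = np.maximum(X @ W1 + b1, 0.0) @ W2 + b2
relu_differs = not np.allclose(relu_out, one_layer_out)

print("linear layers collapse to one layer:", linear_layers_collapse)
print("ReLU network differs from it:", relu_differs)
```

However many 'linear' layers are stacked, the model can only represent what a single linear layer can, so extra depth adds nothing until a nonlinearity is introduced.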
    

    Another general rule is to always normalise your data. Your X values are large, and so are your y values. Make sure they aren't, for example by whitening them (rescaling them to zero mean and unit standard deviation).
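
One way to do the rescaling to zero mean and unit standard deviation is sketched below in NumPy. The helper names (`fit_standardizer`, `standardize`, `unstandardize`) are hypothetical, and the arrays are just the first three columns of the question's sample rows, used for illustration.

```python
import numpy as np

def fit_standardizer(a):
    """Return per-column mean and std, computed from the training data only."""
    mean = a.mean(axis=0)
    std = a.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # leave constant columns unscaled
    return mean, std

def standardize(a, mean, std):
    return (a - mean) / std

def unstandardize(a, mean, std):
    return a * std + mean

# Illustrative values taken from the sample rows in the question.
X_train = np.array([[2.1428, 6.1557, 5.4101],
                    [2.168,  6.1533, 5.2315],
                    [2.125,  4.7965, 3.9443]])
y_train = np.array([[3.5593], [3.014], [2.7125]])

x_mean, x_std = fit_standardizer(X_train)
y_mean, y_std = fit_standardizer(y_train)

X_scaled = standardize(X_train, x_mean, x_std)
y_scaled = standardize(y_train, y_mean, y_std)

# Predictions made in scaled units are mapped back with the training statistics.
y_restored = unstandardize(y_scaled, y_mean, y_std)
```

The same `x_mean` and `x_std` should be reused for any validation or test data, so that the network sees inputs on the scale it was trained on.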

    Finding the right architecture is a hard problem, and there are no "magical recipes" for it. Start by understanding what you are doing. Log your training and check whether the training loss converges to small values. If it does not, then either you are not training long enough, the network is too small, or the training hyperparameters are off (for example, too high a learning rate or too strong a regularisation).
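
To illustrate what logging the training loss looks like, here is a minimal sketch of a one-hidden-layer ReLU network trained by full-batch gradient descent. It is written in plain NumPy rather than tflearn, and the synthetic data, layer sizes, learning rate, and epoch count are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, standardized data (an assumption, not the question's dataset).
n_samples, n_features, n_hidden = 256, 8, 16
X = rng.normal(size=(n_samples, n_features))
true_w = rng.normal(size=(n_features, 1))
y = np.abs(X @ true_w)            # a nonlinear target
y = (y - y.mean()) / y.std()

# Small random initial weights, zero biases.
W1 = rng.normal(size=(n_features, n_hidden)) * np.sqrt(2.0 / n_features)
b1 = np.zeros(n_hidden)
W2 = rng.normal(size=(n_hidden, 1)) * np.sqrt(1.0 / n_hidden)
b2 = np.zeros(1)

learning_rate = 0.05
loss_history = []

for epoch in range(200):
    # Forward pass: linear -> ReLU -> linear.
    z = X @ W1 + b1
    h = np.maximum(z, 0.0)
    pred = h @ W2 + b2
    err = pred - y
    loss = float(np.mean(err ** 2))
    loss_history.append(loss)
    if epoch % 20 == 0:
        print(f"epoch {epoch:3d}  training loss {loss:.4f}")

    # Backward pass for the mean squared error.
    grad_pred = 2.0 * err / n_samples
    grad_W2 = h.T @ grad_pred
    grad_b2 = grad_pred.sum(axis=0)
    grad_h = grad_pred @ W2.T
    grad_z = grad_h * (z > 0)
    grad_W1 = X.T @ grad_z
    grad_b1 = grad_z.sum(axis=0)

    # Gradient descent update.
    W1 -= learning_rate * grad_W1
    b1 -= learning_rate * grad_b1
    W2 -= learning_rate * grad_W2
    b2 -= learning_rate * grad_b2
```

If the logged loss plateaus at a high value, that points to the causes listed above: try more epochs, a larger hidden layer, or a different learning rate, changing one thing at a time so the effect is visible in the log.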