machine-learning, neural-network, pybrain

Pybrain recurrent network for regression - how to properly kickstart a trained network for predictions


I am trying to solve a regression task using a recurrent neural network (I use pybrain to build it). After my network is fit I want to use it to make predictions. But a prediction of a recurrent network is affected by its previous prediction (which in turn is affected by the prediction before it, and so on).

The question is: once the network is trained and I want to make predictions with it on a dataset, how do I properly kickstart the prediction process? If I just call .activate() on the first example from the prediction dataset, the recurrent connection will pass 0 to the network, and that will affect the subsequent predictions in an undesirable way. Is there a way to force a fully trained recurrent network to think that the previous activation result was some special value? If yes, which value is best here (maybe the mean of the possible activation output values, or something like it)?
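To make the problem concrete, here is a minimal sketch of what I am doing (layer sizes and input vectors are placeholders, not my real data):

    from pybrain.tools.shortcuts import buildNetwork

    # a recurrent network: 10 inputs, 20 hidden units, 1 output (sizes are just examples)
    net = buildNetwork(10, 20, 1, recurrent=True)

    # ... training happens here ...

    # resetting clears the internal buffers, so the very first activation
    # receives 0 over the recurrent connection -- this is the "kickstart" problem
    net.reset()

    first_example = [0.1] * 10    # placeholder input vector
    second_example = [0.2] * 10   # placeholder input vector

    p1 = net.activate(first_example)   # recurrent input is all zeros here
    p2 = net.activate(second_example)  # now influenced by the state left behind by p1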

UPDATE. OK, since no one has had any ideas within a day on how to do this with a recurrent network in pybrain, let me change the formulation a bit and forget about pybrain. Consider that I build a network for regression (for example, predicting the price of a stock). The network will be used with a dataset which has 10 features. I add one additional feature to the dataset and fill it with the previous price from the dataset. Thus I replicate a recurrent network (the additional input neuron replicates the recurrent connection). The questions are:

1) In the training dataset I fill this additional feature with the previous price. But what do I do with the FIRST record in the training dataset (I don't know the previous price)? Should I leave it at 0? That seems like a bad idea; the previous price WAS NOT zero. Should I use the mean of the prices in the training dataset? Any other suggestions?

2) Same question as #1, but for running the fully trained network against the test dataset. While running my network against the test dataset I should always pick up its prediction and put the result into this new 11th input neuron before making the next prediction. But again, what do I do for the first prediction in the dataset (since I don't know the previous price)?
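To illustrate what I mean by the 11th feature, here is a rough sketch of how I build the lagged-price column; the seeding of the first record (here with the mean) is exactly the part I am unsure about:

    import numpy as np

    # placeholder data: `features` is (n, 10), `prices` is the series I want to predict
    n = 5
    features = np.random.rand(n, 10)
    prices = np.array([10.0, 10.5, 10.2, 10.8, 11.0])

    # 11th column: the previous price, i.e. the price series shifted by one step
    prev_price = np.empty(n)
    prev_price[1:] = prices[:-1]
    prev_price[0] = prices.mean()   # question 1: what goes here? 0? the mean? something else?

    training_inputs = np.hstack([features, prev_price.reshape(-1, 1)])  # shape (n, 11)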


Solution

  • This isn't my understanding of recurrent networks at all.

    When you initially create a recurrent network, the recurrent connections (say, middle layer to middle layer) will be randomized, as with any other connection. That is their starting value. Each time you activate a recurrent network you alter the values carried over those connections (the network's internal state), and so the next output will be altered.

    Carrying this logic forward, if you wrote some code to train a recurrent network and saved it to a file, you'd have in that file a recurrent network ready to go with your real data, although the first invocation will still carry the recurrent feedback from your last activation during training.

    The thing you want to do is make sure that you re-save your recurrent network any time you wish to persist its state. For a simple FFN this wouldn't be an issue, because you only change the state during training, but for a recurrent network you'll want to persist the state after any activation, because the values carried on the recurrent connections will have been updated (see the sketch at the end of this answer).

    I don't think it's the case that a recurrent network will be poisoned by the initial value of the recurrent connections; certainly I wouldn't trust the first invocation, but given that they're designed for sequences, that shouldn't be an issue in either case.

    Regarding your updated question, I'm not at all convinced that arbitrarily adding a single input node will simulate this. In point of fact I suspect you'd absolutely break the network's predictive capabilities. In your example, starting with 10 input nodes, and let's pretend you have 20 middle nodes, just by adding an extra input node you generate an additional 20 connections to the network, which will be initially randomized. Every additional input node will compound this change, and after 10 additional input nodes you'd have as many newly randomized connections as trained ones.

    I don't see this working, and I certainly don't believe it would simulate recurrent learning in the way you think.
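
    To make the first part concrete, here is roughly the lifecycle I'm describing, sketched in pybrain (module sizes are arbitrary, and pickling is just one way to persist the whole object; I'm assuming pickle keeps the activation buffers along with the weights):

        import pickle
        from pybrain.structure import RecurrentNetwork, LinearLayer, SigmoidLayer, FullConnection

        # build a recurrent network by hand: 10 -> 20 -> 1, with a hidden->hidden loop
        net = RecurrentNetwork()
        net.addInputModule(LinearLayer(10, name='in'))
        net.addModule(SigmoidLayer(20, name='hidden'))
        net.addOutputModule(LinearLayer(1, name='out'))
        net.addConnection(FullConnection(net['in'], net['hidden']))
        net.addConnection(FullConnection(net['hidden'], net['out']))
        net.addRecurrentConnection(FullConnection(net['hidden'], net['hidden']))
        net.sortModules()   # all weights, including the recurrent ones, start out randomized

        # ... train the network here ...

        # persist the whole object after training
        with open('trained_rnn.pkl', 'wb') as f:
            pickle.dump(net, f)

        # later: load it back, keep activating, and re-save whenever you want to
        # preserve the state left behind by the latest activation
        with open('trained_rnn.pkl', 'rb') as f:
            net = pickle.load(f)
        prediction = net.activate([0.1] * 10)   # placeholder input
        with open('trained_rnn.pkl', 'wb') as f:
            pickle.dump(net, f)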