Tags: tensorflow, keras, scikit-learn, transform, standardized

How to use a saved neural network on new data after standardization


I have built and trained a neural network with TensorFlow and Keras that works pretty well for my data. Before using the data, I standardized it with StandardScaler from sklearn: I called fit_transform() on the training data and only transform() on the test and validation data.

In the end, I saved my model.

Now I want to use the model for new data. I guess I need to transform() this data as well, but how do I do it?

All of my data was transformed with the parameters learned by fit_transform() on the training dataset. If I called fit_transform() on my new data, I would get worse results than if I transformed it the same way as the validation and test data.

Is there a way to store the information from the fit_transform() call so that I can use it later when I load my saved model? That way, a new dataset would be transformed like the test and validation data.


Solution

  • When making predictions you shouldn't use fit_transform(), as this refits the scaler on the new data and overwrites the parameters learned from the training set. Note that fit_transform() is the combination of two methods, fit() and transform(), which you can run individually to the same effect.

    So, to use the fitted scaler without altering it, just use transform().

    To save it you can always use pickle:

    import os
    import pickle

    # Write the fitted scaler to disk so it can be reloaded at prediction time
    with open(os.path.join(<your_path>, 'scaler.pkl'), 'wb') as output:
        pickle.dump(<your_scaler>, output)
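
To make the full round trip concrete, here is a minimal sketch of saving the scaler at training time and reloading it at prediction time. The toy arrays, the temporary directory, and the names X_train, X_new, and scaler_path are assumptions made for illustration and do not come from the question; the commented-out `model` stands in for the Keras model you would load separately.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy stand-ins (assumed) for the real training data and for data that arrives later.
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_new = np.array([[4.0, 40.0]])

# Training time: fit on the training data only, then save the fitted scaler.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)

# Hypothetical location; in practice, store it next to the saved model.
scaler_path = os.path.join(tempfile.mkdtemp(), "scaler.pkl")
with open(scaler_path, "wb") as output:
    pickle.dump(scaler, output)

# Prediction time: load the scaler and call transform() only, never fit_transform().
with open(scaler_path, "rb") as saved:
    loaded_scaler = pickle.load(saved)

# The new data is scaled with the mean and standard deviation of the training set.
X_new_scaled = loaded_scaler.transform(X_new)

# predictions = model.predict(X_new_scaled)  # `model`: the separately loaded Keras model
```

The loaded scaler carries the training-set statistics (for example in its mean_ and scale_ attributes), so the new data goes through the same transformation as the test and validation data. A pickled scaler is generally only safe to load with the same scikit-learn version that created it, and only from files you trust.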