neural-network keras training-data

Training and test score experiment in Neural Network


I am trying to set up a non-linear regression problem in Keras. I have two sets of data, say X1 and X2, whose Y values have a similar mean and standard deviation.

The following procedure was undertaken:

  • Combine the datasets X1 and X2, shuffle them, and train on 30% of the data. Keras reported a training score of 3.20 RMSE and a test score of 3.22 RMSE.
  • Use the weights from above and test against 100% of the X1 data. Keras reported a test score of 23.97 RMSE.
  • Use the same weights and test against 100% of the X2 data. Keras reported a test score of 6.49 RMSE.
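The combine, shuffle and split step can be sketched with NumPy alone. This is only an illustration: `combine_shuffle_split`, the `source` label array and the synthetic data are hypothetical names and values, not taken from the question. Keeping a label for where each row came from makes it possible to break any later score down by X1 and X2.

```python
import numpy as np

def combine_shuffle_split(X1, y1, X2, y2, train_fraction=0.3, seed=0):
    """Stack two datasets, shuffle the rows, and split into train and test parts.

    Returns (X, y, source) triples for train and test; `source` holds 1 for
    rows that came from X1 and 2 for rows that came from X2.
    """
    X = np.vstack([X1, X2])
    y = np.concatenate([y1, y2])
    source = np.concatenate([np.full(len(X1), 1), np.full(len(X2), 2)])

    # One random permutation, applied to features, targets and labels alike.
    order = np.random.default_rng(seed).permutation(len(X))
    n_train = int(round(train_fraction * len(X)))
    train_idx, test_idx = order[:n_train], order[n_train:]

    train = (X[train_idx], y[train_idx], source[train_idx])
    test = (X[test_idx], y[test_idx], source[test_idx])
    return train, test

# Synthetic stand-ins for the two datasets (assumed shapes and values).
rng = np.random.default_rng(1)
X1, y1 = rng.normal(size=(40, 2)), rng.normal(size=40)
X2, y2 = rng.normal(size=(60, 2)), rng.normal(size=60)

(train_X, train_y, train_src), (test_X, test_y, test_src) = combine_shuffle_split(
    X1, y1, X2, y2, train_fraction=0.3
)
print(len(train_X), len(test_X))
```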

It is not clear to me why there is such a big difference in the test score between X1 and X2. Is there any way I can improve the result?

For giggles, I repeated the same procedure as above but trained on the whole of the X1 and X2 datasets instead of taking 30%.

  • Combine X1 and X2, and train on the whole dataset. Keras returned Training score 1.81 RMSE
  • Use the weights from above and test against 100% of X1 data. Keras reported a score of 22.80 RMSE
  • Testing on X2 gave a score of 7.50 RMSE
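One sanity check on numbers like these: when the same predictions are scored, the mean squared error over a combined set is the size-weighted average of the per-subset mean squared errors, so the combined RMSE must lie between the two subset RMSEs. A combined score far below both subset scores suggests the subsets were not evaluated under the same preprocessing as the combined data. The sketch below demonstrates the identity on synthetic data; `rmse`, `pooled_rmse` and all values are illustrative assumptions, not the asker's code.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error between two 1-D arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def pooled_rmse(rmse_1, n_1, rmse_2, n_2):
    """RMSE over the union of two subsets, from per-subset RMSEs and sizes."""
    return float(np.sqrt((n_1 * rmse_1 ** 2 + n_2 * rmse_2 ** 2) / (n_1 + n_2)))

# Synthetic targets and "predictions" standing in for a trained model's output.
rng = np.random.default_rng(0)
y1, y2 = rng.normal(50, 10, 400), rng.normal(50, 10, 600)
p1 = y1 + rng.normal(0, 2, 400)   # small errors on subset 1
p2 = y2 + rng.normal(0, 5, 600)   # larger errors on subset 2

r1, r2 = rmse(y1, p1), rmse(y2, p2)
r_all = rmse(np.concatenate([y1, y2]), np.concatenate([p1, p2]))

# The combined score is fully determined by the subset scores and sizes.
assert np.isclose(r_all, pooled_rmse(r1, len(y1), r2, len(y2)))
assert min(r1, r2) <= r_all <= max(r1, r2)
```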

Again, X1 performs poorly compared with X2.


Solution

  • The problem was that the data had not been scaled appropriately. Once the data was rescaled properly, the model started to work.
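The answer does not say which rescaling was used. One common choice is standardization (zero mean, unit variance per column), with the statistics fitted once on the training data and then reused unchanged for every subset that is evaluated. A minimal NumPy sketch under that assumption follows; the `Standardizer` class and the synthetic arrays are hypothetical, and scikit-learn's `StandardScaler` provides the same fit/transform pattern.

```python
import numpy as np

class Standardizer:
    """Column-wise standardization: (value - mean) / std, fitted once."""

    def fit(self, a):
        a = np.asarray(a, dtype=float)
        self.mean_ = a.mean(axis=0)
        std = a.std(axis=0)
        # Guard against constant columns to avoid division by zero.
        self.std_ = np.where(std == 0, 1.0, std)
        return self

    def transform(self, a):
        return (np.asarray(a, dtype=float) - self.mean_) / self.std_

    def inverse_transform(self, a):
        return np.asarray(a, dtype=float) * self.std_ + self.mean_

# Synthetic stand-ins: two feature sets on very different scales (assumed).
rng = np.random.default_rng(0)
X1 = rng.normal(1000.0, 200.0, size=(300, 3))
X2 = rng.normal(5.0, 1.0, size=(300, 3))
y_all = rng.normal(50.0, 10.0, size=600)

# Fit the statistics once on the combined training data ...
X_all = np.vstack([X1, X2])
x_scaler = Standardizer().fit(X_all)
y_scaler = Standardizer().fit(y_all)

# ... and reuse exactly the same statistics for every subset.
X_all_s = x_scaler.transform(X_all)
X1_s, X2_s = x_scaler.transform(X1), x_scaler.transform(X2)
y_all_s = y_scaler.transform(y_all)

# Predictions made in scaled units map back to the original units like this.
y_back = y_scaler.inverse_transform(y_all_s)
```

If the targets are scaled as well, converting predictions back with `inverse_transform` before computing RMSE keeps the scores comparable with those quoted in the question.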