I am new to machine learning and trying my hand at Bitcoin price prediction using several models: Random Forest, simple Linear Regression, and a neural network (LSTM).
As far as I have read, Random Forest and Linear Regression don't require input feature scaling, whereas an LSTM does need the inputs to be scaled.
If I compare the MAE and RMSE across the models (some trained with scaling, some without), the errors end up on different scales, so I can't tell which model actually performs better.
How should I compare the performance of these models now?
Update - Adding my code
Data
import pandas as pd

bitcoinData = pd.DataFrame([['2013-04-01 00:07:00', 93.25, 93.30, 93.30, 93.25, 93.300000],
                            ['2013-04-01 00:08:00', 100.00, 100.00, 100.00, 100.00, 93.300000],
                            ['2013-04-01 00:09:00', 93.30, 93.30, 93.30, 93.30, 33.676862]],
                           columns=['time', 'open', 'close', 'high', 'low', 'volume'])
bitcoinData.time = pd.to_datetime(bitcoinData.time)
bitcoinData = bitcoinData.set_index(['time'])
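train_data and test_data come from splitting bitcoinData; a minimal sketch of such a split, assuming a chronological 80/20 split (my actual split code isn't shown here):

# assumption: chronological split, no shuffling, since this is a time series
split = int(len(bitcoinData) * 0.8)
train_data = bitcoinData.iloc[:split]
test_data = bitcoinData.iloc[split:]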
x_train = train_data[['high','low','open','volume']]
y_train = train_data[['close']]
x_test = test_data[['high','low','open','volume']]
y_test = test_data[['close']]
Min-Max Scaler
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler(feature_range=(0, 1))    # scaler for the input features
scaler1 = MinMaxScaler(feature_range=(0, 1))   # separate scaler for the target (close price)

x_train = scaler.fit_transform(x_train)
y_train = scaler1.fit_transform(y_train)
x_test = scaler.transform(x_test)
y_test = scaler1.transform(y_test)
Error Metric Calculation
from math import sqrt
from sklearn.metrics import r2_score
from sklearn.metrics import mean_absolute_error
from sklearn.metrics import mean_squared_error

# preds holds the model's predictions on x_test
print("Root Mean Squared Error (RMSE) : ", sqrt(mean_squared_error(y_test, preds)))
print("Mean Absolute Error (MAE) : ", mean_absolute_error(y_test, preds))
r2 = r2_score(y_test, preds)
print("R Squared (R2) : ",r2)
You scale your input data, not the output. How the inputs are scaled is irrelevant to your error calculation, because the errors are computed on the target.
If you really want to scale your LSTM's output data, just scale it the same way for the other models.
EDIT:
From your comment:
I only scaled my input data in LSTM
No, you don't. You also transform your output data (y_train and y_test), and from your code I assume you only do that for the neural network.
Since the y data for the LSTM is roughly 100 times smaller, the squared error shrinks by a factor of about 100 * 100 = 10,000, which is roughly the factor by which your neural net appears to perform "better" than the random forest.
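A quick numeric illustration of that effect, using the three close prices from your sample data and made-up predictions:

from sklearn.metrics import mean_squared_error

y_true = [93.25, 100.00, 93.30]   # prices on the original scale
y_pred = [94.00, 99.00, 94.00]    # made-up predictions

mse_original = mean_squared_error(y_true, y_pred)
# divide everything by 100, roughly what the min-max scaling does here
mse_scaled = mean_squared_error([v / 100 for v in y_true],
                                [v / 100 for v in y_pred])

print(mse_original / mse_scaled)   # 10000.0 -- the MSE shrinks by the square of the scale factor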
Option 1:
Remove those three lines:
scaler1 = MinMaxScaler(feature_range=(0, 1))
y_train = scaler1.fit_transform(y_train)
y_test = scaler1.transform(y_test)
Don't forget to use a last layer that can output values up to +infinity (e.g. a linear activation), since the unscaled close prices are far outside [0, 1].
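For example (a sketch, assuming a Keras LSTM since your model code isn't shown; the layer size and window length are placeholders):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

timesteps = 60     # placeholder: length of the input window
n_features = 4     # high, low, open, volume

model = Sequential([
    LSTM(32, input_shape=(timesteps, n_features)),
    Dense(1, activation='linear')   # linear output is unbounded, so it can predict unscaled prices
])
model.compile(loss='mse', optimizer='adam')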
Option 2:
Scale the data for your other classifiers as well and compare the scaled values.
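A minimal sketch of that, reusing the scaled arrays from your code (the exact estimators and hyperparameters below are assumptions):

from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

# train the other models on exactly the same scaled data the LSTM sees
rf = RandomForestRegressor(random_state=0).fit(x_train, y_train.ravel())
lr = LinearRegression().fit(x_train, y_train)

rf_preds = rf.predict(x_test)
lr_preds = lr.predict(x_test)
# all predictions and y_test are now on the same 0-1 scale,
# so RMSE / MAE are directly comparable across the three models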
Option 3:
Use the inverse_transform(pred) method of your MinMaxScaler() on your predictions, and calculate your errors between the inverse-transformed predictions and the untransformed y_test data.
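A minimal sketch of that, reusing scaler1 from your code (and assuming preds is the LSTM output as an array of shape (n, 1)):

from math import sqrt
from sklearn.metrics import mean_squared_error, mean_absolute_error

# bring the predictions back to the original price scale
preds_unscaled = scaler1.inverse_transform(preds.reshape(-1, 1))
y_test_unscaled = scaler1.inverse_transform(y_test)   # or keep an unscaled copy of y_test around

print("RMSE :", sqrt(mean_squared_error(y_test_unscaled, preds_unscaled)))
print("MAE  :", mean_absolute_error(y_test_unscaled, preds_unscaled))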