I'm working on a typical machine learning regression problem. There are 800 data points and 6 features. The best model, an Extra Trees Regressor, returns a Root Mean Square Error of 30. I take a log transformation of the target to make the extreme values less influential; the log also changes the distribution from right-skewed to roughly normal. After the transform the error is only 0.54, so why such a drastic change?
Even np.log(30) ≈ 3.4 (natural log), which is nowhere near 0.54. I understand my statistics knowledge is not the best, but this seems quite strange to me. I haven't done any tuning of the parameters.
With that being said, which error should I believe, and how do I interpret each one?
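To make the comparison concrete, here is a small toy example with made-up numbers (not my data), showing that the RMSE of log-transformed values is not simply the log of the original RMSE:
import numpy as np

# Made-up targets and predictions, purely for illustration
y_true = np.array([10.0, 50.0, 200.0, 900.0])
y_pred = np.array([12.0, 40.0, 260.0, 700.0])

rmse_original = np.sqrt(np.mean((y_true - y_pred) ** 2))              # ~105, dominated by the largest target
rmse_log = np.sqrt(np.mean((np.log(y_true) - np.log(y_pred)) ** 2))   # ~0.23, error in log units
print(rmse_original, np.log(rmse_original), rmse_log)                 # log(~105) is ~4.6, still nothing like 0.23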
Take a log of the target values:
pricing['runtime.min'] = np.log(pricing['runtime.min'])
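(A slightly more defensive version of the same step, in case it helps reproduce this: np.log is undefined for values <= 0, and keeping a copy of the original-scale column makes it easy to report errors in minutes later. The column name runtime.min.orig is just an illustrative choice, not something from my pipeline.)
import numpy as np

assert (pricing['runtime.min'] > 0).all()             # np.log is undefined for values <= 0
pricing['runtime.min.orig'] = pricing['runtime.min']  # keep the original scale for later comparison
pricing['runtime.min'] = np.log(pricing['runtime.min'])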
Function to evaluate a model:
import numpy as np
from sklearn import metrics

def evaluate(model, test_features, test_labels):
    predictions = model.predict(test_features)
    # Mean Absolute Error
    errors = metrics.mean_absolute_error(test_labels, predictions)
    # Mean Squared Error
    MSerrors = metrics.mean_squared_error(test_labels, predictions)
    # Root Mean Squared Error
    RMSE = np.sqrt(metrics.mean_squared_error(test_labels, predictions))
    print('Model Performance')
    print('Average MAE Error: {:0.4f} degrees. '.format(errors))
    print('Average MSE Error: {:0.4f} degrees. '.format(MSerrors))
    print('Average RMS Error: {:0.4f} degrees. '.format(RMSE))
    return 'end of test'
Extra Trees regressor:
from sklearn.ensemble import ExtraTreesRegressor

# SklearnExtra is a custom wrapper class around scikit-learn estimators (defined elsewhere)
et_params = {'n_estimators': 1000, 'max_features': 2}
et = SklearnExtra(clf=ExtraTreesRegressor(), seed=Seed, params=et_params)
et.fit(x_train, y_train)
base_models = [rf, et, gb, ada, xg]

for i in base_models:
    print('Model ' + i.name())
    print('Training: ' + str(evaluate(i, x_train, y_train)))
    print('')
    print('Model ' + i.name())
    print('Test: ' + str(evaluate(i, x_test, y_test)))
    print('Test MAPE ' + str(mean_absolute_percentage_error(i, y_test, x_test)))
Model ExtraTreesRegressor(bootstrap=False, criterion='mse', max_depth=None,
max_features='auto', max_leaf_nodes=None,
min_impurity_decrease=0.0, min_impurity_split=None,
min_samples_leaf=1, min_samples_split=2,
min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=None,
oob_score=False, random_state=None, verbose=0,
warm_start=False)
Model Performance
Average MAE Error: 0.0165 degrees.
Average MSE Error: 0.0079 degrees.
Average RMS Error: 0.0887 degrees.
Training: end of test
Model Performance
Average MAE Error: 0.3572 degrees.
Average MSE Error: 0.2957 degrees.
Average RMS Error: 0.5438 degrees.
Test: end of test
We can't tell you whether you should do something; you have to decide whether it makes sense for your data.
That's why I asked in the comments about the transform and the statistics: once you change the scale of your targets, you can't compare the two models by the absolute size of their errors. RMSE is expressed in the units of the target, so it is only meaningful relative to the spread of that target. If your initial model has an RMSE of 30 while the targets range from 0 to 100 with a standard deviation of 20 (for example), that's not great. But if, after the transform, the targets range from 0 to 10 with a standard deviation of 3.5, then an RMSE of 0.5 might be better. A quick way to check is to normalise each RMSE by the spread of its own targets, as in the sketch below.
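A minimal sketch, assuming y_test_raw and y_test_log are placeholders for your original-scale and log-scale test targets (names not taken from your code), and using the two RMSE values you reported:
import numpy as np

def normalised_rmse(rmse, y):
    # Express the RMSE relative to the spread of the targets it was computed on
    y = np.asarray(y, dtype=float)
    return {'rmse/std': rmse / y.std(), 'rmse/range': rmse / (y.max() - y.min())}

print(normalised_rmse(30.0, y_test_raw))    # original-scale model
print(normalised_rmse(0.5438, y_test_log))  # log-scale model
If the two normalised numbers come out similar, the "drastic change" is mostly a change of units rather than a real improvement.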
The correct answer is a little bit subjective, but it boils down to this: if you make predictions on real-world data using the model in question, is the error metric within acceptable tolerances for the task at hand? For your initial model, is your business case such that a difference of 30 (seconds? minutes?) between predicted and actual runtime is "close enough"? For the second model, is an error of 0.54 in log units enough to render your predictions useless? If you want both errors in the same units, you can also undo the transform before scoring, as sketched below.
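A sketch, assuming et is the model trained on the log-transformed targets and x_test / y_test are the corresponding (log-scale) test split; exponentiating puts the error back in the original units so it can be compared with the RMSE of 30. Bear in mind that exponentiating log-scale predictions tends to under-estimate the conditional mean, so treat this as a rough comparison rather than an exact one.
import numpy as np
from sklearn import metrics

log_predictions = et.predict(x_test)  # predictions on the log scale
rmse_original_units = np.sqrt(metrics.mean_squared_error(np.exp(y_test),
                                                          np.exp(log_predictions)))
print('RMSE in original units: {:0.2f}'.format(rmse_original_units))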
See the "Useful" part of "All models are wrong, some are useful"