I used XGBRegressor to fit a small dataset with (data_size, feature_size) = (156, 328). Although random_state is given, the train/val history cannot be reproduced from run to run: sometimes training goes fine, sometimes the train/val loss explodes. Why does random_state have no effect, and how can I fix the exploding-loss issue?
Code:
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor
import pandas as pd

SEED = 123

dfs = pd.read_csv('./dir')
dfs_train, dfs_cv = train_test_split(dfs, train_size=0.8, shuffle=False, random_state=SEED)
df_train_x = dfs_train.drop(columns='Y')
df_train_y = dfs_train['Y']
df_cv_x = dfs_cv.drop(columns='Y')
df_cv_y = dfs_cv['Y']

params = {"booster": "gblinear",
          "eval_metric": "rmse",
          "predictor": "cpu_predictor",
          "max_depth": 16,
          "n_estimators": 100,
          "random_state": SEED}

model = XGBRegressor(**params)
model.fit(df_train_x.values, df_train_y.values,
          eval_set=[(df_train_x.values, df_train_y.values), (df_cv_x.values, df_cv_y.values)],
          eval_metric='rmse',
          verbose=True)
output 1 (exploding):
[0] validation_0-rmse:1.75475 validation_1-rmse:1.88660
[1] validation_0-rmse:1.25838 validation_1-rmse:1.67099
[2] validation_0-rmse:1.09559 validation_1-rmse:1.52534
[3] validation_0-rmse:1.13592 validation_1-rmse:1.36564
[4] validation_0-rmse:1.17923 validation_1-rmse:1.18143
[5] validation_0-rmse:1.02157 validation_1-rmse:1.34878
[6] validation_0-rmse:0.83439 validation_1-rmse:1.26116
[7] validation_0-rmse:0.75650 validation_1-rmse:1.32562
[8] validation_0-rmse:0.69412 validation_1-rmse:1.26147
[9] validation_0-rmse:0.65568 validation_1-rmse:1.11168
[10] validation_0-rmse:0.62501 validation_1-rmse:1.13932
[11] validation_0-rmse:0.61957 validation_1-rmse:1.17217
[12] validation_0-rmse:0.58313 validation_1-rmse:1.17873
[13] validation_0-rmse:0.69826 validation_1-rmse:1.28131
[14] validation_0-rmse:0.65318 validation_1-rmse:1.23954
[15] validation_0-rmse:0.69506 validation_1-rmse:1.17325
[16] validation_0-rmse:0.90857 validation_1-rmse:1.16924
[17] validation_0-rmse:1.29021 validation_1-rmse:1.23918
[18] validation_0-rmse:0.86403 validation_1-rmse:1.10940
[19] validation_0-rmse:0.74296 validation_1-rmse:1.09483
[20] validation_0-rmse:0.66514 validation_1-rmse:1.03155
[21] validation_0-rmse:0.60940 validation_1-rmse:0.97993
[22] validation_0-rmse:0.57345 validation_1-rmse:0.91434
[23] validation_0-rmse:0.56455 validation_1-rmse:0.95662
[24] validation_0-rmse:0.51317 validation_1-rmse:0.91908
[25] validation_0-rmse:0.61795 validation_1-rmse:1.19921
[26] validation_0-rmse:0.52034 validation_1-rmse:0.96785
[27] validation_0-rmse:0.79248 validation_1-rmse:1.32662
[28] validation_0-rmse:0.61955 validation_1-rmse:1.02642
[29] validation_0-rmse:0.59526 validation_1-rmse:1.12646
[30] validation_0-rmse:0.78931 validation_1-rmse:1.28633
[31] validation_0-rmse:0.50458 validation_1-rmse:1.08621
[32] validation_0-rmse:0.83105 validation_1-rmse:1.56490
[33] validation_0-rmse:0.62568 validation_1-rmse:1.38425
[34] validation_0-rmse:0.59277 validation_1-rmse:1.32925
[35] validation_0-rmse:0.54544 validation_1-rmse:1.30204
[36] validation_0-rmse:0.54612 validation_1-rmse:1.34128
[37] validation_0-rmse:0.54343 validation_1-rmse:1.36388
[38] validation_0-rmse:2.05047 validation_1-rmse:2.63729
[39] validation_0-rmse:7.35043 validation_1-rmse:7.61231
[40] validation_0-rmse:6.88989 validation_1-rmse:5.74990
[41] validation_0-rmse:6.68002 validation_1-rmse:6.98875
[42] validation_0-rmse:8.64272 validation_1-rmse:6.07278
[43] validation_0-rmse:5.42061 validation_1-rmse:4.87993
[44] validation_0-rmse:6.02975 validation_1-rmse:6.28529
[45] validation_0-rmse:5.53219 validation_1-rmse:6.61440
[46] validation_0-rmse:21.73743 validation_1-rmse:12.64479
[47] validation_0-rmse:14.01517 validation_1-rmse:15.05459
[48] validation_0-rmse:9.78612 validation_1-rmse:12.35174
[49] validation_0-rmse:8.14741 validation_1-rmse:10.34468
[50] validation_0-rmse:7.37258 validation_1-rmse:9.14025
[51] validation_0-rmse:13.28054 validation_1-rmse:15.57369
[52] validation_0-rmse:9.72434 validation_1-rmse:8.82560
[53] validation_0-rmse:7.43478 validation_1-rmse:8.69813
[54] validation_0-rmse:6.99072 validation_1-rmse:7.90911
[55] validation_0-rmse:6.33418 validation_1-rmse:7.16309
[56] validation_0-rmse:5.98817 validation_1-rmse:6.86138
[57] validation_0-rmse:6.63810 validation_1-rmse:7.32003
[58] validation_0-rmse:12.34689 validation_1-rmse:17.12449
[59] validation_0-rmse:11.46232 validation_1-rmse:11.11735
[60] validation_0-rmse:8.22308 validation_1-rmse:8.42130
[61] validation_0-rmse:8.03585 validation_1-rmse:9.78268
[62] validation_0-rmse:6.08736 validation_1-rmse:9.08017
[63] validation_0-rmse:5.65990 validation_1-rmse:9.01591
[64] validation_0-rmse:4.94540 validation_1-rmse:8.60943
[65] validation_0-rmse:12.16186 validation_1-rmse:9.97841
[66] validation_0-rmse:24.36063 validation_1-rmse:30.90603
[67] validation_0-rmse:23.63998 validation_1-rmse:15.92554
[68] validation_0-rmse:38.54043 validation_1-rmse:49.15125
[69] validation_0-rmse:26.96050 validation_1-rmse:35.93348
[70] validation_0-rmse:36.68499 validation_1-rmse:35.61835
[71] validation_0-rmse:44.18962 validation_1-rmse:41.25709
[72] validation_0-rmse:35.57274 validation_1-rmse:36.54894
[73] validation_0-rmse:32.26445 validation_1-rmse:37.02519
[74] validation_0-rmse:38.02793 validation_1-rmse:60.88339
[75] validation_0-rmse:29.93598 validation_1-rmse:46.07689
[76] validation_0-rmse:26.86872 validation_1-rmse:41.39200
[77] validation_0-rmse:24.87459 validation_1-rmse:41.77614
[78] validation_0-rmse:29.63828 validation_1-rmse:27.51796
[79] validation_0-rmse:23.43373 validation_1-rmse:36.54044
[80] validation_0-rmse:21.80307 validation_1-rmse:38.42451
[81] validation_0-rmse:45.01890 validation_1-rmse:63.13959
[82] validation_0-rmse:32.98600 validation_1-rmse:48.51588
[83] validation_0-rmse:1154.83826 validation_1-rmse:1046.83862
[84] validation_0-rmse:596.76422 validation_1-rmse:899.20294
[85] validation_0-rmse:8772.32227 validation_1-rmse:14788.31152
[86] validation_0-rmse:15234.09082 validation_1-rmse:14237.62500
[87] validation_0-rmse:12527.86426 validation_1-rmse:13914.09277
[88] validation_0-rmse:11000.84277 validation_1-rmse:13445.76074
[89] validation_0-rmse:15696.28613 validation_1-rmse:13946.85840
[90] validation_0-rmse:85210.62500 validation_1-rmse:127271.79688
[91] validation_0-rmse:116500.62500 validation_1-rmse:215355.65625
[92] validation_0-rmse:149855.62500 validation_1-rmse:147734.62500
[93] validation_0-rmse:151028.76562 validation_1-rmse:97522.35938
[94] validation_0-rmse:286164.06250 validation_1-rmse:359728.84375
[95] validation_0-rmse:149474.23438 validation_1-rmse:182052.50000
[96] validation_0-rmse:156148.78125 validation_1-rmse:217708.90625
[97] validation_0-rmse:114551.62500 validation_1-rmse:151682.79688
[98] validation_0-rmse:104612.85156 validation_1-rmse:170244.31250
[99] validation_0-rmse:256178.57812 validation_1-rmse:246638.64062
output 2 (fine):
[0] validation_0-rmse:2.73642 validation_1-rmse:2.73807
[1] validation_0-rmse:0.49221 validation_1-rmse:0.80462
[2] validation_0-rmse:0.31022 validation_1-rmse:0.73898
[3] validation_0-rmse:0.26974 validation_1-rmse:0.76231
[4] validation_0-rmse:0.22617 validation_1-rmse:0.61529
[5] validation_0-rmse:0.20344 validation_1-rmse:0.66840
[6] validation_0-rmse:0.18369 validation_1-rmse:0.62763
[7] validation_0-rmse:0.17476 validation_1-rmse:0.64966
[8] validation_0-rmse:0.16620 validation_1-rmse:0.60988
[9] validation_0-rmse:0.16017 validation_1-rmse:0.62756
[10] validation_0-rmse:0.15479 validation_1-rmse:0.61354
[11] validation_0-rmse:0.15247 validation_1-rmse:0.63041
[12] validation_0-rmse:0.14641 validation_1-rmse:0.58863
[13] validation_0-rmse:0.14544 validation_1-rmse:0.55724
[14] validation_0-rmse:0.16165 validation_1-rmse:0.54285
[15] validation_0-rmse:0.14305 validation_1-rmse:0.59282
[16] validation_0-rmse:0.13728 validation_1-rmse:0.57130
[17] validation_0-rmse:0.13325 validation_1-rmse:0.56199
[18] validation_0-rmse:0.12974 validation_1-rmse:0.53802
[19] validation_0-rmse:0.12596 validation_1-rmse:0.54721
[20] validation_0-rmse:0.12342 validation_1-rmse:0.54109
[21] validation_0-rmse:0.12143 validation_1-rmse:0.53365
[22] validation_0-rmse:0.11954 validation_1-rmse:0.53702
[23] validation_0-rmse:0.11721 validation_1-rmse:0.52632
[24] validation_0-rmse:0.11521 validation_1-rmse:0.52671
[25] validation_0-rmse:0.11325 validation_1-rmse:0.51527
[26] validation_0-rmse:0.11148 validation_1-rmse:0.51392
[27] validation_0-rmse:0.10978 validation_1-rmse:0.49357
[28] validation_0-rmse:0.10803 validation_1-rmse:0.50030
[29] validation_0-rmse:0.10657 validation_1-rmse:0.49821
[30] validation_0-rmse:0.10624 validation_1-rmse:0.47754
[31] validation_0-rmse:0.10450 validation_1-rmse:0.48614
[32] validation_0-rmse:0.10336 validation_1-rmse:0.47555
[33] validation_0-rmse:0.10213 validation_1-rmse:0.47663
[34] validation_0-rmse:0.10139 validation_1-rmse:0.47462
[35] validation_0-rmse:0.09979 validation_1-rmse:0.46085
[36] validation_0-rmse:0.09875 validation_1-rmse:0.46658
[37] validation_0-rmse:0.09780 validation_1-rmse:0.46026
[38] validation_0-rmse:0.09702 validation_1-rmse:0.45724
[39] validation_0-rmse:0.09638 validation_1-rmse:0.46206
[40] validation_0-rmse:0.09570 validation_1-rmse:0.46017
[41] validation_0-rmse:0.09500 validation_1-rmse:0.45447
[42] validation_0-rmse:0.09431 validation_1-rmse:0.45097
[43] validation_0-rmse:0.09371 validation_1-rmse:0.45112
[44] validation_0-rmse:0.09322 validation_1-rmse:0.44389
[45] validation_0-rmse:0.09271 validation_1-rmse:0.45073
[46] validation_0-rmse:0.09199 validation_1-rmse:0.44402
[47] validation_0-rmse:0.09145 validation_1-rmse:0.44305
[48] validation_0-rmse:0.09091 validation_1-rmse:0.43982
[49] validation_0-rmse:0.09028 validation_1-rmse:0.43441
[50] validation_0-rmse:0.09004 validation_1-rmse:0.44175
[51] validation_0-rmse:0.08931 validation_1-rmse:0.43299
[52] validation_0-rmse:0.09034 validation_1-rmse:0.41695
[53] validation_0-rmse:0.08860 validation_1-rmse:0.41444
[54] validation_0-rmse:0.08798 validation_1-rmse:0.40965
[55] validation_0-rmse:0.08734 validation_1-rmse:0.41013
[56] validation_0-rmse:0.08744 validation_1-rmse:0.39615
[57] validation_0-rmse:0.08636 validation_1-rmse:0.40437
[58] validation_0-rmse:0.08597 validation_1-rmse:0.40617
[59] validation_0-rmse:0.08559 validation_1-rmse:0.40638
[60] validation_0-rmse:0.08518 validation_1-rmse:0.41139
[61] validation_0-rmse:0.08472 validation_1-rmse:0.40855
[62] validation_0-rmse:0.08427 validation_1-rmse:0.40601
[63] validation_0-rmse:0.08386 validation_1-rmse:0.40446
[64] validation_0-rmse:0.08357 validation_1-rmse:0.40676
[65] validation_0-rmse:0.08347 validation_1-rmse:0.39509
[66] validation_0-rmse:0.08295 validation_1-rmse:0.40182
[67] validation_0-rmse:0.08269 validation_1-rmse:0.40343
[68] validation_0-rmse:0.08294 validation_1-rmse:0.39187
[69] validation_0-rmse:0.08231 validation_1-rmse:0.39857
[70] validation_0-rmse:0.08200 validation_1-rmse:0.39805
[71] validation_0-rmse:0.08178 validation_1-rmse:0.39975
[72] validation_0-rmse:0.08200 validation_1-rmse:0.40522
[73] validation_0-rmse:0.08104 validation_1-rmse:0.40048
[74] validation_0-rmse:0.08073 validation_1-rmse:0.39871
[75] validation_0-rmse:0.08041 validation_1-rmse:0.39395
[76] validation_0-rmse:0.08022 validation_1-rmse:0.39725
[77] validation_0-rmse:0.07989 validation_1-rmse:0.39610
[78] validation_0-rmse:0.07964 validation_1-rmse:0.39375
[79] validation_0-rmse:0.07942 validation_1-rmse:0.38979
[80] validation_0-rmse:0.07920 validation_1-rmse:0.39015
[81] validation_0-rmse:0.07914 validation_1-rmse:0.38749
[82] validation_0-rmse:0.07890 validation_1-rmse:0.38585
[83] validation_0-rmse:0.07868 validation_1-rmse:0.38665
[84] validation_0-rmse:0.07842 validation_1-rmse:0.38147
[85] validation_0-rmse:0.07819 validation_1-rmse:0.38246
[86] validation_0-rmse:0.07805 validation_1-rmse:0.38351
[87] validation_0-rmse:0.07796 validation_1-rmse:0.37884
[88] validation_0-rmse:0.07770 validation_1-rmse:0.38242
[89] validation_0-rmse:0.07750 validation_1-rmse:0.37763
[90] validation_0-rmse:0.07724 validation_1-rmse:0.37871
[91] validation_0-rmse:0.07702 validation_1-rmse:0.37974
[92] validation_0-rmse:0.07679 validation_1-rmse:0.38147
[93] validation_0-rmse:0.07664 validation_1-rmse:0.37735
[94] validation_0-rmse:0.07644 validation_1-rmse:0.37873
[95] validation_0-rmse:0.07632 validation_1-rmse:0.37661
[96] validation_0-rmse:0.07610 validation_1-rmse:0.37877
[97] validation_0-rmse:0.07587 validation_1-rmse:0.37659
[98] validation_0-rmse:0.07572 validation_1-rmse:0.37648
[99] validation_0-rmse:0.07556 validation_1-rmse:0.37356
UPDATED:
Using the public Boston housing dataset and setting nthread=1, the training process became reproducible with no exploding issue. It seems the problem lies in my dataset. Code and output are as follows:
Code:
from sklearn.datasets import load_boston
import sklearn
from xgboost import XGBRegressor
import pandas as pd
import numpy as np

SEED = 123

X, y = load_boston(return_X_y=True)
np.random.seed(SEED)
indices = np.random.permutation(X.shape[0])
training_idx, test_idx = indices[:80], indices[80:]
train_X, test_X = X[training_idx, :], X[test_idx, :]
train_y, test_y = y[training_idx], y[test_idx]

params = {"booster": "gblinear",
          "eval_metric": "rmse",
          "predictor": "cpu_predictor",
          "max_depth": 16,
          "n_estimators": 100,
          "random_state": SEED,
          "nthread": 1,
          "early_stopping_rounds": 5}

model = XGBRegressor(**params)
model.get_xgb_params()
model.fit(train_X, train_y,
          eval_set=[(train_X, train_y), (test_X, test_y)],
          eval_metric='rmse',
          verbose=True)
output:
{'objective': 'reg:squarederror',
'base_score': None,
'booster': 'gblinear',
'colsample_bylevel': None,
'colsample_bynode': None,
'colsample_bytree': None,
'gamma': None,
'gpu_id': None,
'interaction_constraints': None,
'learning_rate': None,
'max_delta_step': None,
'max_depth': 16,
'min_child_weight': None,
'monotone_constraints': None,
'n_jobs': None,
'num_parallel_tree': None,
'random_state': 123,
'reg_alpha': None,
'reg_lambda': None,
'scale_pos_weight': None,
'subsample': None,
'tree_method': None,
'validate_parameters': None,
'verbosity': None,
'eval_metric': 'rmse',
'predictor': 'cpu_predictor',
'nthread': 1,
'early_stopping_rounds': 5}
Parameters: { early_stopping_rounds, max_depth, predictor } might not be used.
This may not be accurate due to some parameters are only used in language bindings but
passed down to XGBoost core. Or some parameters are not used but slip through this
verification. Please open an issue if you find above cases.
[0] validation_0-rmse:8.38695 validation_1-rmse:8.88360
[1] validation_0-rmse:7.56356 validation_1-rmse:8.06591
[2] validation_0-rmse:7.24844 validation_1-rmse:7.71700
[3] validation_0-rmse:7.03799 validation_1-rmse:7.46547
[4] validation_0-rmse:6.86494 validation_1-rmse:7.25173
[5] validation_0-rmse:6.71517 validation_1-rmse:7.06397
[6] validation_0-rmse:6.58385 validation_1-rmse:6.89819
[7] validation_0-rmse:6.46814 validation_1-rmse:6.75184
[8] validation_0-rmse:6.36585 validation_1-rmse:6.62274
[9] validation_0-rmse:6.27512 validation_1-rmse:6.50893
[10] validation_0-rmse:6.19437 validation_1-rmse:6.40863
[11] validation_0-rmse:6.12223 validation_1-rmse:6.32025
[12] validation_0-rmse:6.05754 validation_1-rmse:6.24240
[13] validation_0-rmse:5.99930 validation_1-rmse:6.17386
[14] validation_0-rmse:5.94666 validation_1-rmse:6.11355
[15] validation_0-rmse:5.89892 validation_1-rmse:6.06053
[16] validation_0-rmse:5.85546 validation_1-rmse:6.01398
[17] validation_0-rmse:5.81576 validation_1-rmse:5.97318
[18] validation_0-rmse:5.77938 validation_1-rmse:5.93750
[19] validation_0-rmse:5.74595 validation_1-rmse:5.90638
[20] validation_0-rmse:5.71514 validation_1-rmse:5.87933
[21] validation_0-rmse:5.68669 validation_1-rmse:5.85592
[22] validation_0-rmse:5.66035 validation_1-rmse:5.83575
[23] validation_0-rmse:5.63591 validation_1-rmse:5.81850
[24] validation_0-rmse:5.61321 validation_1-rmse:5.80385
[25] validation_0-rmse:5.59208 validation_1-rmse:5.79153
[26] validation_0-rmse:5.57239 validation_1-rmse:5.78130
[27] validation_0-rmse:5.55401 validation_1-rmse:5.77294
[28] validation_0-rmse:5.53685 validation_1-rmse:5.76626
[29] validation_0-rmse:5.52081 validation_1-rmse:5.76107
[30] validation_0-rmse:5.50579 validation_1-rmse:5.75723
[31] validation_0-rmse:5.49174 validation_1-rmse:5.75458
[32] validation_0-rmse:5.47856 validation_1-rmse:5.75300
[33] validation_0-rmse:5.46621 validation_1-rmse:5.75237
[34] validation_0-rmse:5.45463 validation_1-rmse:5.75258
[35] validation_0-rmse:5.44376 validation_1-rmse:5.75354
[36] validation_0-rmse:5.43355 validation_1-rmse:5.75516
[37] validation_0-rmse:5.42396 validation_1-rmse:5.75736
[38] validation_0-rmse:5.41496 validation_1-rmse:5.76008
[39] validation_0-rmse:5.40649 validation_1-rmse:5.76324
[40] validation_0-rmse:5.39853 validation_1-rmse:5.76679
[41] validation_0-rmse:5.39104 validation_1-rmse:5.77068
[42] validation_0-rmse:5.38399 validation_1-rmse:5.77484
[43] validation_0-rmse:5.37735 validation_1-rmse:5.77926
[44] validation_0-rmse:5.37111 validation_1-rmse:5.78387
[45] validation_0-rmse:5.36522 validation_1-rmse:5.78865
[46] validation_0-rmse:5.35967 validation_1-rmse:5.79357
[47] validation_0-rmse:5.35443 validation_1-rmse:5.79859
[48] validation_0-rmse:5.34949 validation_1-rmse:5.80369
[49] validation_0-rmse:5.34483 validation_1-rmse:5.80885
[50] validation_0-rmse:5.34042 validation_1-rmse:5.81403
[51] validation_0-rmse:5.33626 validation_1-rmse:5.81924
[52] validation_0-rmse:5.33231 validation_1-rmse:5.82444
[53] validation_0-rmse:5.32859 validation_1-rmse:5.82962
[54] validation_0-rmse:5.32505 validation_1-rmse:5.83477
[55] validation_0-rmse:5.32170 validation_1-rmse:5.83988
[56] validation_0-rmse:5.31853 validation_1-rmse:5.84493
[57] validation_0-rmse:5.31551 validation_1-rmse:5.84992
[58] validation_0-rmse:5.31265 validation_1-rmse:5.85483
[59] validation_0-rmse:5.30992 validation_1-rmse:5.85966
[60] validation_0-rmse:5.30732 validation_1-rmse:5.86441
[61] validation_0-rmse:5.30485 validation_1-rmse:5.86906
[62] validation_0-rmse:5.30249 validation_1-rmse:5.87362
[63] validation_0-rmse:5.30024 validation_1-rmse:5.87808
[64] validation_0-rmse:5.29808 validation_1-rmse:5.88244
[65] validation_0-rmse:5.29602 validation_1-rmse:5.88668
[66] validation_0-rmse:5.29404 validation_1-rmse:5.89082
[67] validation_0-rmse:5.29215 validation_1-rmse:5.89485
[68] validation_0-rmse:5.29033 validation_1-rmse:5.89877
[69] validation_0-rmse:5.28858 validation_1-rmse:5.90257
[70] validation_0-rmse:5.28690 validation_1-rmse:5.90626
[71] validation_0-rmse:5.28527 validation_1-rmse:5.90984
[72] validation_0-rmse:5.28371 validation_1-rmse:5.91331
[73] validation_0-rmse:5.28219 validation_1-rmse:5.91666
[74] validation_0-rmse:5.28073 validation_1-rmse:5.91990
[75] validation_0-rmse:5.27931 validation_1-rmse:5.92303
[76] validation_0-rmse:5.27794 validation_1-rmse:5.92605
[77] validation_0-rmse:5.27661 validation_1-rmse:5.92896
[78] validation_0-rmse:5.27531 validation_1-rmse:5.93176
[79] validation_0-rmse:5.27405 validation_1-rmse:5.93445
[80] validation_0-rmse:5.27282 validation_1-rmse:5.93704
[81] validation_0-rmse:5.27163 validation_1-rmse:5.93953
[82] validation_0-rmse:5.27046 validation_1-rmse:5.94192
[83] validation_0-rmse:5.26932 validation_1-rmse:5.94420
[84] validation_0-rmse:5.26820 validation_1-rmse:5.94639
[85] validation_0-rmse:5.26711 validation_1-rmse:5.94848
[86] validation_0-rmse:5.26604 validation_1-rmse:5.95048
[87] validation_0-rmse:5.26499 validation_1-rmse:5.95238
[88] validation_0-rmse:5.26396 validation_1-rmse:5.95420
[89] validation_0-rmse:5.26294 validation_1-rmse:5.95592
[90] validation_0-rmse:5.26195 validation_1-rmse:5.95756
[91] validation_0-rmse:5.26097 validation_1-rmse:5.95912
[92] validation_0-rmse:5.26000 validation_1-rmse:5.96059
[93] validation_0-rmse:5.25905 validation_1-rmse:5.96198
[94] validation_0-rmse:5.25811 validation_1-rmse:5.96329
[95] validation_0-rmse:5.25718 validation_1-rmse:5.96453
[96] validation_0-rmse:5.25627 validation_1-rmse:5.96569
[97] validation_0-rmse:5.25537 validation_1-rmse:5.96678
[98] validation_0-rmse:5.25447 validation_1-rmse:5.96779
[99] validation_0-rmse:5.25359 validation_1-rmse:5.96874
Weird. Here's what I'd try, in order:
1. Call get_params()/get_xgb_params() on your XGBRegressor model to make sure it actually used the random_state parameter you passed in. Ditto, look at the verbose log to make sure training used it.
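For example, a minimal check, reusing the model from your snippet after fit (both lines should print the seed you passed, 123):

print(model.get_params()['random_state'])      # seed at the sklearn-wrapper level
print(model.get_xgb_params()['random_state'])  # seed handed down to the XGBoost core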
2. Look at your target variable y. Is its distribution very weird or non-continuous? Please show us a plot or histogram, or at least some summary statistics (min, max, mean, median, sd, 1st and 3rd quartiles). Is the un-stratified split affecting your training? (Show the descriptive statistics before and after the split, and on the eval set too; these three sets shouldn't differ wildly.) Is it easier to model log(y), sqrt(y), exp(y) or some such? Can you debug which rows are contributing to the CV error?
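As a sketch of what to look at, reusing dfs, df_train_y and df_cv_y from your snippet (matplotlib assumed available):

import matplotlib.pyplot as plt

print(dfs['Y'].describe())     # count/mean/sd/min/quartiles/max of the full target
print(df_train_y.describe())   # same stats on the train split
print(df_cv_y.describe())      # same stats on the eval split; the three shouldn't differ wildly

dfs['Y'].hist(bins=30)         # quick look at the shape of y's distribution
plt.show()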
3. Maybe eval_metric='rmse' and the default objective reg:squarederror are very unsuitable for your target variable y. We should be able to tell from the plot of y, but try other eval_metrics and objectives (https://xgboost.readthedocs.io/en/latest/parameter.html#learning-task-parameters), like eval_metric='logloss', 'mae', 'rmsle', etc. See that doc for the full list of objective and eval_metric values.
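For instance, a sketch of swapping the learning task on your original setup; reg:squaredlogerror / 'rmsle' are illustrative picks of mine and assume y is non-negative:

params_alt = {"booster": "gblinear",
              "objective": "reg:squaredlogerror",  # illustrative; requires non-negative y
              "eval_metric": "rmsle",
              "n_estimators": 100,
              "random_state": SEED}
model = XGBRegressor(**params_alt)
model.fit(df_train_x.values, df_train_y.values,
          eval_set=[(df_train_x.values, df_train_y.values),
                    (df_cv_x.values, df_cv_y.values)],
          verbose=True)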
4. Set the nthread parameter to 1 (single-core). The default is nthread=-1 (use all cores). Then rerun runs 1 and 2 and update the results in your question.
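A minimal sketch of that change on your original setup (only nthread added, everything else unchanged):

params["nthread"] = 1  # default is -1 (all cores); one thread gives a fixed update order
model = XGBRegressor(**params)
model.fit(df_train_x.values, df_train_y.values,
          eval_set=[(df_train_x.values, df_train_y.values),
                    (df_cv_x.values, df_cv_y.values)],
          verbose=True)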