I have the following code, which applies LightGBM to the dataset (link shared below). I get a negative r2 of -2.0687981990506565. The RMSE I get is very low, yet the r2 value is negative. How can the model perform badly while having a very low MSE on both the train and test data?
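import lightgbm
import pandas as pd
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

weights_data = pd.read_csv("dataset.csv")
columns = weights_data.columns
target = columns[-1]     # last column is the target (a single label gives a 1-D y)
features = columns[:-1]  # every other column is a feature

def regressor_model():
    print()
    X = weights_data[features].to_numpy()
    Y = weights_data[target].to_numpy() * 100
    # Hold out 20% of the rows for testing
    x_train, x_test, y_train, y_test = train_test_split(X, Y, train_size=0.8, random_state=2021)
    regressor = lightgbm.LGBMRegressor()  # default hyperparameters
    regressor.fit(x_train, y_train)
    y_pred = regressor.predict(x_test)
    # r2_score expects the true values first, then the predictions
    r2_score_value = r2_score(y_test, y_pred)
    print(r2_score_value)
    print()
    return regressor

regressor_model()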
Link to the dataset: https://drive.google.com/file/d/1W1G67215vNZpsU1BEiz5S4XO0XwZJhwR/view?usp=sharing
If the order of the r2_score arguments is swapped, as below, an r2 value of 0.0 is returned.
r2_score_value=r2_score(y_pred,y_test)
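A negative r2 means your model predicts worse than a constant baseline that always outputs the mean of the test targets. This does not contradict a low RMSE, because r2 is measured relative to the variance of the target: if the target values barely vary, even a small error can exceed that variance. From the code above, it looks like you are using the default parameters of LGBMRegressor(). You need to tune the parameters of your model; tuning them will probably solve your problem.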
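To see how a very low RMSE and a negative r2 can coexist, here is a minimal sketch in plain Python. The numbers are made up for illustration and are not taken from the linked dataset; `rmse` and `r2` are hypothetical helpers that follow the usual definitions (r2 = 1 - SS_res / SS_tot).

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Illustrative values only: the targets barely vary around 0.50,
# and the predictions are slightly but consistently off.
y_true = [0.50, 0.51, 0.49, 0.50]
y_pred = [0.52, 0.52, 0.52, 0.52]

model_rmse = rmse(y_true, y_pred)  # small in absolute terms (about 0.021)
model_r2 = r2(y_true, y_pred)      # strongly negative (about -8)

print("RMSE:", round(model_rmse, 4))
print("r2:  ", round(model_r2, 4))
```

The error is tiny in absolute terms, but the spread of the targets is even smaller, so the ratio SS_res / SS_tot exceeds 1 and r2 drops below zero. Comparing the RMSE with the standard deviation of the test targets is a quick way to check whether this is what is happening in your data.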
You can find a similar scenario here
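As for the 0.0 you get when the arguments are swapped: one possible explanation, which I have not verified against your dataset, is that the model returns the same value for every test row. With swapped arguments the constant predictions are then treated as the true values, so the total sum of squares is zero. As far as I know, scikit-learn's `r2_score` reports 0.0 in that case when the predictions are imperfect. The sketch below uses a hypothetical `r2` helper in plain Python that mimics that assumed convention, with made-up numbers.

```python
def r2(y_true, y_pred):
    """1 - SS_res / SS_tot, with an assumed fallback for a constant y_true."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0.0:
        # Assumption: mirror the convention of returning 1.0 for a perfect
        # fit and 0.0 otherwise when the true values have no variance.
        return 1.0 if ss_res == 0.0 else 0.0
    return 1.0 - ss_res / ss_tot

# Illustrative values only: the "model" predicts the same number every time.
y_test = [1.0, 2.0, 3.0, 4.0]
y_pred = [4.0, 4.0, 4.0, 4.0]

correct_order = r2(y_test, y_pred)  # true values first: negative
swapped_order = r2(y_pred, y_test)  # predictions passed as truth: 0.0

print("r2(y_test, y_pred):", correct_order)
print("r2(y_pred, y_test):", swapped_order)
```

If printing `y_pred` in your own code shows a single repeated value, the model is not learning any splits, which can happen with the default parameters on a very small training set.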