python machine-learning regression lightgbm

Getting a negative r2 value


I have the following code, which applies LightGBM to the dataset linked below. I get a negative r2 of -2.0687981990506565. The RMSE I get is very low, yet the r2 value is negative. How can the model perform badly while having a very low MSE on both the train and test data?

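import lightgbm
import pandas as pd
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

weights_data = pd.read_csv("dataset.csv")
columns = weights_data.columns
target = columns[-1:]    # last column is the target
features = columns[:-1]  # all remaining columns are features

def regressor_model():
  X = weights_data[features].to_numpy()
  # ravel() flattens the single target column into the 1-D array that fit() expects
  Y = weights_data[target].to_numpy().ravel() * 100
  x_train, x_test, y_train, y_test = train_test_split(X, Y, train_size=0.8, random_state=2021)
  regressor = lightgbm.LGBMRegressor()  # default hyperparameters
  regressor.fit(x_train, y_train)
  y_pred = regressor.predict(x_test)
  # r2_score expects the true values first, then the predictions
  r2_score_value = r2_score(y_test, y_pred)
  print(r2_score_value)
  return regressor

regressor_model()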

Link for dataset https://drive.google.com/file/d/1W1G67215vNZpsU1BEiz5S4XO0XwZJhwR/view?usp=sharing

If the order of the r2_score arguments is swapped, as below, an r2 value of 0.0 is returned.

r2_score_value=r2_score(y_pred,y_test)
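The score is not symmetric in its two arguments, because the first one is treated as the reference whose variance goes in the denominator. The sketch below uses a hand-written helper (`r2` is a hypothetical name, not scikit-learn's implementation) and made-up numbers, not the linked dataset. The handling of a constant reference is an assumption that mirrors the default described in scikit-learn's `r2_score` documentation.

```python
def r2(y_true, y_pred):
    """Hand-written R^2 = 1 - SS_res / SS_tot, with y_true as the reference."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        # Assumed convention when the reference is constant:
        # 1.0 for perfect predictions, otherwise 0.0.
        return 1.0 if ss_res == 0 else 0.0
    return 1 - ss_res / ss_tot


# Made-up values for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
varied_pred = [2.0, 2.0, 3.0, 3.0]
constant_pred = [2.5, 2.5, 2.5, 2.5]

correct_order = r2(observed, varied_pred)       # true values first
swapped_order = r2(varied_pred, observed)       # arguments swapped
swapped_constant = r2(constant_pred, observed)  # swapped, constant predictions

print(correct_order, swapped_order, swapped_constant)
```

With these numbers the correct order gives 0.6, the swapped order gives -1.0, and the swapped call with constant predictions gives 0.0. If the swapped call on the real data returns exactly 0.0, one thing worth checking is whether y_pred contains the same value for every test row.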

Solution

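  • A negative r-square means your model fits the test data worse than a constant prediction equal to the mean of the target; it does not necessarily mean the model is guessing at random. An RMSE that looks low in absolute terms can still be large relative to the spread of the target values, and that relative error is what r-square measures. From the above code I guess you are using the default parameters of LGBMRegressor(). You need to tune the parameters of your model; tuning them might solve your problem.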

    You can find a similar scenario here
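To see how a small RMSE and a negative r2 can coexist, here is a minimal sketch in plain Python. The numbers are invented for illustration and do not come from the linked dataset, and the `rmse` and `r2` helpers are hypothetical hand-written functions. The target values barely vary, so even small errors exceed the spread of the target.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Hand-written R^2 = 1 - SS_res / SS_tot (assumes y_true is not constant)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Invented values: the target has a very narrow range.
y_test = [0.50, 0.51, 0.49, 0.50]
y_pred = [0.52, 0.48, 0.51, 0.49]

# Baseline that always predicts the mean of the test targets.
mean_baseline = [sum(y_test) / len(y_test)] * len(y_test)

model_rmse = rmse(y_test, y_pred)
model_r2 = r2(y_test, y_pred)
baseline_rmse = rmse(y_test, mean_baseline)
baseline_r2 = r2(y_test, mean_baseline)

print("model:    rmse=%.4f r2=%.2f" % (model_rmse, model_r2))
print("baseline: rmse=%.4f r2=%.2f" % (baseline_rmse, baseline_r2))
```

Here the model's RMSE is about 0.021, which looks small, but the mean baseline reaches about 0.007, so the model's r2 comes out near -8. Comparing the RMSE with the standard deviation of y_test, or with a mean-only baseline, is a quick way to judge whether an RMSE is actually low for a given dataset.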