python pandas sklearn-pandas

ValueError found while trying to use pandas for multiple regression


I'm trying to run a simple multiple linear regression using pandas and scikit-learn on a large dataset, but I'm getting this error: ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

The code is:

from sklearn import linear_model
import pandas as pd

data = pd.read_csv('manydatas.csv')

x = data[['Bedrooms', 'City', 'Age']]
y = data['Selling Price']

line = linear_model.LinearRegression(x,y)

line.fit(x,y)

I'd appreciate any help with this, thanks

Edit: Here is a Drive link to the .csv file with my data. There are over a thousand rows, so I'm just linking the whole thing: https://drive.google.com/file/d/1VCNJZNKYRmUd7A6qQlDnzTbiO_7x7I3s/view?usp=sharing


Solution

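  • Don't pass your data to the model constructor. LinearRegression() takes configuration options (such as fit_intercept), not data, so in LinearRegression(x,y) the DataFrame is most likely stored as one of those options and later evaluated as a boolean, which raises the ValueError. Create the model with no arguments and pass x and y to fit() only: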

    line = linear_model.LinearRegression().fit(x,y)
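
For a version that runs without the original CSV, here is a minimal sketch of the same fix. The small DataFrame and its values are made up for illustration (they are not from 'manydatas.csv'), and it assumes every feature column is numeric, so 'City' is left out here.

```python
import pandas as pd
from sklearn import linear_model

# Synthetic stand-in for 'manydatas.csv' (hypothetical values, numeric columns only).
data = pd.DataFrame({
    "Bedrooms": [2, 3, 4, 3, 5],
    "Age": [30, 12, 5, 20, 1],
})
# Build the target as an exact linear function so the fit is easy to check.
data["Selling Price"] = 100_000 + 50_000 * data["Bedrooms"] - 1_000 * data["Age"]

x = data[["Bedrooms", "Age"]]
y = data["Selling Price"]

# Create the estimator with no data, then hand the data to fit().
line = linear_model.LinearRegression()
line.fit(x, y)

# Inspect the fitted coefficients and predict for one new (hypothetical) house.
print(line.coef_, line.intercept_)
new_house = pd.DataFrame({"Bedrooms": [3], "Age": [10]})
print(line.predict(new_house))
```

Since fit() returns the estimator itself, the two-step form above and the one-liner LinearRegression().fit(x, y) produce the same fitted model.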