python python-3.x jupyter-notebook linear-regression

Python - Linear Regression Model by Group


I would like help building a linear regression model for each 'ClientID' group.

I have managed to pull together the functions below, but I am getting errors. I am very new to Python, so I am open to any other ideas you may have.

The data comes from my dataframe dfTotal3, and the two variables I am using are Steps (the number of steps the client took in a day) and WeightDiff (the change in weight since the last input).

def model(dfTotal3, target):
    y = dfTotal3[['Steps']].values
    X = dfTotal3[['WeightDiff']].values
    return np.squeeze(LinearRegression().fit(X, y).predict(target))

def group_predictions(df, target):
    target = dfWeightComp['DTWDG']
    return dfTotal3.groupby('ClientID').apply(model, target)

group_predictions(dfTotal3, dfTotal3['DTWDG'])

Error Output:

ValueError: Expected 2D array, got 1D array instead:
array=[-0.03231707 -0.03780488 -0.04512195 -0.04615385].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

Solution

  • I think the error comes from:

    return np.squeeze(LinearRegression().fit(X, y).predict(target))
    

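    Specifically, I suspect the problem is the call .predict(target): predict expects a 2D array of shape (n_samples, n_features), but target is a 1D Series, which is what the ValueError reports.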

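    This post suggests passing a 2D input instead. Selecting the column with double brackets returns a one-column DataFrame rather than a 1D Series, so change: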

    target = dfWeightComp['DTWDG']

    to:

    target = dfWeightComp[['DTWDG']]
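
To make the shape issue concrete, here is a minimal sketch of a per-client fit that passes a 2D target to predict. The sample values, the helper name fit_and_predict, and the small target frame are all made up for illustration; they only stand in for dfTotal3 and dfWeightComp['DTWDG'] from the question. The sketch also loops over the groups rather than using groupby(...).apply, which is a substitution on my part and not the original approach.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical sample data standing in for dfTotal3 (values are made up).
df = pd.DataFrame({
    "ClientID":   [1, 1, 1, 2, 2, 2],
    "WeightDiff": [-0.5, 0.0, 0.5, -1.0, 0.0, 1.0],
    "Steps":      [9000, 8000, 7000, 12000, 10000, 8000],
})

# Hypothetical prediction inputs standing in for dfWeightComp['DTWDG'].
target = pd.DataFrame({"DTWDG": [-0.25, 0.25]})

# Single brackets give a 1D Series; double brackets give a 2D DataFrame.
# predict() needs the 2D form, shape (n_targets, 1).
target_2d = target[["DTWDG"]].to_numpy()


def fit_and_predict(group, target_2d):
    """Fit one regression on a single client's rows and predict at target_2d.

    The feature/response roles mirror the question's code
    (X = WeightDiff, y = Steps).
    """
    X = group[["WeightDiff"]].to_numpy()  # 2D, shape (n_samples, 1)
    y = group["Steps"].to_numpy()         # 1D, shape (n_samples,)
    reg = LinearRegression().fit(X, y)
    return reg.predict(target_2d)         # 1D, one value per target row


# One fitted model and one prediction array per ClientID.
predictions = {
    client_id: fit_and_predict(group, target_2d)
    for client_id, group in df.groupby("ClientID")
}

for client_id, values in predictions.items():
    print(client_id, values)
```

Because y is passed as a 1D array here, predict already returns a 1D array, so np.squeeze is not needed. If you would rather keep groupby(...).apply, the same double-bracket target should work with the original model function.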