I have a problem working with PyCaret, which previously worked without any issues.
The problem started after I oversampled my data and saved it using pandas
and this question.
The file is here.
Then I read the file in a separate notebook.
import pandas as pd
import pycaret
from pycaret.utils import version
from pycaret.regression import *
from pycaret.classification import *
# Read clean data
starbucks_days = pd.read_csv('days_smote.csv')
# Drop a column
starbucks_days = starbucks_days.drop(['Unnamed: 0'], axis = 1)
starbucks_days = starbucks_days.drop(['transaction', 'offer_viewed', 'offer_received', 'offer_completed'], axis = 1)
starbucks_days = starbucks_days.drop(['label'], axis = 1)
Then I start to use PyCaret:
# Initialize Setup
starbucks_days1 = setup(starbucks_days, target = 'time_completed_viewed', session_id = 123, log_experiment = True, experiment_name = 'days1')
But I get an error:
ValueError: The least populated class in y has only 1 member, which is too few. The minimum number of groups for any class cannot be less than 2.
This GitHub issue gives some hints, so I checked a few things:
type(starbucks_days)
pandas.core.frame.DataFrame
starbucks_days['time_completed_viewed'].value_counts()
6.000000 1682
12.000000 1503
18.000000 1318
24.000000 1212
174.000000 1068
...
444.107530 1
226.213225 1
411.947513 1
236.001744 1
394.722944 1
Name: time_completed_viewed, Length: 3572, dtype: int64
Any tips on what I am missing? As I said, PyCaret works just fine with simple CSV files that were not oversampled.
In your imports, you imported classification after importing regression, so the classification functions (including setup) overwrote the regression functions of the same name in your environment.
This looks like a regression problem (the target is a continuous value), so you don't need to import classification at all.
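To see why a classification setup would complain here: as far as I can tell, the error comes from the stratified train/test split, which needs at least two rows for every distinct target value. A continuous target has many values that occur only once. Here is a minimal sketch with a toy list; the values are only illustrative, borrowed from your value_counts() output, not your real data:

```python
from collections import Counter

# Toy sample (illustrative only): two repeated values and two one-off values
# like the ones at the bottom of the value_counts() output.
target = [6.0, 6.0, 12.0, 12.0, 444.107530, 226.213225]

counts = Counter(target)

# Values that occur exactly once cannot be split across train and test
# when every distinct value is treated as a class.
singletons = [value for value, n in counts.items() if n == 1]

print(len(counts), "distinct values,", len(singletons), "seen only once")
```

If most of the distinct values in your real column are seen only once, treating the column as class labels does not make sense, and regression is the right tool.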
Get rid of this line from your code and it should work fine:
from pycaret.classification import *
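If you want to see the shadowing effect in isolation, here is a small sketch that does not need PyCaret at all. The two modules (fake_regression and fake_classification) are hypothetical stand-ins I build on the fly; each one defines its own setup(), and the second wildcard import wins:

```python
import sys
import types

# Two stand-in modules (hypothetical names, not PyCaret), each exposing setup().
fake_regression = types.ModuleType("fake_regression")
fake_classification = types.ModuleType("fake_classification")
exec("def setup():\n    return 'regression'", fake_regression.__dict__)
exec("def setup():\n    return 'classification'", fake_classification.__dict__)

# Register them so the import statements below can find them.
sys.modules["fake_regression"] = fake_regression
sys.modules["fake_classification"] = fake_classification

# Same import order as in the question: regression first, classification second.
namespace = {}
exec(
    "from fake_regression import *\n"
    "from fake_classification import *\n",
    namespace,
)

# The name setup now refers to the function from the module imported last.
result = namespace["setup"]()
print(result)
```

Another way to avoid this kind of clash, if you ever need both modules in one notebook, is to import each module under its own name (for example import pycaret.regression as pyreg) and call pyreg.setup(...) explicitly instead of using wildcard imports.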