Mock data:
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'country': ['USA', 'USA', 'USA', 'USA', 'USA', 'Canada', 'Canada', 'Canada', 'USA', 'Canada']
})
Let's say I want to sample one observation for each country:
df.groupby('country').sample(1)
I get this error:
AttributeError: Cannot access callable attribute 'sample' of 'DataFrameGroupBy' objects, try using the 'apply' method
I tried resetting the index, but that didn't solve the problem. I also tried the answer here, but it didn't work either. What am I doing wrong?
EDIT: this question has a follow-up here.
As per the error, use apply(). Passing group_keys=False removes the additional country index level.
>>> df.groupby('country', group_keys=False).apply(lambda df: df.sample(1))
   id country
6   7  Canada
2   3     USA
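To make the effect of group_keys easier to see, here is a small sketch that runs the same apply call both ways and compares the resulting indexes. The helper name pick_one and the fixed random_state are my own additions so the draw is repeatable; they are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "id": list(range(1, 11)),
    "country": ["USA"] * 5 + ["Canada"] * 3 + ["USA", "Canada"],
})

def pick_one(group):
    # Hypothetical helper: draw one row from a single group.
    # random_state is fixed only so repeated runs give the same rows.
    return group.sample(n=1, random_state=0)

# With group_keys=True the country label is prepended as an extra index level.
with_keys = df.groupby("country", group_keys=True).apply(pick_one)

# With group_keys=False the result keeps only the original row labels.
without_keys = df.groupby("country", group_keys=False).apply(pick_one)

print(with_keys.index.nlevels, without_keys.index.nlevels)
print(without_keys)
```

Because the second result keeps the original row labels, df.loc[without_keys.index] can be used to look the sampled rows up in the original frame.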
Edit:
This seems to be a pandas version mismatch: DataFrameGroupBy.sample was introduced in version 1.1.0, so older versions raise this AttributeError. I ran the OP's code on a newer version and it works as well.
You will need to upgrade pandas using pip3 install --upgrade pandas
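If upgrading is not possible right away, one option is a small wrapper that tries the groupby sample method first and falls back to apply when it is missing. This is only a sketch: the function name sample_one_per_group and its seed parameter are made up for illustration, and it assumes the older pandas raises AttributeError as shown in the question.

```python
import pandas as pd

def sample_one_per_group(frame, key, seed=None):
    """Hypothetical helper: return one randomly chosen row per group."""
    grouped = frame.groupby(key, group_keys=False)
    try:
        # Available on pandas 1.1.0 and later.
        return grouped.sample(n=1, random_state=seed)
    except AttributeError:
        # Older pandas: sample each group's sub-frame through apply instead.
        return grouped.apply(lambda g: g.sample(n=1, random_state=seed))

df = pd.DataFrame({
    "id": list(range(1, 11)),
    "country": ["USA"] * 5 + ["Canada"] * 3 + ["USA", "Canada"],
})

result = sample_one_per_group(df, "country", seed=0)
print(pd.__version__)
print(result)
```

Printing pd.__version__ alongside the result is a quick way to confirm which branch you are likely hitting.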