I have a dataframe loaded from a CSV of sales KPIs (quantity, article number and the corresponding date). I need to split it into multiple dataframes, each containing the data for one article number (e.g. frame1 = 123, frame2 = 345 and so on).
How can I split it dynamically like this for further use in sklearn's KMeans (to match different article numbers and their sales KPIs)? Thanks a lot.
You can group by the article number using groupby.
grouped = df.groupby('article_number')
You can then see which rows belong to each group using
grouped.groups
(a dict that maps each group key to its row labels), or directly apply aggregation functions like grouped['quantity'].sum()
to get the respective values for each group.
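To get one actual dataframe per article number, you can iterate over the groupby object. The following is only a minimal sketch: the column names (article_number, quantity, date) and the sample values are assumptions made up for illustration, so adjust them to match your CSV.

```python
import pandas as pd

# Hypothetical sample data; column names and values are assumptions.
df = pd.DataFrame({
    "article_number": [123, 345, 123, 345, 678],
    "quantity": [2, 5, 1, 3, 4],
    "date": pd.to_datetime([
        "2021-01-01", "2021-01-01", "2021-01-02",
        "2021-01-03", "2021-01-03",
    ]),
})

# Iterating over a groupby object yields (key, sub-frame) pairs,
# so a dict comprehension gives one dataframe per article number.
frames = {article: group for article, group in df.groupby("article_number")}

# Total quantity sold per article, indexed by article number.
totals = df.groupby("article_number")["quantity"].sum()

print(frames[123])
print(totals)
```

A dict keyed by article number is usually easier to work with than separate variables like frame1, frame2, since the number of articles is not known in advance.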
Also refer to the "Group by: split-apply-combine" section of the pandas User Guide.