python-3.x, pandas, scikit-learn, sklearn-pandas

Pandas dataframe to array for further use


I've got a dataframe loaded from a CSV of sales KPIs (quantity, article number and the corresponding date). I need to split it into multiple dataframes, each containing the data for one article number (e.g. frame1 = 123, frame2 = 345 and so on).

How can I split it dynamically like this for further use in scikit-learn's KMeans (to match different article numbers by their sales KPIs)? Thanks a lot.


Solution

  • You can group by the article number using groupby.

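    grouped = df.groupby('article_number')

    You can then retrieve the sub-frame for a single article with grouped.get_group(123), or inspect the mapping from each article number to its row labels using

    grouped.groups
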

    or directly apply an aggregation such as grouped['quantity'].sum() to get a result with one value per article number.

    Also refer to the "Group by: split-apply-combine" section of the pandas User Guide.
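
As a rough sketch of how this could feed into clustering: the snippet below splits a frame into one sub-frame per article number and also builds a NumPy array with one row per article. The column names (article_number, date, quantity), the sample values and the chosen aggregates (sum, mean, count) are assumptions for illustration, not taken from the question's actual CSV.

```python
import pandas as pd

# Hypothetical sample data; the real column names in the CSV may differ.
df = pd.DataFrame({
    "article_number": [123, 123, 345, 345, 345],
    "date": pd.to_datetime([
        "2021-01-01", "2021-01-02",
        "2021-01-01", "2021-01-02", "2021-01-03",
    ]),
    "quantity": [5, 7, 2, 4, 6],
})

# One sub-frame per article number, keyed by the article number.
frames = {article: group for article, group in df.groupby("article_number")}

# One row of summary features per article (assumed features: sum, mean, count).
features = df.groupby("article_number")["quantity"].agg(["sum", "mean", "count"])

# Plain 2D array with shape (n_articles, n_features).
X = features.to_numpy()
```

Here frames[123] holds the rows for article 123, and X could then be passed to something like sklearn.cluster.KMeans(n_clusters=k).fit_predict(X), with k chosen by you. The returned labels line up with features.index, so each cluster label can be mapped back to its article number.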