How do I collapse categorical data into a single record in R or Python?


I have a data set structured in this fashion:

ID   Code
1     A
1     B   
1     C
2     A
2     C
3     B
3     C

However, I would like it to look like:

ID  Codes
1   A B C
2   A C
3   B C

Is there an easy way to do this in R or Python? Thanks!


Solution

  • In Python, you can do this with pandas:

    import pandas as pd
    
    df = pd.read_clipboard()  # copy the sample table from the question to the clipboard first
    
    df
       ID Code
    0   1    A
    1   1    B
    2   1    C
    3   2    A
    4   2    C
    5   3    B
    6   3    C
    

    # Join each ID's codes into one space-separated string
    df.groupby('ID').agg({'Code': ' '.join})
    
         Code
    ID       
    1   A B C
    2     A C
    3     B C
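
If you want the exact layout from the question (ID as a regular column and the joined values under a Codes header), something like the following may work. It is only a sketch: the DataFrame is built inline so it runs without the clipboard, and the variable name collapsed is just an illustrative choice.

```python
import pandas as pd

# Sample data from the question, built inline instead of read from the clipboard.
df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 3, 3],
    "Code": ["A", "B", "C", "A", "C", "B", "C"],
})

# Join the codes of each ID with a space, then turn ID back into a
# regular column and name the joined column "Codes".
collapsed = (
    df.groupby("ID")["Code"]
      .agg(" ".join)
      .reset_index(name="Codes")
)

print(collapsed.to_string(index=False))
```

Selecting the Code column before aggregating keeps the join from being applied to any other columns the real data might have.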