Search code examples
pythonpandascrosstab

crosstab to fill with data of another column


I dont arrive to populate a crosstab with data from another colum: maybe its not the solution...

  initial dataframe              final waited

   id  id_m     X
0  10    10     a
1  10    11     b               id_m  10  11  12
2  10    12     c               id              
3  11    10     d        ->     10    a    b   c
4  11    11     e               11    d    e   f
5  11    12     f               12    g    h   i
6  12    10     g
7  12    11     h
8  12    12     i

my code to help you:

import pandas as pd

df= pd.DataFrame({'id': [10, 11,12]})
df_m = pd.merge(df.assign(key=0), df.assign(key=0), suffixes=('', '_m'), on='key').drop('key', axis=1)

# just a sample to populate the column
df_m['X'] =['a','b' ,'c','d', 'e','f','g' ,'h', 'i']

Solution

  • If your original df is this

       id  id_m  X
    0  10    10  a
    1  10    11  b
    2  10    12  c
    3  11    10  d
    4  11    11  e
    5  11    12  f
    6  12    10  g
    7  12    11  h
    8  12    12  i
    

    And all you want is this

    id_m 10 11 12
    id           
    10    a  b  c
    11    d  e  f
    12    g  h  i
    

    You can groupby the id and id_m columns, take the max of the X column, then unstack the id_m column like this.

    df.groupby([
        'id',
        'id_m'
    ]).X.max().unstack()
    

    If you really want to use pivot_table you can do this too

    df.pivot_table(index='id', columns='id_m', values='X', aggfunc='max')
    

    Same results.

    Lastly, you can use just pivot since your rows are unique with respect to the indices and columns.

    df.pivot(index='id', columns='id_m')
    

    References