Search code examples
pythonpandasdata-processing

Pandas dateframe reshape


I am playing with some date looks like below .

    song_id                        action_type ds
0   effb071415be51f11e845884e67c0f8c    1   14 days
1   f87ff481d85d2f95335ab602f38a7655    1   13 days
2   8a27d9a6c59628c991c154e8d93f412e    2   12 days
3   ecea5fe33e6817d09c395f2910479728    1   11 days
4   31a3d0420d89c9b121bb55dbdbbeda6b    1   13 days
5   096e604f7b152fad0246ae731ed8ca73    1   15 days
6   5a7d9d75b898cd1b19ef6941cc1ddccf    1   12 days
7   8a103bd3a3295fbf9b3c3bf7972db299    2   12 days
8   17e90c7b3b7ebbe4b47344fcfab2fa7a    1   11 days
9   8a27d9a6c59628c991c154e8d93f412e    2   13 days

I wonder how can i reshape it and get something like that :

song_id day1_type1 day1_type2 day1_type3 day2_type1 ......... dayn_typen
(songid) (count of type1 on day1) (Nan if no count)...... (count of typen on dayn)

Now I use

action.groupby(['song_id','ds','action_type']).action_type.sum()

and get something similar :

song_id                           ds        action_type
00088cb1e6d740fcd42bc8ff2673c805  3 days    1               1
                                  4 days    1               1
                                  13 days   2               2
                                  27 days   1               1
                                  41 days   1               1
                                            2               2
                                  42 days   1               1
                                  67 days   2               2
                                  68 days   1               1
                                  75 days   2               2
0008de587f84d8c9491502c5a5c8b466  0 days    1               4
                                            2               4
                                  1 days    1              17
                                  4 days    1               6
                                  7 days    1              10
                                  8 days    1               5

How can i rebuild or reshape what i get to get what i want ?

Thanks in advance .


Solution

  • >>> (df.groupby(['song_id', 'ds', 'action_type'])
           .action_type
           .sum()
           .unstack(['action_type', 'ds'])
           .fillna(0)
           .sortlevel(level=[0, 1], axis=1))
    
    action_type                            1                                       2        
    ds                               11 days 12 days 13 days 14 days 15 days 12 days 13 days
    song_id                                                                                 
    096e604f7b152fad0246ae731ed8ca73       0       0       0       0       1       0       0
    17e90c7b3b7ebbe4b47344fcfab2fa7a       1       0       0       0       0       0       0
    31a3d0420d89c9b121bb55dbdbbeda6b       0       0       1       0       0       0       0
    5a7d9d75b898cd1b19ef6941cc1ddccf       0       1       0       0       0       0       0
    8a103bd3a3295fbf9b3c3bf7972db299       0       0       0       0       0       2       0
    8a27d9a6c59628c991c154e8d93f412e       0       0       0       0       0       2       2
    ecea5fe33e6817d09c395f2910479728       1       0       0       0       0       0       0
    effb071415be51f11e845884e67c0f8c       0       0       0       1       0       0       0
    f87ff481d85d2f95335ab602f38a7655       0       0       1       0       0       0       0