Search code examples
pythonpython-3.xpandasdataframetranspose

Transpose a dataframe and melt


I have a dataframe. There is always data available for each date and firm. But a given row isn't guaranteed to have the data; the row only has data if that firm is True.

    date        IBM       AAPL_total_amount    IBM_total_amount   AAPL_count_avg  IBM_count_avg 
    2013-01-31  True    False    29                9
    2013-01-31  True    True     29                9                 27               5
    2013-02-31  False   True                                         27               5
    2013-02-08  True    True     2                 3                  5                6
      ...

How could I transpose the above dataframe to long format? Expected output:

     date        Firm     total_amount  count_avg
    2013-01-31   IBM         9              5   
    2013-01-31   AAPL        29             27
      ...

Solution

  • Might have to add some logic to drop all the boolean masks, but once you have that it's just a stack.


    u = df.set_index('date').drop(['IBM', 'AAPL'], 1)
    u.columns = u.columns.str.split('_', expand=True)
    u.stack(0)
    

                     count  total
    date
    2013-01-31 IBM     9.0   29.0
               AAPL    5.0   27.0
               IBM     9.0   29.0
    2013-02-31 AAPL    5.0   27.0
    2013-02-08 AAPL    6.0    5.0
               IBM     3.0    2.0
    

    To drop all the masks if you don't have a list of keys, possibly use select_dtypes

    df.select_dtypes(exclude=[bool])