I have a dataframe. There is always data available for each date and firm. But a given row isn't guaranteed to have the data; the row only has data if that firm is True.
date IBM AAPL_total_amount IBM_total_amount AAPL_count_avg IBM_count_avg
2013-01-31 True False 29 9
2013-01-31 True True 29 9 27 5
2013-02-31 False True 27 5
2013-02-08 True True 2 3 5 6
...
How could I transpose the above dataframe to long format? Expected output:
date Firm total_amount count_avg
2013-01-31 IBM 9 5
2013-01-31 AAPL 29 27
...
Might have to add some logic to drop all the boolean masks, but once you have that it's just a stack
.
u = df.set_index('date').drop(['IBM', 'AAPL'], 1)
u.columns = u.columns.str.split('_', expand=True)
u.stack(0)
count total
date
2013-01-31 IBM 9.0 29.0
AAPL 5.0 27.0
IBM 9.0 29.0
2013-02-31 AAPL 5.0 27.0
2013-02-08 AAPL 6.0 5.0
IBM 3.0 2.0
To drop all the masks if you don't have a list of keys, possibly use select_dtypes
df.select_dtypes(exclude=[bool])