I have a Pandas DataFrame that looks like the following:
ID Date Element Temperature (C)
0 USW00094889 2014-11-12 TMAX 2.2
1 USC00208972 2009-04-29 TMIN 5.6
2 USC00200032 2008-05-26 TMAX 27.8
3 USC00205563 2005-11-11 TMAX 13.9
4 USC00200230 2014-02-27 TMAX -10.6
I've been staring at the documentation for the stack and unstack functions and related functions like melt and pivot_table but I can't quite seem to figure out the combination of these that I need to do what I want. I'm trying to get rid of the "Element" column and instead create two new columns, "TMAX" and "TMIN". This would make each row a unique observation. Each Date/ID combo has a TMAX and TMIN. So theoretically this new dataframe would be half the height of the current one.
Docs: https://pandas.pydata.org/pandas-docs/stable/reshaping.html
Create the DF and create the desired multi-index.
"""
ID,Date,Element,Temperature (C)
USW00094889,2014-11-12,TMAX,2.2
USC00208972,2009-04-29,TMIN,5.6
USC00200032,2008-05-26,TMAX,27.8
USC00205563,2005-11-11,TMAX,13.9
USC00200230,2014-02-27,TMAX,-10.6
"""
import pandas as pd
df = pd.read_clipboard(sep=",").set_index(["ID", "Date"])
print(df)
Output:
Element Temperature (C)
ID Date
USW00094889 2014-11-12 TMAX 2.2
USC00208972 2009-04-29 TMIN 5.6
USC00200032 2008-05-26 TMAX 27.8
USC00205563 2005-11-11 TMAX 13.9
USC00200230 2014-02-27 TMAX -10.6
df.pivot(index=df.index, columns="Element")
Output:
Temperature (C)
Element TMAX TMIN
ID Date
USC00200032 2008-05-26 27.8 NaN
USC00200230 2014-02-27 -10.6 NaN
USC00205563 2005-11-11 13.9 NaN
USC00208972 2009-04-29 NaN 5.6
USW00094889 2014-11-12 2.2 NaN
Note that I dropped the numerical ID from your sample df.