I have some input dataframe as the following example:
import pandas as pd
df = pd.DataFrame(
{
"Name":["Ale", "Dan", "Hel"],
"Project": ["A", "A", "B"],
"10/04/2023 Formation": [24, 40, 40],
"17/04/2023 Formation": [12, 24, 24],
"10/04/2023 Holidays": [40, 40, 40],
"17/04/2023 Holidays": [12, 40, 24],
}
)
As can be seen, some of the columns are dates belonging to a specific concept (Formation, Holidays, in this case).
What I want is to normalize these columns to convert them into rows (unpivot), but preserving (grouping) each project in its own column; that is, I would like to perform a dataframe transformation whose output would be something like:
Date Name Project Formation Holidays
0 10/04/2023 Ale A 24 40
1 17/04/2023 Ale A 12 12
2 10/04/2023 Dan A 40 40
3 17/04/2023 Dan A 24 40
4 10/04/2023 Hel B 40 40
5 17/04/2023 Hel B 24 40
import pandas as pd
df = pd.DataFrame(
{
"Name":["Ale", "Dan", "Hel"],
"Project": ["A", "A", "B"],
"10/04/2023 Formation": [24, 40, 40],
"17/04/2023 Formation": [12, 24, 24],
"10/04/2023 Holidays": [40, 40, 40],
"17/04/2023 Holidays": [12, 40, 24],
}
)
# Unpivot the data using pd.melt()
df_melted = pd.melt(df, id_vars=["Name", "Project"], var_name="Date_Concept", value_name="Value")
# Separate the Date and Concept columns
df_melted[["Date", "Concept"]] = df_melted["Date_Concept"].str.split(" ", expand=True)
df_melted = df_melted.drop("Date_Concept", axis=1)
# Use pd.pivot_table() to reshape the dataframe
df_final = pd.pivot_table(df_melted, index=["Date", "Name", "Project"], columns="Concept", values="Value").reset_index()
# Reorder the columns
df_final = df_final[["Date", "Name", "Project", "Formation", "Holidays"]].sort_values(["Name", "Date", "Project"])
print(df_final)