I am a complete newbie with Python and I am trying to solve the following problem...
I have read in an excel file with pandas that looks something like this:
Country | 2000 | 2001 | 2002 | 2003 | 2004 | ... |
---|---|---|---|---|---|---|
Argentina | 1.23 | 4.56 | 7.89 | 10.11 | 12.13 | ... |
Brazil | 14.15 | 16.17 | 18.19 | 20.21 | 22.23 | ... |
Chile | ... | ... | ... | ... | ... | ... |
Colombia | ... | ... | ... | ... | ... | ... |
And what I try to get is something like this:
Country | Year | Value |
---|---|---|
Argentina | 2000 | 1.23 |
Argentina | 2001 | 4.56 |
Argentina | 2002 | 7.89 |
Argentina | 2003 | 10.11 |
Argentina | 2004 | 12.13 |
... | ... | ... |
Brazil | 2000 | 14.15 |
Brazil | 2001 | 16.17 |
Brazil | 2002 | 18.19 |
Brazil | 2003 | 20.21 |
Brazil | 2004 | 22.23 |
... | ... | ... |
Chile | 2000 | ... |
... | ... | ... |
I have already read a few posts about transposing, but have not found anything that addresses my problem.
Can someone help here?
pandas.melt is useful for you to unpivot a dataframe.
https://pandas.pydata.org/docs/reference/api/pandas.melt.html
So, you can transform the dataframe to what you want as follows:
>>> import pandas as pd
>>> pd.__version__
'1.2.4'
>>> df = pd.DataFrame({
... "Country": ["Argentina", "Brazil",],
... "2000": [1.23, 14.15,],
... "2001": [4.56, 16.17,],
... "2002": [7.89, 18.19,],
... "2003": [10.11, 20.21,],
... "2004": [12.13, 22.23,],
... }) # Create sample data
>>> df
Country 2000 2001 2002 2003 2004
0 Argentina 1.23 4.56 7.89 10.11 12.13
1 Brazil 14.15 16.17 18.19 20.21 22.23
>>> df.melt(id_vars="Country", var_name="Year", value_name="Value")
Country Year Value
0 Argentina 2000 1.23
1 Brazil 2000 14.15
2 Argentina 2001 4.56
3 Brazil 2001 16.17
4 Argentina 2002 7.89
5 Brazil 2002 18.19
6 Argentina 2003 10.11
7 Brazil 2003 20.21
8 Argentina 2004 12.13
9 Brazil 2004 22.23
>>> # If you want to sort the records by "Country":
>>> df.melt(id_vars="Country", var_name="Year", value_name="Value").sort_values("Country").reset_index(drop=True)
Country Year Value
0 Argentina 2000 1.23
1 Argentina 2001 4.56
2 Argentina 2002 7.89
3 Argentina 2003 10.11
4 Argentina 2004 12.13
5 Brazil 2000 14.15
6 Brazil 2001 16.17
7 Brazil 2002 18.19
8 Brazil 2003 20.21
9 Brazil 2004 22.23
Thanks :)