I'm having a dataframe as shown below, and would like to move the values from 'phone', 'spotify' and 'rent' from bottom half and override the top half (essentially splitting the dataframe in two and placing the 'expense' values to the 'income'-half.
Currently, there are january thorugh december twice. I want it to just be 12 rows with values in each cell (ie no cell having 0.0 as value).
loan csn salary phone spotify rent
january income 1200.0 13000.0 2000.0 0.0 0.0 0.0
february income 1200.0 13000.0 2000.0 0.0 0.0 0.0
march income 1200.0 13000.0 2000.0 0.0 0.0 0.0
april income 1200.0 13000.0 2000.0 0.0 0.0 0.0
may income 1200.0 13000.0 2000.0 0.0 0.0 0.0
june income 1200.0 13000.0 2000.0 0.0 0.0 0.0
july income 1200.0 13000.0 2000.0 0.0 0.0 0.0
august income 1200.0 13000.0 2000.0 0.0 0.0 0.0
september income 1200.0 13000.0 2000.0 0.0 0.0 0.0
october income 1200.0 13000.0 2000.0 0.0 0.0 0.0
november income 1200.0 13000.0 2000.0 0.0 0.0 0.0
december income 1200.0 13000.0 2000.0 0.0 0.0 0.0
january expense 0.0 0.0 0.0 300.0 49.0 3500.0
february expense 0.0 0.0 0.0 300.0 149.0 3500.0
march expense 0.0 0.0 0.0 300.0 49.0 3500.0
april expense 0.0 0.0 0.0 300.0 49.0 3500.0
may expense 0.0 0.0 0.0 300.0 49.0 3500.0
june expense 0.0 0.0 0.0 300.0 49.0 3500.0
july expense 0.0 0.0 0.0 300.0 49.0 3500.0
august expense 0.0 0.0 0.0 300.0 49.0 3500.0
september expense 0.0 0.0 0.0 300.0 49.0 3500.0
october expense 0.0 0.0 0.0 300.0 49.0 3500.0
november expense 0.0 0.0 0.0 300.0 49.0 3500.0
december expense 0.0 0.0 0.0 300.0 49.0 3500.0
Getting data from .JSON:
df_all = pd.DataFrame.from_dict({(i,j): data[i][j]
for i in data.keys()
for j in data[i].keys()},
orient='index')
.JSON file structure:
{
"january": {
"income": {
"loan": 1200,
"csn": 13000,
"salary": 2000
},
"expense": {
"phone": 300,
"spotify": 49,
"rent": 3500
}
...
Desired output:
loan csn salary phone spotify rent
january income 1200.0 13000.0 2000.0 300.0 49.0 3500.0
february income 1200.0 13000.0 2000.0 300.0 49.0 3500.0
march income 1200.0 13000.0 2000.0 300.0 49.0 3500.0
april income 1200.0 13000.0 2000.0 300.0 49.0 3500.0
may income 1200.0 13000.0 2000.0 300.0 49.0 3500.0
june income 1200.0 13000.0 2000.0 300.0 49.0 3500.0
july income 1200.0 13000.0 2000.0 300.0 49.0 3500.0
august income 1200.0 13000.0 2000.0 300.0 49.0 3500.0
september income 1200.0 13000.0 2000.0 300.0 49.0 3500.0
october income 1200.0 13000.0 2000.0 300.0 49.0 3500.0
november income 1200.0 13000.0 2000.0 300.0 49.0 3500.0
december income 1200.0 13000.0 2000.0 300.0 49.0 3500.0
Here is one way:
df = df.rename(index={'expense':'income'}, level=1).fillna(0).groupby(level=[0,1]).sum()
df
Output:
loan csn Salary phone spotify rent
Apr income 1200 13000 2000.0 300 49 3500
Aug income 1200 13000 2000.0 300 49 3500
Dec income 1200 13000 2000.0 300 49 3500
Feb income 1200 13000 2000.0 300 49 3500
Jan income 1200 13000 2000.0 300 49 3500
Jul income 1200 13000 2000.0 300 49 3500
Jun income 1200 13000 2000.0 300 49 3500
Mar income 1200 13000 2000.0 300 49 3500
May income 1200 13000 2000.0 300 49 3500
Nov income 1200 13000 2000.0 300 49 3500
Oct income 1200 13000 2000.0 300 49 3500
Sep income 1200 13000 2000.0 300 49 3500
Details:
Rename index level 1 such that 'expense' becomes 'income', then use groupby
both levels of the index. We could use first
but I don't think that future proof and safe, therefore, I choose to fillna
with zero and sum
.