Search code examples
pythonpandasoverridingmove

Override/move values from bottom rows to upper rows in specific columns (pandas)


I'm having a dataframe as shown below, and would like to move the values from 'phone', 'spotify' and 'rent' from bottom half and override the top half (essentially splitting the dataframe in two and placing the 'expense' values to the 'income'-half.

Currently, there are january thorugh december twice. I want it to just be 12 rows with values in each cell (ie no cell having 0.0 as value).

                    loan      csn  salary  phone  spotify    rent
january   income   1200.0  13000.0  2000.0    0.0      0.0     0.0
february  income   1200.0  13000.0  2000.0    0.0      0.0     0.0
march     income   1200.0  13000.0  2000.0    0.0      0.0     0.0
april     income   1200.0  13000.0  2000.0    0.0      0.0     0.0
may       income   1200.0  13000.0  2000.0    0.0      0.0     0.0
june      income   1200.0  13000.0  2000.0    0.0      0.0     0.0
july      income   1200.0  13000.0  2000.0    0.0      0.0     0.0
august    income   1200.0  13000.0  2000.0    0.0      0.0     0.0
september income   1200.0  13000.0  2000.0    0.0      0.0     0.0
october   income   1200.0  13000.0  2000.0    0.0      0.0     0.0
november  income   1200.0  13000.0  2000.0    0.0      0.0     0.0
december  income   1200.0  13000.0  2000.0    0.0      0.0     0.0
january   expense     0.0      0.0     0.0  300.0     49.0  3500.0
february  expense     0.0      0.0     0.0  300.0    149.0  3500.0
march     expense     0.0      0.0     0.0  300.0     49.0  3500.0
april     expense     0.0      0.0     0.0  300.0     49.0  3500.0
may       expense     0.0      0.0     0.0  300.0     49.0  3500.0
june      expense     0.0      0.0     0.0  300.0     49.0  3500.0
july      expense     0.0      0.0     0.0  300.0     49.0  3500.0
august    expense     0.0      0.0     0.0  300.0     49.0  3500.0
september expense     0.0      0.0     0.0  300.0     49.0  3500.0
october   expense     0.0      0.0     0.0  300.0     49.0  3500.0
november  expense     0.0      0.0     0.0  300.0     49.0  3500.0
december  expense     0.0      0.0     0.0  300.0     49.0  3500.0

Getting data from .JSON:

        df_all = pd.DataFrame.from_dict({(i,j): data[i][j] 
                           for i in data.keys() 
                           for j in data[i].keys()},
                       orient='index')

.JSON file structure:

{
    "january": {
        "income": {
            "loan": 1200,
            "csn": 13000,
            "salary": 2000
        },
        "expense": {
            "phone": 300,
            "spotify": 49,
            "rent": 3500
        }
        ...

Desired output:

                    loan      csn  salary  phone  spotify    rent
january   income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
february  income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
march     income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
april     income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
may       income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
june      income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
july      income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
august    income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
september income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
october   income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
november  income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
december  income   1200.0  13000.0  2000.0  300.0     49.0  3500.0

Solution

  • Here is one way:

    df = df.rename(index={'expense':'income'}, level=1).fillna(0).groupby(level=[0,1]).sum()
    df
    

    Output:

                loan    csn  Salary  phone  spotify  rent
    Apr income  1200  13000  2000.0    300       49  3500
    Aug income  1200  13000  2000.0    300       49  3500
    Dec income  1200  13000  2000.0    300       49  3500
    Feb income  1200  13000  2000.0    300       49  3500
    Jan income  1200  13000  2000.0    300       49  3500
    Jul income  1200  13000  2000.0    300       49  3500
    Jun income  1200  13000  2000.0    300       49  3500
    Mar income  1200  13000  2000.0    300       49  3500
    May income  1200  13000  2000.0    300       49  3500
    Nov income  1200  13000  2000.0    300       49  3500
    Oct income  1200  13000  2000.0    300       49  3500
    Sep income  1200  13000  2000.0    300       49  3500
    

    Details:

    Rename index level 1 such that 'expense' becomes 'income', then use groupby both levels of the index. We could use first but I don't think that future proof and safe, therefore, I choose to fillna with zero and sum.