python pandas dataframe multi-index

Python: get access to a column of a dataframe with a MultiIndex


Let's say that I have this dataframe with a MultiIndex:

                                           Montant
IBAN                  Date       Balance
FR3724687328623865    2020-09-16 654.75      -2.00
                      2020-09-17 23.65      -88.00
                      2020-09-21 1537.00   2700.20
                      2020-09-25 8346.20   -163.21
                      2020-09-28 6247.60   -468.90
...                                            ...
FR8723498262347632    2020-10-06 13684.11  2708.00
FR9687234782365235    2020-10-16 4353.42   6311.00
                      2020-10-28 9641.23    562.78
                      2020-11-30 5436.95    -45.12
                      2020-09-30 4535.34    -43.56

How do we access the data in the "Balance" or "Date" columns? I do not understand why neither of these works:

bal = df["Montant"]["Balance"]

or

bal = df.loc[("Montant", "Balance")]

Solution

  • "Date" and "Balance" are levels of the row index, not columns ("Montant" is the only column), so you should use Index.get_level_values:

    In [505]: df
    Out[505]: 
                                           Montant
    IBAN               Date       Balance         
    FR3724687328623865 2020-09-16 654.75     -2.00
                       2020-09-17 23.65     -88.00
                       2020-09-21 1537.00  2700.20
                       2020-09-25 8346.20  -163.21
                       2020-09-28 6247.60  -468.90
    

    You can pass the level names:

    In [509]: df.index.get_level_values('Date')
    Out[509]: Index(['2020-09-16', '2020-09-17', '2020-09-21', '2020-09-25', '2020-09-28'], dtype='object', name='Date')
    
    In [510]: df.index.get_level_values('Balance')
    Out[510]: Float64Index([654.75, 23.65, 1537.0, 8346.2, 6247.6], dtype='float64', name='Balance')
    

    Or pass the level positions:

    In [512]: df.index.get_level_values(1)
    Out[512]: Index(['2020-09-16', '2020-09-17', '2020-09-21', '2020-09-25', '2020-09-28'], dtype='object', name='Date')
    
    In [513]: df.index.get_level_values(2)
    Out[513]: Float64Index([654.75, 23.65, 1537.0, 8346.2, 6247.6], dtype='float64', name='Balance')
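
  • A self-contained sketch of the same idea, in case it helps to reproduce it. It assumes the frame was built with set_index on "IBAN", "Date" and "Balance" (the question does not show how it was created) and uses only a few rows copied from the post. It also shows why the first attempt fails, and reset_index as an alternative if you would rather have ordinary columns:

```python
import pandas as pd

# A few rows copied from the question. How the real frame was built is an
# assumption: here the three levels are created with set_index.
rows = [
    ("FR3724687328623865", "2020-09-16", 654.75, -2.00),
    ("FR3724687328623865", "2020-09-17", 23.65, -88.00),
    ("FR3724687328623865", "2020-09-21", 1537.00, 2700.20),
    ("FR8723498262347632", "2020-10-06", 13684.11, 2708.00),
]
df = pd.DataFrame(rows, columns=["IBAN", "Date", "Balance", "Montant"])
df = df.set_index(["IBAN", "Date", "Balance"])

# "Montant" is the only real column; the other names are index levels.
columns = list(df.columns)
level_names = list(df.index.names)

# df["Montant"] is a Series, so ["Balance"] is looked up as a row label
# (in the IBAN level) and raises KeyError.
try:
    df["Montant"]["Balance"]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Option 1: read a level straight from the index.
balance = df.index.get_level_values("Balance")

# Option 2: move the levels back into ordinary columns.
flat = df.reset_index()
dates = flat["Date"].tolist()
```

    After reset_index, flat["Balance"] and flat["Date"] behave like any other column, which may be more convenient if you need them in calculations alongside "Montant".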