Tags: python, pandas, dataframe, multi-index

How to access columns after creating a MultiIndex


I am making my DataFrame like this:

influenza_data = pd.DataFrame(data, columns = ['year', 'week', 'weekly_infections'])

and then I create a MultiIndex from the year and week columns:

influenza_data = influenza_data.set_index(['year', 'week'])

With the MultiIndex, my DataFrame looks like this:

          weekly_infections
year week                  
2009 40                6600
     41                7100
     42                7700
     43                8300
     44                8600
...                     ...
2019 10                8900
     11                6200
     12                5500
     13                3900
     14                3300

and influenza_data.columns gives:

Index(['weekly_infections'], dtype='object')

The problem is that I can no longer access the year and week columns.

If I try influenza_data['week'] (or 'year'), I get KeyError: 'week'. I can only do influenza_data.weekly_infections, and that returns the whole column.

I know that if I remove the MultiIndex I can access them easily, but why can't I use influenza_data.year or influenza_data.week with the MultiIndex in place? I specified those columns when I created the DataFrame.


Solution

  • As the pandas documentation says, you can read the levels of a MultiIndex with the get_level_values(level) method:

    influenza_data.index.get_level_values(0)    # year
    influenza_data.index.get_level_values(1)    # week
    

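    The argument is the position of the level in the MultiIndex: 0 is year and 1 is week. Level names work too, e.g. get_level_values('year'). The KeyError occurs because set_index moves year and week out of the columns and into the index by default, so they are no longer columns; reset_index() turns them back into ordinary columns.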
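
A minimal sketch of the options, assuming a few made-up rows in place of the asker's data (the sample values and the names years, weeks, rows_2019, flat and week_column are only for illustration):

```python
import pandas as pd

# Hypothetical sample rows standing in for the asker's `data`
data = [
    (2009, 40, 6600),
    (2009, 41, 7100),
    (2019, 13, 3900),
    (2019, 14, 3300),
]
influenza_data = pd.DataFrame(data, columns=['year', 'week', 'weekly_infections'])
influenza_data = influenza_data.set_index(['year', 'week'])

# Index levels can be read by name or by position
years = influenza_data.index.get_level_values('year')
weeks = influenza_data.index.get_level_values(1)

# Select every row for one year directly through the index level
rows_2019 = influenza_data.xs(2019, level='year')

# reset_index() turns the levels back into ordinary columns
flat = influenza_data.reset_index()
week_column = flat['week']
```

If year and week are needed as regular columns most of the time, keeping the reset_index() result (or skipping set_index altogether) may be simpler than pulling values out of the index each time.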