pandas ipython multi-index

Mixed datetime and categorical hierarchical index (multiindex) in pandas


A dataframe df includes the columns df['country'], df['sector'] and df['year'], plus other numerical data of mixed int and float type. Country and sector are categorical variables; year is datetime64[ns].

I have created a 3-level hierarchical index as follows:

arrays1 = [np.array(df['country']), np.array(df['sector']), np.array(df['year'])] 
df1 = df.set_index(arrays1)
df1.index.names = ['country','sector', 'year']
df1 = df1.sort_index()

How should I create this MultiIndex so that the third level, year, is recognized as a DatetimeIndex with annual frequency?


Solution

  • It seems you need to pass the column names to set_index directly:

    df1 = df.set_index(['country','sector','year']).sort_index()
    

    Then you can check the year level of the MultiIndex with get_level_values:

    print(df1.index.get_level_values('year'))
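
A minimal, self-contained sketch of the approach above. The sample values are made up for illustration; only the column names come from the question. It checks that the year level comes back as a DatetimeIndex. As far as I can tell, a level whose dates repeat across the outer levels does not carry a freq, so the second half shows one possible workaround: converting year to annual periods before building the index. That conversion is an assumption about what "annual frequency" should mean here, not something stated in the question.

```python
import pandas as pd

# Hypothetical sample data: column names follow the question,
# the values are invented for illustration only.
df = pd.DataFrame({
    "country": pd.Categorical(["US", "US", "DE", "DE"]),
    "sector": pd.Categorical(["energy", "tech", "energy", "tech"]),
    "year": pd.to_datetime(
        ["2015-01-01", "2016-01-01", "2015-01-01", "2016-01-01"]
    ),
    "value": [1.5, 2, 3.25, 4],
})

# Passing column names to set_index keeps each column's dtype in its level.
df1 = df.set_index(["country", "sector", "year"]).sort_index()

year_level = df1.index.get_level_values("year")
print(type(year_level).__name__)  # DatetimeIndex
print(year_level.freq)            # inspect whether a freq is attached

# Assumption: annual periods are an acceptable stand-in for
# "a datetime level with annual frequency".
df2 = (
    df.assign(year=df["year"].dt.to_period("Y"))
      .set_index(["country", "sector", "year"])
      .sort_index()
)

period_level = df2.index.get_level_values("year")
print(type(period_level).__name__)  # PeriodIndex
print(period_level.dtype)           # annual period dtype
```

With the period-based index, each year value is a Period covering the whole calendar year rather than a single timestamp, which may or may not suit later operations such as resampling.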