I've created this df:
## creating multi index multi-level column dataframe
multi_ind = pd.MultiIndex.from_product([['Anna', 'Bob', 'Cara', 'Dan'],['2021', '2022']])
multi_cols = pd.MultiIndex.from_product([['Class XI', 'Class XII'],['Math' , 'Physics', 'Literature']])
data = np.random.randint(20,100, size = (8, 6))
df = pd.DataFrame(data, index =multi_ind, columns = multi_cols)
df.index.names = ['Student','Year']
df.columns.names = ['Class' , 'Subject']
I've sorted this df using sort_index()
df.sort_index(level=[0, 1], ascending=[1, 0], inplace=True)
Now i'm trying to slice into both index and columns to create a subset. I want to slice the outer index and the inner level column
idx = pd.IndexSlice
df.loc[idx['Bob':'Cara', '2021'], idx['Class XII' , 'Math' : 'Physics'] ]
I keep getting this error!
UnsortedIndexError: 'MultiIndex slicing requires the index to be lexsorted: slicing on levels [1], lexsort depth 1'
I am trying to slice both the outer index and the inner column level while choosing a value/list of values for the inner index ad outer column level but it seems I haven't sorted the df correctly.
What am I doing wrong with the lex sort? How do I fix the UnsortedIndexError? I've already tried sort_index()
Also, what is lexsort? Any simple reading material on lexsort?
Thank you so much!!
If sorting Mulitindex in columns it working well, because your code sorting only MulitIndex in index:
df.sort_index(axis=1, inplace=True)
print (df)
idx = pd.IndexSlice
df = df.loc[idx['Bob':'Cara', '2021'], idx['Class XII' , 'Math' : 'Physics'] ]
print (df)
Class Class XII
Subject Math Physics
Student Year
Bob 2021 59 86
Cara 2021 89 84