The following code generates a sample DataFrame:
import pandas as pd

fruits = pd.DataFrame()
fruits['month'] = ['jan', 'feb', 'feb', 'march', 'jan', 'april', 'april', 'june', 'march', 'march', 'june', 'april']
fruits['fruit'] = ['apple', 'orange', 'pear', 'orange', 'apple', 'pear', 'cherry', 'pear', 'orange', 'cherry', 'apple', 'cherry']
ind = fruits.index
ind_mnth = fruits['month'].values
fruits['price'] = [30, 20, 40, 25, 30, 45, 60, 45, 25, 55, 37, 60]
# Outer level: month label; inner level: original row number
fruits_grp = fruits.set_index([ind_mnth, ind], drop=False)
How can I shuffle the outer index level randomly, and the inner index level in a different random order, in this MultiIndex DataFrame?
Assuming this DataFrame with a MultiIndex as input:
          month   fruit  price
jan   0     jan   apple     30
feb   1     feb  orange     20
      2     feb    pear     40
march 3   march  orange     25
jan   4     jan   apple     30
april 5   april    pear     45
      6   april  cherry     60
june  7    june    pear     45
march 8   march  orange     25
      9   march  cherry     55
june  10   june   apple     37
april 11  april  cherry     60
First shuffle all rows of the DataFrame, which randomizes the inner order within each month, then regroup the months by selecting the outer level in a random order:
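import numpy as np

np.random.seed(0)  # only for a reproducible example
# Unique month labels (outer level), shuffled in place
idx0 = np.unique(fruits_grp.index.get_level_values(0))
np.random.shuffle(idx0)
# Shuffle every row, then pull the months out in the shuffled order
fruits_grp.sample(frac=1).loc[idx0]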
Output:
          month   fruit  price
jan   0     jan   apple     30
      4     jan   apple     30
april 6   april  cherry     60
      5   april    pear     45
      11  april  cherry     60
feb   1     feb  orange     20
      2     feb    pear     40
june  10   june   apple     37
      7    june    pear     45
march 8   march  orange     25
      9   march  cherry     55
      3   march  orange     25
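If you want this as a reusable function that does not touch NumPy's global random state, here is a minimal sketch of the same two-step idea. The name shuffle_levels and its seed parameter are my own, not part of pandas, and it swaps np.random.seed for a local np.random.default_rng generator, so the row order it produces will not match the output shown above.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question
fruits = pd.DataFrame({
    "month": ["jan", "feb", "feb", "march", "jan", "april",
              "april", "june", "march", "march", "june", "april"],
    "fruit": ["apple", "orange", "pear", "orange", "apple", "pear",
              "cherry", "pear", "orange", "cherry", "apple", "cherry"],
    "price": [30, 20, 40, 25, 30, 45, 60, 45, 25, 55, 37, 60],
})
fruits_grp = fruits.set_index([fruits["month"].values, fruits.index], drop=False)


def shuffle_levels(df, seed=None):
    """Hypothetical helper: random outer-level order, random row order inside each group."""
    rng = np.random.default_rng(seed)
    # Step 1: shuffle every row, which randomizes the inner order within each month
    shuffled = df.iloc[rng.permutation(len(df))]
    # Step 2: draw an independent random order for the unique outer labels
    outer = rng.permutation(df.index.get_level_values(0).unique().to_numpy())
    # Selecting by outer label regroups the rows month by month, in that order
    return shuffled.loc[outer]


result = shuffle_levels(fruits_grp, seed=0)
print(result)
```

Passing the same seed should give the same result on repeated calls; leaving seed=None gives a fresh shuffle each time.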