Suppose I have a MultiIndex DataFrame as follows:
import pandas as pd
import numpy as np
input_id = np.array(['input_id'])
docType = np.array(['pre','pub','app','dw'])
docId = np.array(['34455667'])
sec_type = np.array(['bib','abs','cl','de'])
sec_ids = np.array(['x-y','z-k'])
index = pd.MultiIndex.from_product([input_id,docType,docId,sec_type,sec_ids])
content= [str(np.random.randint(1,10))+ '##' + str(np.random.randint(1,10)) for i in range(len(index))]
df = pd.DataFrame(content, index=index, columns=['content'])
df.rename_axis(index=['input_id','docType','docId','secType','sec_ids'], inplace=True)
I would like to query the MultiIndex DataFrame like this:
# query a multiindex DF
idx = pd.IndexSlice
df.loc[idx[:,'pub',:,'de',:]]
Resulting in:
I would like to get the values of the MultiIndex level sec_ids directly as a list. How do I have to modify the query to get the following result:
['x-y','z-k']
Thanks
You can use the MultiIndex.get_level_values() method to get the values of a specific level of a MultiIndex. In this case, call it on the index of the sliced result:
df.loc[idx[:,'pub',:,'de',:]].index.get_level_values('sec_ids').tolist()
#['x-y', 'z-k']
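One thing to keep in mind: get_level_values() returns one entry per row, so a less selective slice will give you repeated labels. The sketch below rebuilds a frame shaped like the one in the question (the content values are placeholders I made up, and the names subset, all_ids and unique_ids are only for illustration) and shows how chaining .unique() removes the repeats.

```python
import pandas as pd

# Rebuild an index like the one in the question, naming the levels up front
index = pd.MultiIndex.from_product(
    [['input_id'], ['pre', 'pub', 'app', 'dw'], ['34455667'],
     ['bib', 'abs', 'cl', 'de'], ['x-y', 'z-k']],
    names=['input_id', 'docType', 'docId', 'secType', 'sec_ids'],
)
# Placeholder content, one string per row
df = pd.DataFrame({'content': [str(i) for i in range(len(index))]}, index=index)

idx = pd.IndexSlice
# Filter only on secType, keeping every docType
subset = df.loc[idx[:, :, :, 'de', :], :]

# One label per matching row, so each sec_id appears once per docType
all_ids = subset.index.get_level_values('sec_ids').tolist()

# unique() drops the repeats and keeps first-seen order
unique_ids = subset.index.get_level_values('sec_ids').unique().tolist()

print(len(all_ids))
print(unique_ids)
```

If your slice already pins every other level to a single value, as in the question, the plain .tolist() call is enough.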