I have a DataFrame that looks like this:
print df_raw
Name exp1
Name
UnweightedBase 1364
Base 1349
BFC_q5a1 34.18%
BFC_q5a2 2.93%
BFC_q5a3 1.86%
BFC_q5a4 1.93%
BFC_q5a5 0.84%
I want to build a subset of the DataFrame above using row labels. However, I would like to use re.IGNORECASE, and I'm not sure how.
Without re.IGNORECASE, the code looks like this:
subset_df = df_raw.loc[df_raw.index.isin(['BFC_q5a4', 'BFC_q5a5'])]
How can I change my code to use re.IGNORECASE so that a lookup with differently cased labels, like the one below, still matches:
subset_df = df_raw.loc[df_raw.index.isin(['bFc_q5A4', 'BfC_Q5a5'])]
Note: I don't want to use str.lower or str.upper to do this.
Thanks!
I don't know of any neat way to search index labels in a case-insensitive way (df.filter is useful but unfortunately doesn't appear to be able to ignore case).
To get around this, you could make use of the Series method pd.Series.str.contains, which can ignore case:
subset_df = df_raw[pd.Series(df_raw.index).str.contains(regex, case=False).values]
The index is turned into a Series and then regex matching is applied to it. In this case, regex could be something like 'bFc_q5A4|BfC_Q5a5'. Case is ignored by passing case=False.
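For reference, here is a minimal, self-contained sketch of the approach above. The frame is rebuilt by hand from the values shown in the question, and the names wanted and mask are only illustrative. Because str.contains matches substrings, the sketch escapes each label and anchors the pattern so that only whole labels match; that anchoring is an addition of mine, not something the question asked for.

```python
import re

import pandas as pd

# Hand-built stand-in for the frame in the question (values copied from it).
df_raw = pd.DataFrame(
    {"exp1": ["1364", "1349", "34.18%", "2.93%", "1.86%", "1.93%", "0.84%"]},
    index=["UnweightedBase", "Base", "BFC_q5a1", "BFC_q5a2",
           "BFC_q5a3", "BFC_q5a4", "BFC_q5a5"],
)

# Labels to look up, deliberately written with the "wrong" case.
wanted = ["bFc_q5A4", "BfC_Q5a5"]

# Escape each label and anchor the alternation so only whole labels match.
# A non-capturing group (?:...) is used to avoid pandas' warning about match groups.
regex = "^(?:" + "|".join(re.escape(label) for label in wanted) + ")$"

# Turn the index into a Series, match case-insensitively, and use the
# resulting boolean array to select rows.
mask = pd.Series(df_raw.index).str.contains(regex, case=False).values
subset_df = df_raw[mask]

# str.contains also accepts regex flags, so re.IGNORECASE can be passed directly.
mask_flags = pd.Series(df_raw.index).str.contains(regex, flags=re.IGNORECASE).values

print(subset_df)
```

Passing flags=re.IGNORECASE should select the same rows as case=False, so either spelling can be used depending on which reads better in your code.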