Suppose I have a DataFrame like:
import pandas as pd
df = pd.DataFrame({'foo': [1, 2, 3], 'bar': [4, 5, 6], 'ber': [7, 8, 9]})
Given a list of "filter" strings like mylist = ['oo', 'ba'], how can I select all columns in df whose names partially match any of the strings in mylist? For this example, the expected output is {'foo': [1, 2, 3], 'bar': [4, 5, 6]}.
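You can use df.filter with its regex parameter to do that. Joining the strings with '|' builds a pattern that matches any column name containing at least one of them.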
import pandas as pd
# sample dataframe
df = pd.DataFrame({'foo': [1, 2, 3], 'bar': [4, 5, 6], 'ber': [7, 8, 9]})
# sample list of strings
mylist = ['oo', 'ba']
# join the list into a single regex alternation: 'oo|ba'
matches = '|'.join(mylist)
# keep only the columns whose name contains a match for the pattern
df_out = df.filter(regex=matches)
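One caveat: the strings are interpreted as regular expressions, so a filter string containing characters such as '.' or '(' may match more than intended or raise an error. Below is a minimal sketch of a more defensive variant, assuming the filter strings are meant to be literal substrings and the column labels are strings. It escapes each entry with re.escape and also shows an equivalent boolean-mask approach using df.columns.str.contains; the names pattern, mask and df_alt are illustrative only.

```python
import re
import pandas as pd

df = pd.DataFrame({'foo': [1, 2, 3], 'bar': [4, 5, 6], 'ber': [7, 8, 9]})
mylist = ['oo', 'ba']

# Escape each string so regex metacharacters are matched literally
pattern = '|'.join(re.escape(s) for s in mylist)

# Option 1: df.filter with the escaped pattern
df_out = df.filter(regex=pattern)

# Option 2: boolean mask over the column names (assumes string labels)
mask = df.columns.str.contains(pattern, regex=True)
df_alt = df.loc[:, mask]

print(df_out.to_dict('list'))
```

Both options keep the columns in their original order. Note that if mylist is empty, the joined pattern is an empty string, which matches every column name, so you may want to handle that case separately.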