How to check an entire pd.DataFrame for a regular expression match?

I need to check several DataFrames. If a DataFrame does not contain a match for a regular expression, I need to discard it. I don't know in advance which column the match will be in.

How can I check a whole DataFrame for a regular expression match without looping over its columns?

This is how I do it now:

import pandas as pd
import re

# read file
folder = 'folder_path'
file = 'file_name.html'
html_df = pd.read_html(folder + '/' + file)

# pattern to look for: values ending in _TOM or _TOD
html_match = re.compile(r'_TOM$|_TOD$')
# collect the indices of the DataFrames that contain a match
df_check = []
for i, df in enumerate(html_df):
    for col in df.columns:
        try:
            # na=False counts missing values as non-matches
            if df[col].str.contains(html_match, na=False).any():
                df_check.append(i)
                break  # record each DataFrame only once
        except AttributeError:
            continue

Solution

The logic is not fully clear, but if I understand correctly, you want to filter the output of read_html (which is a list of DataFrames) and keep only the DataFrames in which at least one cell matches the pattern:

import numpy as np
import pandas as pd

html_df = [pd.DataFrame([['A', 'B', 'C_TOM'], ['D', 'E', 'F']]),
           pd.DataFrame([['A', 'B', 'C'], ['D', 'E', 'F']]),
           pd.DataFrame([['A', 'B_TOD', 'C'], ['D', 'E', 'F']]),
          ]

out = []

for d in html_df:
    # test every column; na=False counts missing values as non-matches
    if np.any(d.apply(lambda s: s.str.contains(r'_TOM$|_TOD$', na=False))):
        out.append(d)

Or as a list comprehension:

out = [d for d in html_df
       if np.any(d.apply(lambda s: s.str.contains(r'_TOM$|_TOD$', na=False)))]

Output:

[   0  1      2
 0  A  B  C_TOM
 1  D  E      F,
    0      1  2
 0  A  B_TOD  C
 1  D      E  F]
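
One caveat: the example frames above contain only strings, whereas tables parsed by read_html often also have numeric columns, and the .str accessor raises AttributeError on those (which is presumably why the question's loop catches that exception). A minimal sketch of one way around this, assuming it is acceptable to compare every cell by its text representation; the helper name has_match and the sample frames are made up for illustration:

```python
import pandas as pd

PATTERN = r'_TOM$|_TOD$'


def has_match(df, pattern=PATTERN):
    """Return True if any cell of df, viewed as text, matches pattern."""
    # Cast every column to text so that .str works on numeric columns too.
    as_text = df.astype(str)
    # na=False counts values that are still missing as non-matches.
    hits = as_text.apply(lambda s: s.str.contains(pattern, na=False))
    return bool(hits.to_numpy().any())


# Hypothetical frames with mixed dtypes and missing values.
frames = [
    pd.DataFrame({'a': ['A', 'C_TOM'], 'b': [1, 2]}),
    pd.DataFrame({'a': ['A', None], 'b': [1.5, float('nan')]}),
    pd.DataFrame({'a': [1, 2], 'b': ['B_TOD', 'x']}),
]

# Indices of matching frames, like df_check in the question.
kept = [i for i, d in enumerate(frames) if has_match(d)]
# The matching frames themselves, like out above.
out = [d for d in frames if has_match(d)]
```

Casting with astype(str) copies each frame, so for very large tables it may be cheaper to apply the test only to the text columns.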