How to search for text across multiple rows in a pandas dataframe?

So I'm quite new to Python, and I was just wondering if it is possible for me to use it in order to search for text across multiple rows. Here is a screenshot of my dataframe:

https://i.sstatic.net/jeqpv.png

To make it clearer, what I would like to do is search for phrases or expressions containing more than one word, such as 'New Jersey,' however, each word makes up a separate row so I do not know how to go about including more than one row in the query. I would also, if possible, like to create a new column which will label any matches with 'M' and those without 'N.' All help is appreciated to make this easier for me!

Solution

The idea is to join all rows to be able to search multiple continuous words.

For example, we want to find the phrase "she wants to" in whole dataframe:

>>> df
   subtitle
0       She  # <- start here (1)
1     wants  #
2        to  # <- end here (1)
3      sing
4       she  # <- start here (2)
5     wants  #
6        to  # <- end here (2)
7       act
8       she  # <- start here (3)
9     wants  # 
10       to  # <- end here (3)
11    dance

import re

search = "she wants to"
text = " ".join(df["subtitle"])

# index of start / end position of the word in text
end = df["subtitle"].apply(len).cumsum() + pd.RangeIndex(len(df))
start = end.shift(fill_value=-1) + 1

# create additional columns
df["start"] = start.tolist()
df["end"] = end.tolist()
df["match"] = False

# find all iteration of the search text
for match in re.finditer(search, text, re.IGNORECASE):
    idx1 = df[df["start"] == match.start()].index[0]
    idx2 = df[df["end"] == match.end()].index[0]
    df.loc[idx1:idx2, "match"] = True

>>> df
   subtitle  start  end  match
0       She      0    3   True
1     wants      4    9   True
2        to     10   12   True
3      sing     13   17  False
4       she     18   21   True
5     wants     22   27   True
6        to     28   30   True
7       act     31   34  False
8       she     35   38   True
9     wants     39   44   True
10       to     45   47   True
11    dance     48   53  False

Update: search for multiple terms:

Change only:

# search = "she wants to"
search = ["she wants to", "if you", "I will"]
search = fr"({'|'.join(search)})"

# df = pd.DataFrame({'subtitle': ['She', 'wants', 'to', 'sing', 'she', 'wants', 'to', 'act', 'she', 'wants', 'to', 'dance', 'If', 'you', 'sing', 'I', 'will', 'smile', 'if', 'you', 'laugh', 'I', 'will', 'smile', 'if', 'you', 'love', 'I', 'will', 'smile']})
>>> df
   subtitle  start  end  match
0       She      0    3   True
1     wants      4    9   True
2        to     10   12   True
3      sing     13   17  False
4       she     18   21   True
5     wants     22   27   True
6        to     28   30   True
7       act     31   34  False
8       she     35   38   True
9     wants     39   44   True
10       to     45   47   True
11    dance     48   53  False
12       If     54   56   True
13      you     57   60   True
14     sing     61   65  False
15        I     66   67   True
16     will     68   72   True
17    smile     73   78  False
18       if     79   81   True
19      you     82   85   True
20    laugh     86   91  False
21        I     92   93   True
22     will     94   98   True
23    smile     99  104  False
24       if    105  107   True
25      you    108  111   True
26     love    112  116  False
27        I    117  118   True
28     will    119  123   True
29    smile    124  129  False

Update 2: terms into text file:

$ cat terms.txt
she wants to
if you
I will

search = [term.strip() for term in open("terms.txt").readlines()]
search = fr"({'|'.join(search)})"