I want to extract the text between Love and OK with the following code, but it does not work.
document = "This is a document with random words Love apples oranges pears OK some thing Love jeep plane car OK any more Love water cola coffee OK bra bra."
x = re.search("^Love.*OK$", document)
I want to get the following text: apples oranges pears jeep plane car water cola coffee
We can try using your current regex pattern (modified slightly) with re.findall to find all substring matches. Then, join the resulting list together into a single string.
import re

document = "This is a document with random words Love apples oranges pears OK some thing Love jeep plane car OK any more Love water cola coffee OK bra bra."
# Capture the text between each Love ... OK pair, as few characters as possible.
matches = re.findall(r'\bLove (.*?) OK\b', document)
print(' '.join(matches))
This prints:
apples oranges pears jeep plane car water cola coffee
Explanation:
The regex pattern \bLove (.*?) OK\b will capture the content between each Love ... OK set of markers. In this case, that generates three substrings. We then combine the list returned by re.findall into a single string using join().
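To see why the original attempt returned nothing, it may help to compare the three variants side by side. The sketch below is only illustrative: the variable names (anchored, greedy, lazy, result) are made up for this example, and the document string is the one from the question, split across two lines for readability.

```python
import re

document = ("This is a document with random words Love apples oranges pears OK "
            "some thing Love jeep plane car OK any more Love water cola coffee OK bra bra.")

# Original attempt: ^ and $ anchor the match to the start and end of the string,
# so it can only succeed if the whole document begins with Love and ends with OK.
anchored = re.search(r"^Love.*OK$", document)

# Without anchors but with a greedy .*, the match runs from the first Love
# to the last OK, so everything in between ends up in one capture.
greedy = re.findall(r"\bLove (.*) OK\b", document)

# The lazy .*? stops at the nearest following OK, giving one capture per pair.
lazy = re.findall(r"\bLove (.*?) OK\b", document)

result = " ".join(lazy)

print(anchored)     # None
print(len(greedy))  # 1
print(lazy)         # ['apples oranges pears', 'jeep plane car', 'water cola coffee']
print(result)       # apples oranges pears jeep plane car water cola coffee
```

If the text between the markers can span several lines, passing flags=re.DOTALL to re.findall lets the dot match newlines as well; that is not needed for the single-line string shown here.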