I am looking through a docx document using the docx module and regex.
I have found the text immediately before the string I actually want to extract. How can I reference the string that follows it? Can I use the index at all?
for table in wordDoc.tables:
    for row in table.rows:
        for cell in row.cells:
            # grabbing the Payment Total Amount
            if 'Total Payment Amount:' in cell.text:
                print(cell.text)
                print(cell.text.index)
Output:
Total Payment Amount:
<built-in method index of str object at 0x000001F9376D26C0>
Something like this should give you the idea:
>>> text = "The quick brown fox"
>>> key = "quick"
>>> start = text.index(key)
>>> start
4
>>> text[start:]
'quick brown fox'
>>> text[start+len(key):]
' brown fox'
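One thing the session above does not show: str.index() raises ValueError when the key is not in the string, whereas str.find() returns -1 in that case. A minimal sketch of a guarded version follows; the helper name suffix_after is made up for illustration.

```python
def suffix_after(text, key):
    """Return the part of text after the first occurrence of key, or None if key is absent."""
    start = text.find(key)  # find() returns -1 instead of raising ValueError
    if start == -1:
        return None
    # Skip past the key itself by adding its length to the starting offset.
    return text[start + len(key):]


print(repr(suffix_after("The quick brown fox", "quick")))  # ' brown fox'
print(repr(suffix_after("The quick brown fox", "slow")))   # None
```

Returning None for a missing key is just one possible choice; wrapping .index() in try/except ValueError would work equally well.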
A few of the finer points:
.index() is a method, not a property, so you need to call it with the key value you're after.
.index() gives you the starting offset of the key within the string, so you need to add the length of the key to locate the suffix.
"Slicing" a string to get a suffix is accomplished with an open-ended range (e.g. s[n:]). Search on "python string slice" to find more on how that works.
You may need to account for spaces between words. Using the .lstrip() method is probably best for that, since it works for no spaces, one space, or multiple spaces.
>>> text[start+len(key):].lstrip()
'brown fox'