regex uipath

Regex Get Everything Between 2 Words


I need to find the last occurrence of the word "Madhuparna" in a text and get everything from the nearest HTML opening tag to its left through the last HTML closing tag in the text.

  • Word to find:

Madhuparna

  • Text input:

<p>The entire purpose speed up the process.</p><p>June 5, 2021 By Demo</p>\r\n<p>The entire purpose of a terminal emulator is to imitate how the regular computer terminals perform</p><p>Allowing the main computer to connect Madhuparna to and use a remote computer</p><li>bla bla bla bla bla bla</li>

  • Result that I need:

<p>Allowing the main computer to connect Madhuparna to and use a remote computer</p><li>bla bla bla bla bla bla</li>

  • What I have so far, which does not work:

/<(\S+)(>| .*?>)[^<>]*Madhuparna[^<>]*<\/\1>/g


Solution

  • You can use

    (?s)<\w+(?:\s[^>]*)?>[^<>]*Madhuparna.*</\w+>
    

    Details:

    • (?s) - an inline singleline flag, so that . also matches line break chars
    • < - a < char
    • \w+ - one or more word chars
    • (?:\s[^>]*)? - an optional occurrence of a whitespace and then zero or more chars other than >
    • > - a > char
    • [^<>]* - zero or more chars other than < and >
    • Madhuparna - a substring
    • .* - any zero or more chars, as many as possible, so the match extends to the last closing tag in the text
    • </\w+> - a </ string, one or more word chars, and then a > char.
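
The pattern above can be checked against the sample input with a short script. This is a minimal sketch in Python rather than UiPath, assuming Python's `re` module handles this pattern the same way as the .NET engine behind UiPath. The helper names `from_first` and `from_last` and the `LAST_PATTERN` variant are illustrative assumptions and not part of the original answer. The variant is there because the question asks for the last occurrence of the word: the original pattern starts at the first element whose text contains it, which is the same thing only when the word appears once.

```python
import re

# Pattern from the answer; (?s) lets "." match line breaks as well.
PATTERN = re.compile(r"(?s)<\w+(?:\s[^>]*)?>[^<>]*Madhuparna.*</\w+>")

# Hypothetical variant (not from the answer): a leading greedy ".*" pushes
# the start of group 1 to the LAST element whose text contains the word.
LAST_PATTERN = re.compile(r"(?s).*(<\w+(?:\s[^>]*)?>[^<>]*Madhuparna.*</\w+>)")

# Sample input from the question.
text = (
    "<p>The entire purpose speed up the process.</p><p>June 5, 2021 By Demo</p>\r\n"
    "<p>The entire purpose of a terminal emulator is to imitate how the regular "
    "computer terminals perform</p><p>Allowing the main computer to connect "
    "Madhuparna to and use a remote computer</p><li>bla bla bla bla bla bla</li>"
)


def from_first(s):
    """Return the match starting at the first element containing the word."""
    m = PATTERN.search(s)
    return m.group(0) if m else None


def from_last(s):
    """Return the match starting at the last element containing the word."""
    m = LAST_PATTERN.search(s)
    return m.group(1) if m else None


if __name__ == "__main__":
    print(from_first(text))
    print(from_last(text))
```

On the sample input both helpers return the same substring, because the word occurs only once. They differ only when the word appears in more than one element.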