c# regex vb.net uipath

Regex to extract data between two strings


I've tried the regex below, but it does not seem to work. I want to extract the data between F. Prepaids and G. Initial Escrow Payment and get the sample result shown further down. Thanks.

#my regex

(?<=F. Prepaids)[\S\s]*?(?= G. Initial Escrow Payment)

#String

F. Prepaids $887.01
01 Homeowner's Insurance Premium ( 12 mo.) toAmerican Family  $893.00
Insura
02 Mortgage Insurance Premium (     mo.)
03 Prepaid Interest ($5.99 per day from 10/02/2020 to 10/01/2020) -$5.99
04 Property Taxes (     mo.)
05
06
07
08
G. Initial Escrow Payment at Closing $3,776.11

Once I have the data in between, I also want a regex that returns the following result from the string above. Note that some items wrap onto a new line.

Homeowner's Insurance Premium ( 12 mo.) to American Family Insura
Mortgage Insurance Premium ( mo.)
Prepaid Interest ($5.99 per day from 10/02/2020 to 10/01/2020)
Property Taxes (     mo.)

Any ideas on this one? Thank you.


Solution

  • You may use

    (?m)(?<=F\. Prepaids[\s\S]*?^\d+ )[^\r\n]+(?:\r?\n[^\n\d][^\r\n]*)?(?=[\s\S]*?\nG\. Initial Escrow Payment)
    


    Details

    • (?m) - multiline mode on
    • (?<=F\. Prepaids[\s\S]*?^\d+ ) - match a location immediately preceded by F. Prepaids, then any zero or more chars, as few as possible, then one or more digits at the start of a line, and then a space (an unbounded lookbehind like this is supported by the .NET regex engine)
    • [^\r\n]+ - one or more chars other than CR and LF
    • (?:\r?\n[^\n\d][^\r\n]*)? - an optional sequence of a CRLF or LF line ending, a char that is neither a newline nor a digit, and then any zero or more chars other than CR and LF (this picks up a wrapped continuation line such as Insura)
    • (?=[\s\S]*?\nG\. Initial Escrow Payment) - the current location must be followed by
      • [\s\S]*? - any zero or more chars, as few as possible
      • \n - a newline
      • G\. Initial Escrow Payment - the literal text G. Initial Escrow Payment.
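
    For reference, here is a minimal C# sketch of how the pattern above could be applied to the sample string from the question. The class name PrepaidsDemo, the ExtractItems and Clean helpers, and the TrailingAmount pattern are illustrative assumptions, not part of the original answer. The cleanup step is only one possible way to drop the trailing dollar amounts and join a wrapped line, since the pattern itself returns those amounts as part of each match.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class PrepaidsDemo
{
    // Pattern copied verbatim from the answer; it needs .NET's unbounded lookbehind.
    private const string ItemPattern =
        @"(?m)(?<=F\. Prepaids[\s\S]*?^\d+ )[^\r\n]+(?:\r?\n[^\n\d][^\r\n]*)?(?=[\s\S]*?\nG\. Initial Escrow Payment)";

    // Assumption (not in the answer): a trailing amount looks like "$893.00" or "-$5.99".
    private const string TrailingAmount = @"\s*-?\$[\d,]+(?:\.\d{2})?$";

    // Sample input from the question, joined with LF line endings.
    public static readonly string Sample = string.Join("\n", new[]
    {
        "F. Prepaids $887.01",
        "01 Homeowner's Insurance Premium ( 12 mo.) toAmerican Family  $893.00",
        "Insura",
        "02 Mortgage Insurance Premium (     mo.)",
        "03 Prepaid Interest ($5.99 per day from 10/02/2020 to 10/01/2020) -$5.99",
        "04 Property Taxes (     mo.)",
        "05",
        "06",
        "07",
        "08",
        "G. Initial Escrow Payment at Closing $3,776.11"
    });

    // Returns the raw regex matches, one per numbered line that has text after the number.
    public static List<string> ExtractItems(string text)
    {
        return Regex.Matches(text, ItemPattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToList();
    }

    // Hypothetical cleanup: strips a trailing amount from each line of a match,
    // then joins a wrapped continuation line with a single space.
    public static string Clean(string item)
    {
        var parts = item.Split('\n')
                        .Select(line => Regex.Replace(line.Trim(), TrailingAmount, ""));
        return string.Join(" ", parts);
    }

    public static void Main()
    {
        foreach (string item in ExtractItems(Sample))
        {
            Console.WriteLine(Clean(item));
        }
    }
}
```

    With the sample above, this should yield four items. The first one keeps "toAmerican" as a single word, because the missing space is already in the input text and the regex does not try to repair it.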