regex uipath

How can I get specific content before and after a keyword using regex?


I have the data:


P.C 115 P.B 372 Page 2 of 2

Subscriber Number 123456
Service Details Bill Period
SingleBill Educ Plans-Schools College 500Mb 500MO From 01/11/2021 To 30/11/2021
Static IP 1 From 01/11/2021 To 30/11/2021
Local Only From 01/11/2021 To 30/11/2021
Fixed Line Provisioning From 01/11/2021 To 30/11/2021
Discounts
Static IP 100% Rental Discount

Subscriber Number 763848
Service Details Bill Period
SingleBill Educ Plans-Schools College 300Mb 200AB From 01/11/2021 To 30/11/2021
Fixed Line Provisioning From 01/11/2021 To 30/11/2021


I want to get the "Subscriber Number" and the corresponding "Discount" for each subscriber that has a "Discounts" entry. Is there a way to do this using regex?

I'm using the PDF activities in UiPath to read the text from a PDF.

The 'Read PDF' activity returns a String.

I'm then trying to write a regex, using lookahead and lookbehind, that gets the Subscriber Number and the Discount description for each subscriber that has a discount.

I tried (?<=Subscriber Number)(.*)(?=\n). It captures the Subscriber Number, but not the text between the Subscriber Number line and the Discounts entry.


Solution

  • You can capture both values with

    (?m)^Subscriber\s+Number\s+(\d+)(?:\r?\n(?!Discounts).+)*\r?\nDiscounts\s+(.+)
    

    Details:

    • (?m)^ - start of a line
    • Subscriber\s+Number\s+ - Subscriber, one or more whitespaces, Number, one or more whitespaces
    • (\d+) - Group 1: one or more digits
    • (?:\r?\n(?!Discounts).+)* - any zero or more repetitions of
      • \r?\n - an optional carriage return and then a line feed char
      • (?!Discounts).+ - a non-empty line that does not start with Discounts
    • \r?\n - an optional carriage return and then a line feed char
    • Discounts - a Discounts string
    • \s+ - one or more whitespaces
    • (.+) - Group 2: any one or more chars other than a line feed char.
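
To see how the two groups come out in practice, here is a minimal sketch that runs the same pattern over the sample data from the question. It is written in Python purely for illustration; UiPath uses the .NET regex engine, and this pattern relies only on syntax that both engines share. The `find_discounts` helper and the `TEXT` constant are hypothetical names, not part of UiPath or the original answer.

```python
import re

# Sample text shaped like the 'Read PDF' output shown in the question.
TEXT = """P.C 115 P.B 372 Page 2 of 2

Subscriber Number 123456
Service Details Bill Period
SingleBill Educ Plans-Schools College 500Mb 500MO From 01/11/2021 To 30/11/2021
Static IP 1 From 01/11/2021 To 30/11/2021
Local Only From 01/11/2021 To 30/11/2021
Fixed Line Provisioning From 01/11/2021 To 30/11/2021
Discounts
Static IP 100% Rental Discount

Subscriber Number 763848
Service Details Bill Period
SingleBill Educ Plans-Schools College 300Mb 200AB From 01/11/2021 To 30/11/2021
Fixed Line Provisioning From 01/11/2021 To 30/11/2021
"""

# The pattern from the answer, unchanged.
PATTERN = re.compile(
    r"(?m)^Subscriber\s+Number\s+(\d+)(?:\r?\n(?!Discounts).+)*\r?\nDiscounts\s+(.+)"
)


def find_discounts(text):
    """Return (subscriber_number, discount) pairs for subscribers with a Discounts entry."""
    # strip() drops the trailing '\r' that '.+' keeps when the input uses CRLF line endings.
    return [(m.group(1), m.group(2).strip()) for m in PATTERN.finditer(text)]


print(find_discounts(TEXT))
```

On this sample only subscriber 123456 is returned, because the block for 763848 has no Discounts line. In UiPath the equivalent would be iterating over the matches and reading groups 1 and 2 of each one; trimming group 2 is worth keeping there as well if the PDF text uses CRLF line endings.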