c# regex vb.net uipath uipath-studio

Regex to ignore data after starting point and endpoint


How can we use a regex to keep only the data between a starting point and an endpoint, and remove everything after the endpoint? The starting point is the first date (the date is dynamic, not fixed), for example 08/03/2020. The endpoint is the last all-caps code (also dynamic, but only up to 3 capital letters), for example TRU in the string below. All data after that endpoint should be ignored or removed.

Here is my current regex:

Regex.Match(text,"(?<=08/03/2020\s+)[\S\s]*?(?=TRU)").Value.Trim

But it isn't dynamic.

The data shown below should be removed, since it comes after the span that runs from 08/03/2020 to the final TRU.

Any idea how we can design a regex for this? Thank you.

#Data to be removed

  Processing
       Co-Applicant
       No inquiry records found."

#The String

"08/03/2020        NOVUS HOME                  Mortgage Company                                                     TRU
                     MORTGAGE
   07/08/2020        FACTUAL DATA                Mortgage Reporter                                                    XPN
   07/08/2020        FCTUALDATA                                                                                       EFX
   07/08/2020        NOVUS HOME                  Mortgage Company                                                     TRU
                     MORTGAGE
   07/07/2020        CROSSCOUNTRY                Mortgage Loan                                                        TRU
                     MORTGAG
   07/07/2020        FACTUAL DATA                Mortgage Reporter                                                    XPN
   07/07/2020        FCTUALDATA                                                                                       EFX
   05/21/2020        CAP ONE NA                  Bank Credit Card                                                     XPN
   05/21/2020        CAPITAL ONE                 Credit Card                                                          TRU
   05/21/2020        CAPITALONE                  Bank                                                                 EFX
   05/20/2020        CROSSCOUNTRY                Mortgage Loan                                                        TRU
                     MORTGAG
   05/20/2020        FACTUAL DATA                Mortgage Reporter                                                    XPN
   05/20/2020        FCTUALDATA                                                                                       EFX
   05/20/2020        FINGERHUT/WEBBANK           Finance Company                                                      XPN
   05/07/2020        EMS                                                                                              EFX
   05/07/2020        GROW FINANCIAL CREDI        Credit Bureau/Mortgage                                               TRU
                                                 Processing
   Co-Applicant
   No inquiry records found."

#Expected output

   "08/03/2020        NOVUS HOME                  Mortgage Company                                                     TRU
                         MORTGAGE
       07/08/2020        FACTUAL DATA                Mortgage Reporter                                                    XPN
       07/08/2020        FCTUALDATA                                                                                       EFX
       07/08/2020        NOVUS HOME                  Mortgage Company                                                     TRU
                         MORTGAGE
       07/07/2020        CROSSCOUNTRY                Mortgage Loan                                                        TRU
                         MORTGAG
       07/07/2020        FACTUAL DATA                Mortgage Reporter                                                    XPN
       07/07/2020        FCTUALDATA                                                                                       EFX
       05/21/2020        CAP ONE NA                  Bank Credit Card                                                     XPN
       05/21/2020        CAPITAL ONE                 Credit Card                                                          TRU
       05/21/2020        CAPITALONE                  Bank                                                                 EFX
       05/20/2020        CROSSCOUNTRY                Mortgage Loan                                                        TRU
                         MORTGAG
       05/20/2020        FACTUAL DATA                Mortgage Reporter                                                    XPN
       05/20/2020        FCTUALDATA                                                                                       EFX
       05/20/2020        FINGERHUT/WEBBANK           Finance Company                                                      XPN
       05/07/2020        EMS                                                                                              EFX
       05/07/2020        GROW FINANCIAL CREDI        Credit Bureau/Mortgage                                               TRU

Solution

  • You can use

    (?ms)\A(?:\d{2}/\d{2}/\d{2}(?:\d{2})?|−−DATE−−)\s.*\s\p{Lu}{3}$
    


    Details

    • (?ms) - inline modifiers for RegexOptions.Multiline (^ now matches line start and $ matches line end positions) and RegexOptions.Singleline (. now also matches newline chars)
    • \A - start of a string
    • (?:\d{2}/\d{2}/\d{2}(?:\d{2})?|−−DATE−−) - two digits, /, two digits, / and two or four digits or −−DATE−− string
    • \s - a whitespace
    • .* - any zero or more chars, as many as possible
    • \s - a whitespace
    • \p{Lu}{3} - three uppercase letters from any language (use [A-Z]{3} instead to match only three uppercase ASCII letters)
    • $ - end of a line.
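
For illustration, here is a minimal C# sketch of how the pattern above might be applied. The helper name KeepThroughLastCode and the shortened sample text are hypothetical, not part of the original answer. The sketch also makes two assumptions of its own: it leaves out the placeholder alternative because the sample contains no placeholder, and it replaces the final $ with (?=\r?$), since in .NET a multiline $ matches only before \n and would otherwise miss lines that end in \r\n.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the text from the leading date through the last
// three-letter uppercase code that ends a line, or an empty string if no match.
static string KeepThroughLastCode(string text)
{
    // (?ms)        Multiline + Singleline inline options
    // \A           the date must be at the very start of the string
    // .*           greedy, so the match extends to the LAST qualifying code
    // (?=\r?$)     assumption: tolerate CRLF line endings as well as LF
    const string pattern = @"(?ms)\A\d{2}/\d{2}/\d{2}(?:\d{2})?\s.*\s\p{Lu}{3}(?=\r?$)";
    Match m = Regex.Match(text, pattern);
    return m.Success ? m.Value : string.Empty;
}

// Shortened, made-up excerpt in the same layout as the string in the question.
string sample =
    "08/03/2020        NOVUS HOME        Mortgage Company        TRU\n" +
    "                  MORTGAGE\n" +
    "07/08/2020        FACTUAL DATA        Mortgage Reporter        XPN\n" +
    "05/07/2020        GROW FINANCIAL CREDI        Credit Bureau/Mortgage        TRU\n" +
    "                                    Processing\n" +
    "Co-Applicant\n" +
    "No inquiry records found.";

// Prints the first four lines only; the trailing three lines are dropped.
Console.WriteLine(KeepThroughLastCode(sample));
```

Because .* is greedy, the engine first runs to the end of the input and then backtracks to the last position where whitespace, three uppercase letters, and a line end occur together, which is why the trailing "Processing", "Co-Applicant" and "No inquiry records found." lines fall outside the match. If the input can begin with whitespace or a quote character, \A will not match until that prefix is trimmed off.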