
Regex: How to capture one set of parentheses, but not the next


I have the following data.

Nike.com (Nike) (Apparel)
Adidas.com (Adidas) (Footwear)
Under Armour (Accessories)
Lululemon (Apparel)

I am trying to capture the company name, but not the type of product. Specifically, I want to capture:

Nike.com (Nike)
Adidas.com (Adidas) 
Under Armour
Lululemon

Using this RegEx:

(.+? \(.+?\))

I get the following:

Nike.com (Nike)
Adidas.com (Adidas)
Under Armour (Accessories)
Lululemon (Apparel)

This works for Nike and Adidas, but it doesn't work for Under Armour or Lululemon. The type of product will always be at the end of the line. I've tried the following with no success:

(.+? \(.+?\)(?!Accessories|Apparel|Footwear))

(.+? \(.+?\)(?!.*Accessories|.*Apparel|.*Footwear).*)
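For reference, the behaviour described above can be reproduced with a short Python sketch. It assumes Python's `re` module and per-line matching with `re.search`, neither of which is specified in the question, and the variable names are illustrative only. It compares the original pattern with the first lookahead attempt.

```python
import re

# Sample lines copied from the question.
lines = [
    "Nike.com (Nike) (Apparel)",
    "Adidas.com (Adidas) (Footwear)",
    "Under Armour (Accessories)",
    "Lululemon (Apparel)",
]

# The original pattern: lazy match up to the FIRST parenthesized group.
original = re.compile(r"(.+? \(.+?\))")

# First attempt from the question: a negative lookahead placed right after ")".
with_lookahead = re.compile(r"(.+? \(.+?\)(?!Accessories|Apparel|Footwear))")

original_hits = [original.search(s).group(1) for s in lines]
lookahead_hits = [with_lookahead.search(s).group(1) for s in lines]

for before, after in zip(original_hits, lookahead_hits):
    print(f"{before!r:32} {after!r}")
```

Both patterns return the same four strings. The lookahead inspects the text immediately after the closing parenthesis, which is either the end of the line or " (", never one of the product words, so it never rejects a match.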

Solution

  • You seem to want everything up to the parenthesized substring at the end of the string.

    You may use:

    ^(.+?) *\([^()]+\)$
    


    Details

    • ^ - start of string
    • (.+?) - Group 1: any one or more chars other than line break chars, as few as possible
    •  * - zero or more spaces (the token is a space followed by *)
    • \( - a ( char
    • [^()]+ - 1+ chars other than ( and )
    • \) - a ) char
    • $ - end of string.
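As a rough check, the pattern can be tried against the sample data from the question. The sketch below assumes Python's `re` module; the answer does not name a regex flavour, and `company_name` is a hypothetical helper introduced only for illustration.

```python
import re

# Sample lines copied from the question.
lines = [
    "Nike.com (Nike) (Apparel)",
    "Adidas.com (Adidas) (Footwear)",
    "Under Armour (Accessories)",
    "Lululemon (Apparel)",
]

# Group 1 holds everything before the final parenthesized substring.
pattern = re.compile(r"^(.+?) *\([^()]+\)$")


def company_name(line):
    """Return Group 1 for a single line, or None if the line does not match."""
    m = pattern.match(line)
    return m.group(1) if m else None


names = [company_name(line) for line in lines]
print(names)

# If the data arrives as one multi-line string, ^ and $ need to anchor at
# line boundaries, which in Python requires the MULTILINE flag.
text = "\n".join(lines)
names_multiline = re.findall(r"^(.+?) *\([^()]+\)$", text, flags=re.MULTILINE)
print(names_multiline)
```

With this input, both approaches should yield Nike.com (Nike), Adidas.com (Adidas), Under Armour and Lululemon. A line with no trailing parenthesized part, or with whitespace after the final ), does not match, so `company_name` returns None for it.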