Search code examples
regexgoogle-sheets-formularegexp-replace

Removing everything between 2 strings with Google sheets RE2


I'm trying to remove something from a product title as part of a Google sheet

  • Example Johner Gladstone Pinot Noir 2015, 75CL
  • Stella Artois Premium Lager Bottle, 1 X 660 Ml
  • Pepesza Ppsh-40 Vodka Tommy Gun, 1 L

And I want to be able to remove everything from the , and either the CL, ML or L.

The problem I'm running into is that I don't know enough about regex and I'm struggling to find a good place to learn!

What I've tried so far is below

  • =REGEXREPLACE(A2,"[, ]\QML|CL\E","")

    but this doesn't work and I think its because [, ] isn't a valid part.

  • =REGEXREPLACE(A2,"\*\QML|CL\E","")

    because I know that , is the only punctuation in the titles - I've also tried this but not been successful.


Solution

  • What you are trying to get is

    (?i), .*?[CM]?L
    

    See the regex demo. Details:

    • (?i) - case insensitive flag
    • , .*? - comma, space, and then any zero or more chars other than line break chars, as few as possible (due to *?, if you need as many as possible use * instead)
    • [CM]?L - C or M (optionally due to ?) and then an L char.

    However, you can simply match from a , + space till the end of the line:

    ", .*
    

    See this regex demo. Here, the first comma+space is matched and then the rest of the string (line, since . does not match line breaks by default).

    See the regular expression syntax accepted by RE2.