Search code examples
regexgoogle-sheetsgoogle-sheets-formulare2

How to make Google Sheets REGEXTRACT RegEx function do "greedy" pattern matching


I'm trying to build a pattern to select "1.2." from "1.2.3" (i.e. select every digit and dot before the last digit group) to work in REGEXEXTRACT sheet function in Google Sheets.

I came up with the pattern ^([0-9]+\.)+ which presumably corresponds to both Perl and Google/re2 specification. So, my code is:

=REGEXEXTRACT("1.2.3";"^([0-9]+\.)+")

I tested it on https://regexr.com/ where it returns "1.2." and on https://regex101.com/ where it returns "1.2." as the whole result and "2." as group 1.

But on Google Sheets, it returns "2." only.

I think that it is connected to either:

  1. "ungreedy" evaluation by REGEXEXTRACT (which is contrary to Google/re2 specification;
  2. the above plus it returns only the first group;

I tried a way-around, so to make good of item 2 above. So, I changed the pattern to (^([0-9]+\.)+). I tested it on https://regexr.com/ where it returns "1.2." and on https://regex101.com/ where it returns "1.2." as the whole result and as group 1. Unfortunately, it triggers #REF! error in Google Sheets.

So, my questions - what is the way to make greedy selection of the longest possible substring in Google Sheets or a way-around.


Solution

  • See if this helps

    =regexextract("1.2.3","^([0-9.]+)[^\.]+$")
    

    or, reference the cell with the string "1.2.3"

    =regexextract(A2, "^([0-9.]+)[^\.]+$")