Tags: regex, ruby, regex-greedy

Ruby non-greedy modifier did not apply?


I have a regexp with a non-greedy modifier that does not seem to work. I have tried many variations of the regexp, and every other approach I could think of, without success.

I want to remove every entry with an empty string value from the string s below. I expected my regexp to remove everything of the form something="":

s = 'a,b="cde",f="",g="hi",j=""'

puts s; puts s.gsub( /,.+?="",?/ , "," ).chomp(','); nil

Expected:

a,b="cde",g="hi"

What I get:

a,g="hi"

Why isn't the .+? non-greedy in the gsub regexp above?

It works if I constrain the . to a set of characters [\w\d_-], but that forces me to make assumptions about which characters can appear in the names:

puts s; puts s.gsub( /,[\w\d_-]+?=""/ , "" ).chomp(','); nil

# outputs:
a,b="cde",f="",g="hi",j=""
a,b="cde",g="hi"

It also works if I use some sort of negated character class like:

puts s; puts s.gsub( /,[^,]+?="",?/ , "," ).chomp(','); nil

# outputs:
a,b="cde",f="",g="hi",j=""
a,b="cde",g="hi"

But still I do not understand why it did not work in the first case.


Solution

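  • A regex engine scans from left to right and takes the first position where a match can succeed. A lazy quantifier only keeps the match as short as possible from that starting position; it does not move the start further right. Your regex ,.+?="",? therefore starts at the first comma in a,b="cde",f="",g="hi",j="", the one between a and b. From there .+? expands one character at a time until ="" can match, which first happens after f, so the match is ,b="cde",f="", and b="cde" is removed along with f="".

    What you want is ,[^=]+?="",? which allows only characters other than an equal sign before ="". The match can then no longer run across b="cde", and you get a,b="cde",g="hi" as the result.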
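
    To see the difference for yourself, here is a small sketch that prints what each pattern actually matches before any replacement happens. The constant names LAZY_DOT and NO_EQUALS and the helper strip_empty_pairs are made up for this illustration; only scan, gsub and chomp come from Ruby itself.

```ruby
# Hypothetical names, for illustration only.
LAZY_DOT  = /,.+?="",?/     # pattern from the question
NO_EQUALS = /,[^=]+?="",?/  # pattern suggested in the answer

# Replace each match with a single comma, then drop a trailing comma.
def strip_empty_pairs(str, pattern)
  str.gsub(pattern, ",").chomp(",")
end

s = 'a,b="cde",f="",g="hi",j=""'

# scan returns every non-overlapping match, left to right.
p s.scan(LAZY_DOT)   # => [",b=\"cde\",f=\"\",", ",j=\"\""]
p s.scan(NO_EQUALS)  # => [",f=\"\",", ",j=\"\""]

puts strip_empty_pairs(s, LAZY_DOT)   # a,g="hi"
puts strip_empty_pairs(s, NO_EQUALS)  # a,b="cde",g="hi"
```

    The first scan shows the lazy match starting at the comma after a and growing until it reaches the ="" that follows f. Note that both patterns require a leading comma, so an empty entry at the very start of the string would be left alone; that case would need separate handling.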