string, lua, lua-patterns

Lua: split string into words unless quoted


I have the following code to split a string on whitespace:

text = "I am 'the text'"
for string in text:gmatch("%S+") do
    print(string)
end

The result:

I
am
'the
text'

But I need this output instead:

I
am
the text --[[yep, without the quotes]]

How can I do this?

Edit: to add some context, the idea is to pass parameters from one program to another. Here is the pull request I am working on, currently in review: https://github.com/mpv-player/mpv/pull/1619


Solution

  • There may be ways to do this with clever parsing, but an alternative is to keep a small amount of state and merge the whitespace-separated fragments: once a fragment opens a quote, keep appending fragments until one closes it. Something like this may work:

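    local text = [[I "am" 'the text' and "some more text with '" and "escaped \" text"]]
    -- spat/epat match a quote at the start/end of a fragment;
    -- buf holds the quoted run being merged, quoted is its opening quote
    local spat, epat, buf, quoted = [=[^(['"])]=], [=[(['"])$]=]
    for str in text:gmatch("%S+") do
      local squoted = str:match(spat)
      local equoted = str:match(epat)
      -- backslashes right before a trailing quote; an odd count means it is escaped
      local escaped = str:match([=[(\*)['"]$]=])
      if squoted and not quoted and not equoted then
        -- fragment opens a quote without closing it: start buffering
        buf, quoted = str, squoted
      elseif buf and equoted == quoted and #escaped % 2 == 0 then
        -- fragment closes the open quote: emit the merged string
        str, buf, quoted = buf .. ' ' .. str, nil, nil
      elseif buf then
        -- still inside the quote: keep merging
        buf = buf .. ' ' .. str
      end
      -- outside a quote, print the fragment with any leading and trailing quote removed
      if not buf then print((str:gsub(spat,""):gsub(epat,""))) end
    end
    if buf then print("Missing matching quote for "..buf) end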
    

    This will print:

    I
    am
    the text
    and
    some more text with '
    and
    escaped \" text
    

    Updated to handle mixed and escaped quotes. Updated to remove quotes. Updated to handle quoted words.
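
For comparison, the "clever parsing" route mentioned above could look roughly like the sketch below: a character-by-character scanner that returns the fields in a table instead of printing them. The name `split_quoted` is hypothetical, not part of the original answer, and the behavior differs in two ways that are assumptions on my part: whitespace inside quotes is preserved exactly (the fragment-merging version collapses it to single spaces), and a backslash inside quotes is consumed so that `\"` yields a bare `"`.

```lua
-- Hypothetical helper (not from the original answer): split `text` on
-- whitespace, treating '...' and "..." as part of a single field.
local function split_quoted(text)
  local fields = {}      -- completed fields
  local cur = {}         -- characters of the field being built
  local quote = nil      -- the open quote character, or nil outside quotes
  local started = false  -- true once the current field has begun (allows "")
  local i, n = 1, #text
  while i <= n do
    local c = text:sub(i, i)
    if quote then
      if c == "\\" and i < n then
        -- Assumed escape rule: a backslash makes the next character literal.
        i = i + 1
        cur[#cur + 1] = text:sub(i, i)
      elseif c == quote then
        quote = nil              -- closing quote: dropped from the output
      else
        cur[#cur + 1] = c
      end
    elseif c == "'" or c == '"' then
      quote, started = c, true   -- opening quote: dropped from the output
    elseif c:match("%s") then
      if started then            -- whitespace outside quotes ends the field
        fields[#fields + 1] = table.concat(cur)
        cur, started = {}, false
      end
    else
      cur[#cur + 1] = c
      started = true
    end
    i = i + 1
  end
  if quote then
    return nil, "missing closing " .. quote
  end
  if started then
    fields[#fields + 1] = table.concat(cur)
  end
  return fields
end

for _, field in ipairs(split_quoted([[I am 'the  text' "say \"hi\""]])) do
  print(field)
end
```

Returning `nil` plus a message for an unterminated quote lets the caller decide how to report it. If the backslashes should be kept in the output, as in the version above, append `c` as well before skipping to the next character.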