lua, lua-patterns

Trying to get some kind of key:value data from a string in Lua


I'm stuck on patterns (again), so let's see if a little help gets me through. I have a string, e.g. one returned by a function, that contains the following:

📄 My Script
ScriptID:RL_SimpleTest
Version:0.0.1
ScriptType:MenuScript
AnotherKey:AnotherValue
And, maybe, some more text...

I want to parse it line by line and, if a line contains a ":", put the part to the left of it in one variable (k) and the part to the right in another (v). For the second line, for example, k would contain "ScriptID" and v would contain "RL_SimpleTest" (the first line should simply be ignored), and so on.

Well, I've started with something like this:

function RL_Test:StringToKeyValue(str, sep1, sep2)
    sep1 = sep1 or "\n"
    sep2 = sep2 or ":"
    local t = {}
    for line in string.gmatch(str, "([^" .. sep1 .. "]+)") do
        print(line)
        for k in string.gmatch(line, "([^" .. sep2 .. "]+)") do --Here is where I'm lost trying to get the key/value pair separately and at the same time...
            --t[k] = v
            print(k)
        end
    end
    return t
end

My hope was that, once I had isolated a line containing data in the key:value form, I could do something like for k, v in string.gmatch(line, "([^" .. sep2 .. "]+)") and get both pieces of data at once. Of course that doesn't work, and although I have a feeling the fix is trivial, I don't even know where to start, as always because I don't understand patterns well enough.

Well, I hope I at least explained it clearly. Thanks in advance for any help.


Solution

  • I assume every line is of the format k:v, containing exactly one colon, or containing no colon (no k/v pair).

    Then you can simply first match nonempty lines using [^\n]+ (assuming UNIX LF line endings), then match each line using ^([^:]+):([^:]+)$. Breakdown of the second pattern:

    • ^ and $ are anchors. They force the pattern to match the entire line.
    • ([^:]+) matches and captures one or more non-colon characters.

    This leaves you with:

    function RL_Test:StringToKeyValue(str)
        local t = {}
        for line in str:gmatch"[^\n]+" do
            local k, v = line:match"^([^:]+):([^:]+)$"
            if k then -- line is k:v pair?
                t[k] = v
            end
        end
        return t
    end
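
As a quick check, the function above can be run against the sample text from the question. This is only a sketch: the empty RL_Test table is a hypothetical stand-in for the asker's real object, and the sample string is assembled with table.concat just to keep the lines readable.

```lua
-- Hypothetical stand-in for the asker's RL_Test object.
local RL_Test = {}

function RL_Test:StringToKeyValue(str)
    local t = {}
    for line in str:gmatch("[^\n]+") do
        -- k and v are nil unless the line holds exactly one colon
        -- with at least one character on each side of it.
        local k, v = line:match("^([^:]+):([^:]+)$")
        if k then
            t[k] = v
        end
    end
    return t
end

-- Sample input from the question, joined with LF line endings.
local sample = table.concat({
    "📄 My Script",
    "ScriptID:RL_SimpleTest",
    "Version:0.0.1",
    "ScriptType:MenuScript",
    "AnotherKey:AnotherValue",
    "And, maybe, some more text...",
}, "\n")

local result = RL_Test:StringToKeyValue(sample)
for k, v in pairs(result) do
    print(k, v) -- pairs() iterates in no particular order
end
```

The first and last lines contain no colon, so they produce no entry; the table ends up with the four keys ScriptID, Version, ScriptType and AnotherKey.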
    

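    If you want to support Windows CRLF line endings, use for line in (str..'\n'):gmatch'(.-)\r?\n' do, as in Piglet's answer, to match the lines instead. Unlike [^\n]+, this loop also yields empty lines, which is harmless here because an empty line never matches the k:v pattern.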
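
A minimal sketch of the CRLF-tolerant variant, assuming the same k:v pattern as above. The name parse_key_values is made up for this example (it is not from the question or from Piglet's answer), and the function is written as a plain local function rather than a method to keep the snippet self-contained.

```lua
-- Hypothetical helper name; same logic as StringToKeyValue above,
-- but each line may end in either "\n" or "\r\n".
local function parse_key_values(str)
    local t = {}
    -- Appending "\n" makes sure the last line is terminated too.
    for line in (str .. "\n"):gmatch("(.-)\r?\n") do
        local k, v = line:match("^([^:]+):([^:]+)$")
        if k then
            t[k] = v
        end
    end
    return t
end

local unix = parse_key_values("ScriptID:RL_SimpleTest\nVersion:0.0.1")
local windows = parse_key_values("ScriptID:RL_SimpleTest\r\nVersion:0.0.1\r\n")

print(unix.Version, windows.Version)
```

With the plain [^\n]+ loop, a CRLF input would leave a trailing "\r" at the end of every value; here the optional \r is consumed by the line pattern, so both inputs give the same table.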

    This answer differs from Piglet's answer in that it uses match instead of gmatch for matching the k/v pairs, allowing exactly one k/v pair with exactly one colon per line, whereas Piglet's code may extract multiple k/v pairs per line.
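
To illustrate what "exactly one colon per line" means in practice, here is a small sketch that feeds a few made-up lines to the anchored pattern. The test lines are invented for this example; it does not reproduce Piglet's code.

```lua
local pattern = "^([^:]+):([^:]+)$"

-- Made-up lines covering the edge cases of the anchored pattern.
local lines = {
    "ScriptID:RL_SimpleTest", -- exactly one colon: accepted
    "Time:12:30",             -- two colons: rejected
    "NoColonHere",            -- no colon: rejected
    "Key:",                   -- empty value: rejected, since + needs one character
    ":Value",                 -- empty key: rejected for the same reason
}

for _, line in ipairs(lines) do
    local k, v = line:match(pattern)
    print(line, k, v) -- k and v are nil for rejected lines
end
```

If values are allowed to contain colons (as in the hypothetical "Time:12:30"), the second capture could be changed to (.+), which splits on the first colon only; that is a different assumption from the one stated at the top of this answer.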