
Lua: match everything after a tag in a string


The string is like this: TEMPLATES="!$TEMPLATE templatename manufacturer model mode\n$TEMPLATE MacQuantum Wash Basic\n$$MANUFACTURER Martin\n$$MODELNAME Mac Quantum Wash\n$$MODENAME Basic\n"

My way to get strings without tags is:

    local sentence=""
    for word in string.gmatch(line,"%S+") do
      if word ~= tag then
        sentence=sentence .. word.." "
      end              
    end
    table.insert(tagValues, sentence)
    E(tag .." --> "..sentence)

And I get output:

$$MANUFACTURER --> Martin 
$$MODELNAME --> Mac Quantum Wash 
... 
...

But this is not what I want. I would like to first find the block starting with the $TEMPLATE tag and check whether it is the right block. There are many such blocks in a file that I read line by line. Then I need to get all tags marked with a double $: $$MODELNAME etc. I have tried many approaches, but none satisfied me. Does anyone have an idea how to solve this?


Solution

  • We are going to use Lua patterns (similar to regular expressions, but not the same) with string.gmatch, which returns an iterator for use in a for loop. For example, for match in string.gmatch(string, pattern) do print(match) end runs the loop body once for every occurrence of pattern in string. The pattern I will use is %$+%w+%s[^\n]+

    %$+ - one or more literal $ characters ($ is a magic character in patterns, so it needs the % to escape it; + means 1 or more). You could match just one ("%$") if you only need the data of the tag, but we want to know how many $ there are, so we keep the +.

    %w+ - match any alphanumeric character, as many as appear in a row.

    %s - match a single whitespace character (here, the space after the tag name)

    [^\n]+ - match anything that isn't '\n' (^ inverts the set), as many as appear in a row. The match ends at the next \n; the loop body then runs on that match, and the search resumes after it.
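
To see the pattern at work on its own, here is a minimal sketch that collects every line the pattern finds. The input is a shortened copy of the string from the question; the variable name lines is just for illustration.

```lua
-- Shortened sample input from the question.
local templates = "$TEMPLATE MacQuantum Wash Basic\n$$MANUFACTURER Martin\n$$MODELNAME Mac Quantum Wash\n"

-- Collect every "tag + value" chunk that the pattern finds.
local lines = {}
for match in string.gmatch(templates, "%$+%w+%s[^\n]+") do
  lines[#lines + 1] = match
end

-- Print each match with its index.
for i, line in ipairs(lines) do
  print(i, line)
end
```

Each entry still contains the dollar signs, the tag name and the value in one string, so it has to be split up in a second step.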

    That leaves us with strings like "$TEMPLATE templatename manufacturer". We want to extract the tag (TEMPLATE) into its own variable so we can check it, so we use string.match(string, pattern), which returns just the part of string that the pattern matches.
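
As an alternative way of splitting one matched line, string.match can also return several captures at once. This is only a sketch, and it assumes the line begins with the dollar signs and has a non-empty value after the tag name.

```lua
local line = "$$MODELNAME Mac Quantum Wash"

-- Three captures: the run of dollar signs, the tag name, and the rest of the line.
-- The leading ^ and trailing $ anchor the pattern to the whole line;
-- %$ is a literal dollar sign.
local dollars, tag, value = string.match(line, "^(%$+)(%w+)%s+(.+)$")

print(#dollars, tag, value)
```

The length of the first capture gives the number of dollar signs, so no separate counting step is needed with this variant.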

    EDIT: Here's a comprehensive example that should provide everything you're looking for.

    templates = "!$TEMPLATE templatename manufacturer model mode\n$TEMPLATE MacQuantum Wash Basic\n$$MANUFACTURER Martin\n$$MODELNAME Mac Quantum Wash\n$$MODENAME Basic\n"
    
    local data = {}
    for match in string.gmatch(templates, "%$+%w+%s[^\n]+") do --finds every occurrence of the pattern in the variable 'templates'
      --the loop body stores the pieces of each match in table t, which is then appended to data.
     local t = {}
     t.tag = string.match(match, '%w+')  --the tag name (the text between the $ signs and the space)
     t.info = string.gsub(match, '%$+%w+%s', "") --the value of the tag (the text after e.g. `$TEMPLATE `). Pattern: %$+ one or more dollar signs, %w+ one or more alphanumeric characters, %s a whitespace character. Replacing with "" erases that prefix.
     _, t.ds = string.gsub(match, '%$', "") --gsub returns two values: the modified string, which we don't need (hence the throwaway variable _), and the number of substitutions, i.e. the number of $ signs in the match.
     table.insert(data, t)
    end
    for _, tag in ipairs(data) do     --iterate over every entry table in data, in insertion order.
     for key, value in pairs(tag) do
      print("Key:", key, "Value:", value) --this will show you data examples (see output)
     end
     print("-------------")
    end
    
    print('--just print the stuff with two dollar signs')
    for _, entry in ipairs(data) do
     if entry.ds == 2 then --'entry' is one subtable of 'data'; check how many dollar signs it had.
      print(entry.tag)
     end
    end
    
    print("--just print the MODELNAME tag's value")
    for _, entry in ipairs(data) do
     if entry.tag == "MODELNAME" then --evaluate the tag name.
      print(entry.info)
     end
    end
    

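    Output (the order of the three keys within each entry may vary, because pairs does not guarantee an order):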

    Key:    info    Value:  templatename manufacturer model mode
    Key:    ds  Value:  1
    Key:    tag Value:  TEMPLATE
    -------------
    Key:    info    Value:  MacQuantum Wash Basic
    Key:    ds  Value:  1
    Key:    tag Value:  TEMPLATE
    -------------
    Key:    info    Value:  Martin
    Key:    ds  Value:  2
    Key:    tag Value:  MANUFACTURER
    -------------
    Key:    info    Value:  Mac Quantum Wash
    Key:    ds  Value:  2
    Key:    tag Value:  MODELNAME
    -------------
    Key:    info    Value:  Basic
    Key:    ds  Value:  2
    Key:    tag Value:  MODENAME
    -------------
    --just print the stuff with two dollar signs
    MANUFACTURER
    MODELNAME
    MODENAME
    --just print the MODELNAME tag's value
    Mac Quantum Wash
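
For the goal stated in the question (first locate the right $TEMPLATE block, then read its $$ tags), a minimal sketch might look like the following. find_template is a hypothetical helper, not part of any library, and the second block ("Other Block" / "Acme") is made-up sample data added only to show that collection stops at the next $TEMPLATE line. It assumes every tag starts at the beginning of a line and that the block is identified by the full text after $TEMPLATE.

```lua
-- Hypothetical helper: returns a table of the $$ tags belonging to the first
-- $TEMPLATE block whose value equals `wanted`, or nil if there is no such block.
local function find_template(text, wanted)
  local result = nil
  for line in string.gmatch(text, "[^\n]+") do
    -- Lines that do not start with $ (such as the "!$TEMPLATE ..." header)
    -- produce no match and are skipped.
    local dollars, tag, value = string.match(line, "^(%$+)(%w+)%s+(.+)$")
    if dollars == "$" and tag == "TEMPLATE" then
      if result then break end               -- the next block starts: stop collecting
      if value == wanted then result = {} end -- the right block: start collecting
    elseif dollars == "$$" and result then
      result[tag] = value
    end
  end
  return result
end

-- The string from the question plus one made-up extra block.
local templates = "!$TEMPLATE templatename manufacturer model mode\n"
  .. "$TEMPLATE MacQuantum Wash Basic\n"
  .. "$$MANUFACTURER Martin\n$$MODELNAME Mac Quantum Wash\n$$MODENAME Basic\n"
  .. "$TEMPLATE Other Block\n$$MANUFACTURER Acme\n"

local block = find_template(templates, "MacQuantum Wash Basic")
print(block.MANUFACTURER, block.MODELNAME, block.MODENAME)
```

When the file is read line by line instead, the body of the loop can be applied to each line as it arrives; only the iteration over the input changes.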