regex lua

Regex: split a string on two consecutive pipes (||)


I want to split the string below on two consecutive pipes (||) using a regex.

Input String

value1=data1||value2=da|ta2||value3=test&user01|

Expected Output

value1=data1
value2=da|ta2
value3=test&user01|

I tried ([^||]+), but it also splits on a single pipe |.

Try out my example - Regex

value2 contains a single pipe, which should not be treated as a delimiter.

I am using a Lua script like this:

for pair in string.gmatch(params, "([^||]+)") do 
 print(pair) 
end

Solution

  • You can explicitly find the position of each || and take the substrings between them.

    $ cat foo.lua
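    local s = 'value1=data1||value2=da|ta2||value3=test&user01|'
    
    -- '()||' matches a literal "||"; the empty capture () yields its start index.
    local offset = 1
    for idx in string.gmatch(s, '()||') do
        print(string.sub(s, offset, idx - 1))
        offset = idx + 2  -- skip past the two pipe characters
    end
    -- Deal with the part after the right-most `||`.
    -- The bound must be #s + 1 (not #s); otherwise a string ending in "||",
    -- such as "a=b||", would lose its final empty field.
    if offset <= #s + 1 then
        print(string.sub(s, offset))
    end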
    $ lua foo.lua
    value1=data1
    value2=da|ta2
    value3=test&user01|
    

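    Regarding the ()|| pattern, see the Patterns section of the Lua reference manual. Lua has no regex support; it uses its own pattern syntax, in which | is an ordinary character and there is no alternation operator.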
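
As a minimal sketch of why the original ([^||]+) attempt splits on single pipes: inside a set, listing | twice is the same as listing it once, so the pattern simply means "one or more non-pipe characters". The variable names below are illustrative only, and the snippet assumes Lua 5.1 or later.

```lua
local s = "value1=data1||value2=da|ta2||value3=test&user01|"

-- "[^||]+" is a negated set containing "|" (twice), i.e. the same as "[^|]+".
-- Every single pipe therefore ends a match.
local pieces = {}
for piece in string.gmatch(s, "[^||]+") do
    pieces[#pieces + 1] = piece
end
-- pieces now holds: value1=data1, value2=da, ta2, value3=test&user01

-- Outside a set, "|" has no special meaning in Lua patterns, so "||"
-- matches two literal pipes. A plain find (4th argument true) agrees.
local first_pattern = string.find(s, "||")
local first_plain = string.find(s, "||", 1, true)

print(#pieces, first_pattern, first_plain)
```

Because the set matches character by character, it cannot express "two pipes in a row"; the delimiter has to be located as a sequence, which is what the ()|| approach does.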

    • Captures:

      A pattern can contain sub-patterns enclosed in parentheses; they describe captures. When a match succeeds, the substrings of the subject string that match captures are stored (captured) for future use. Captures are numbered according to their left parentheses. For instance, in the pattern "(a*(.)%w(%s*))", the part of the string matching "a*(.)%w(%s*)" is stored as the first capture, and therefore has number 1; the character matching "." is captured with number 2, and the part matching "%s*" has number 3.

      As a special case, the capture () captures the current string position (a number). For instance, if we apply the pattern "()aa()" on the string "flaaap", there will be two captures: 3 and 5.
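
The position capture described above can be tried directly, and it also allows the split loop to be wrapped in a reusable function. The following is a sketch only: split_on_double_pipe is a hypothetical helper name, not something from the Lua standard library, and it uses a second () after the || so the end position comes from the match rather than from adding 2 by hand. It assumes Lua 5.1 or later.

```lua
-- The manual's example: "aa" first matches at positions 3-4 of "flaaap",
-- so the captures are the position before it and the position after it.
local before, after = string.match("flaaap", "()aa()")

-- Hypothetical helper: split s on each literal "||" and return the fields.
local function split_on_double_pipe(s)
    local parts = {}
    local offset = 1
    -- from = index of the first "|", to = index just past the second "|"
    for from, to in string.gmatch(s, "()||()") do
        parts[#parts + 1] = string.sub(s, offset, from - 1)
        offset = to
    end
    -- Whatever follows the last "||" (possibly an empty string).
    parts[#parts + 1] = string.sub(s, offset)
    return parts
end

local fields = split_on_double_pipe("value1=data1||value2=da|ta2||value3=test&user01|")
for _, field in ipairs(fields) do
    print(field)
end
```

With this shape, a string ending in || such as "a=b||" yields a trailing empty field, matching the behaviour of the loop in the answer.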