
Lua - count multiple occurrences in a string


I need to count how many times any of several characters occurs in a string. The characters to check for are the following (GSM 03.38 extended chars):

€
[
\
]
^
{
|
}
~

E.g. given the string abc€|€]/{def, the number I need is 6.

With a single char I used:

local _, c = addr:gsub("€","")

So that c = 2

And it works perfectly. Could someone point me toward a way to count occurrences of several different characters in the string?


Solution

  • local _, c = addr:gsub("€","") gives c = 2, but if we put the character in a set, [€], it gives c = 6. This is because € is three bytes wide in UTF-8, and inside a set Lua sees it as three separate one-byte characters, each of which matches on its own. You can also see this by replacing € with \226\130\172 in both examples.
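
    The difference can be checked directly. This is a minimal sketch, assuming the source file is saved as UTF-8 and reusing the addr variable from the question; the names plain and class are only for illustration.

    ```lua
    local addr = "abc€|€]/{def"

    -- In UTF-8, "€" is the three bytes 226, 130, 172.
    print(#"€")               -- length in bytes: 3
    print(("€"):byte(1, -1))  -- the three byte values

    -- Plain pattern: the three bytes must appear in sequence,
    -- so each "€" is counted once.
    local _, plain = addr:gsub("€", "")

    -- Character set: each of the three bytes matches on its own,
    -- so each "€" is counted three times.
    local _, class = addr:gsub("[€]", "")

    print(plain, class)       -- plain is 2, class is 6
    ```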

    To get an accurate count of the characters you listed, you can run two separate gsub calls with two different patterns: one for the three-byte character and one for the single-byte characters:

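    local str = "abc€|€]/{def"
    -- Plain pattern: matches the full three-byte UTF-8 sequence of "€".
    local tripleByteCharPattern = "€"
    -- Set of single-byte characters; "[" and "]" are escaped with "%".
    -- "/" is listed because it appears in the sample string; to match a
    -- literal backslash instead, write "\\" inside the set.
    local singleByteCharPattern = "[%[%]/^{|}~]"
    -- gsub returns the number of matches as its second value.
    local count = select(2, str:gsub(singleByteCharPattern, ""))
    count = count + select(2, str:gsub(tripleByteCharPattern, ""))

    print(count)  -- 6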
    

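    One thing to note with this method: it relies on the string being UTF-8, where every byte of a multi-byte character is 0x80 or higher, so an ASCII character such as | can never appear INSIDE a multi-byte character. In other multi-byte encodings you could end up matching one of your single-byte characters inside a multi-byte character, and the ways around this usually require identifying where each multi-byte character starts.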
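
    As an alternative to two gsub passes, the string can be walked one UTF-8 character at a time and each character looked up in a table. The following is a minimal sketch rather than a definitive implementation: countChars, UTF8_CHAR and gsmExtended are hypothetical names, the input is assumed to be valid UTF-8, and the sample string uses a backslash (written "\\" in Lua source) where the question's sample has /, since \ is the character in the question's list.

    ```lua
    -- Matches one UTF-8 character: a lead byte (ASCII or 194-244)
    -- followed by any continuation bytes (128-191).
    local UTF8_CHAR = "[\1-\127\194-\244][\128-\191]*"

    -- Hypothetical helper: count the characters of `str` that are keys in `set`.
    local function countChars(str, set)
      local count = 0
      for ch in str:gmatch(UTF8_CHAR) do
        if set[ch] then
          count = count + 1
        end
      end
      return count
    end

    -- Build the lookup table from the characters listed in the question.
    local gsmExtended = {}
    for ch in ("€[\\]^{|}~"):gmatch(UTF8_CHAR) do
      gsmExtended[ch] = true
    end

    print(countChars("abc€|€]\\{def", gsmExtended))  -- 6
    ```

    Because each character is matched as a whole, adding more multi-byte characters only means adding them to the string that fills gsmExtended; no extra pattern is needed.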