I need to count how many times certain characters occur in a string, checking for several characters at once: the following ones (GSM 03.38 extended chars):
€
[
\
]
^
{
|
}
~
E.g. given string: abc€|€]/{def
The number I need is 6
With a single char I used:
local _, c = addr:gsub("€","")
So that c = 2
And it works perfectly. Could someone point me toward how to count occurrences of multiple characters in the string?
local _, c = addr:gsub("€","")
That gives c = 2, but if we put the character in a set:
[€]
it gives c = 6. This is because € is three bytes wide in UTF-8 and Lua patterns work on bytes, so Lua sees the set as three separate one-byte characters and counts every byte that matches any of them. You can also see this by changing € to \226\130\172 in both examples.
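The byte-level behaviour described above can be checked with a small sketch. It assumes the string is UTF-8 encoded and writes € as decimal escapes so the result does not depend on how the source file is saved:

```lua
-- "€" spelled out as its three UTF-8 bytes
local euro = "\226\130\172"
local s = "abc" .. euro .. "|" .. euro .. "]/{def"

print(#euro)  -- 3: the length operator counts bytes, not characters

-- Plain pattern: matches the whole 3-byte sequence, once per "€"
local _, whole = s:gsub(euro, "")
-- Set pattern: matches any single byte out of the three, so each "€" counts 3 times
local _, bytes = s:gsub("[" .. euro .. "]", "")

print(whole, bytes)  -- 2 and 6
```

The 6 from the set is therefore 2 characters times 3 bytes, and only coincidentally equal to the count the question asks for.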
To get an accurate count of the characters as you described, you can do 2 separate gsub calls with 2 different patterns, one for the 3-byte character and one for the single-byte characters:
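local str = "abc€|€]/{def"
local tripleByteCharPattern = "€"               -- matched as its whole 3-byte sequence
local singleByteCharPattern = "[%[%]/\\^{|}~]"  -- single-byte characters: [ ] / \ ^ { | } ~
-- gsub returns the new string and the number of substitutions; select(2, ...) keeps the count
local count = select(2, str:gsub(singleByteCharPattern, ""))
count = count + select(2, str:gsub(tripleByteCharPattern, ""))
print(count) -- 6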
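One thing to note with this method: if the string could contain other multi-byte characters, a byte-wise pattern can end up identifying one of your single-byte chars INSIDE a multi-byte character. In UTF-8 this cannot happen for ASCII characters, because every byte of a multi-byte sequence is 128 or above, but it can in encodings such as Shift-JIS, where a trail byte may equal \ (0x5C). The ways around this usually require identifying the start of each multi-byte character.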
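Identifying character starts can be sketched as follows for UTF-8 input. This is only an illustration, not part of the method above: countChars and UTF8_CHAR are hypothetical names, the input is assumed to be valid UTF-8 without NUL bytes, and the pattern is written with decimal escapes so it also runs on Lua 5.1 (Lua 5.3 and later ship a similar pattern as utf8.charpattern).

```lua
-- One UTF-8 character: a lead byte followed by any continuation bytes.
-- Assumption: valid UTF-8 with no NUL bytes; stray continuation bytes are skipped.
local UTF8_CHAR = "[\1-\127\194-\244][\128-\191]*"

-- Hypothetical helper: count the characters of `str` that also appear in `wanted`.
local function countChars(str, wanted)
  -- Build a lookup table keyed by whole characters, not by bytes
  local set = {}
  for ch in wanted:gmatch(UTF8_CHAR) do
    set[ch] = true
  end
  local count = 0
  for ch in str:gmatch(UTF8_CHAR) do
    if set[ch] then
      count = count + 1
    end
  end
  return count
end

local euro = "\226\130\172"  -- "€" as UTF-8 bytes
local s = "abc" .. euro .. "|" .. euro .. "]/{def"
print(countChars(s, euro .. "[\\]^{|}~"))   -- 5: "/" is not in the question's list
print(countChars(s, euro .. "[\\]^{|}~/"))  -- 6 once "/" is added
```

Because whole characters are compared as table keys, the wanted list needs no pattern escaping and can hold any number of multi-byte characters, at the cost of one pass over the string in Lua rather than inside gsub.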