I am trying to create a word-set (class) instead of the char-set (class) in Lua.
For example:
local text = "hello world, hi world, hola world"
print(string.find(text, "[^hello] world"))
In this example, the pattern matches any part of the string that starts with a character other than h, e, l, or o, followed by a space and world. But I want a word-set that works the same way on whole words: it should find a part of the string that doesn't start with the word hello and is followed by a space and world.
What I've tried:
local text = "hello world, hi world, hola world"
print(string.find(text, "[^h][^e][^l][^l][^o] world"))
It didn't work for some reason.
I am trying to create a word-set (class) instead of the char-set (class) in Lua.
This is not possible in the general case. Lua patterns operate at the character level: quantifiers can only be applied to single characters or character sets (and some special pattern items), and there is no alternation, no "subexpressions", etc. Patterns don't have the expressive power required for this.
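To illustrate the character-level behaviour, here is a minimal sketch using the text from the question. The variable names s1, e1, s2, e2 are only for illustration. It shows that a set ignores order and repetition, and that | is an ordinary character rather than an alternation operator:

```lua
local text = "hello world, hi world, hola world"

-- A set ignores order and repetition: [^hello] is the same set as [^ehlo].
local s1, e1 = text:find("[^hello] world")
local s2, e2 = text:find("[^ehlo] world")
print(s1, e1) -- 15 21: the "i" of "hi", followed by " world"
print(s2, e2) -- 15 21

-- "|" has no special meaning in Lua patterns, so this looks for the
-- literal text "hello|hi" and finds nothing.
print(("hi world"):find("hello|hi")) -- nil
```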
local text = "hello world, hi world, hola world"
print(string.find(text, "[^h][^e][^l][^l][^o] world"))
What this pattern translates to is: "find world preceded by a space and, before that, 5 characters, where each character may not be the respective character of hello". This means none of the following will match:
- hi world: only three characters (including the space) precede world, but the pattern needs six
- hxxxx world: the first character is the same as the first character of hello
- ... hola world: the l from hola is at the same position as the second l from hello
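The cases above can be checked directly. This is a small sketch; the leading space in " hola world" stands in for the "..." (any preceding character), which is an assumption made for the demo. The last call shows what happens on the full text from the question: the five "non-hello" characters are allowed to start inside the previous world.

```lua
local pattern = "[^h][^e][^l][^l][^o] world"

print(("hi world"):find(pattern))    -- nil: too few characters before " world"
print(("hxxxx world"):find(pattern)) -- nil: the first character is "h"
print((" hola world"):find(pattern)) -- nil: the fourth character is "l"

local text = "hello world, hi world, hola world"
print(text:find(pattern))            -- 11 21: matches "d, hi world"
```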
To find world not preceded by hello, I would combine multiple calls to string.find to search through the string, always looking for a preceding hello:
-- str: Subject string to search
-- needle: String to search for
-- disallowed_prefix: String that may not immediately precede the needle
-- plain: Disable pattern matching
-- init: Start at a certain position
local function string_find_prefix(str, needle, disallowed_prefix, plain, init)
    init = init or 1
    while true do
        local needle_start, needle_end = str:find(needle, init, plain)
        if not needle_start then return end -- needle not found
        -- text before this occurrence of the needle
        local before = str:sub(1, needle_start - 1)
        local prefixed
        if plain then
            prefixed = #disallowed_prefix > 0 and before:sub(-#disallowed_prefix) == disallowed_prefix
        else
            -- anchor the prefix pattern at the end of the preceding text
            prefixed = before:find(disallowed_prefix .. "$") ~= nil
        end
        -- needle may not be prefixed with the disallowed prefix
        if not prefixed then return needle_start, needle_end end
        init = needle_start + 1 -- continue searching after this occurrence
    end
end
print(string_find_prefix("hello world, hi world, hola world", "world", "hello ")) -- 17 21: Inclusive indices of the `world` after `hi`
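The same idea can be extended towards the "word-set" from the question by checking several disallowed words per occurrence. The following is only a sketch under some assumptions: find_all_without_prefixes is a hypothetical helper name, the needle and prefixes are treated as plain strings (no pattern matching), and all matches are collected into a table.

```lua
-- Hypothetical helper: returns {start, end} pairs for every plain-text
-- `needle` that is not immediately preceded by one of `disallowed_prefixes`.
local function find_all_without_prefixes(str, needle, disallowed_prefixes)
    local matches = {}
    local init = 1
    while true do
        local needle_start, needle_end = str:find(needle, init, true)
        if not needle_start then break end
        local allowed = true
        for _, prefix in ipairs(disallowed_prefixes) do
            -- compare the text directly before the needle with the prefix
            local from = math.max(1, needle_start - #prefix)
            if #prefix > 0 and str:sub(from, needle_start - 1) == prefix then
                allowed = false
                break
            end
        end
        if allowed then
            matches[#matches + 1] = {needle_start, needle_end}
        end
        init = needle_start + 1 -- continue after this occurrence
    end
    return matches
end

local text = "hello world, hi world, hola world"
for _, m in ipairs(find_all_without_prefixes(text, "world", {"hello ", "hola "})) do
    print(m[1], m[2]) -- 17 21
end
```

Comparing substrings with str:sub keeps the check exact: only the characters directly in front of each occurrence are inspected, so an earlier hello elsewhere in the string cannot affect the result.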