Currently, I am using this function:
function tokenize( str )
    local ret = {}
    string.gsub( str, "([-%w%p()%[%]®+]+)", function( s ) table.insert( ret, s ) end )
    return ret
end
Now, the string can contain any character (as the pattern above suggests). I want to split it into words on whitespace only, treating every other character as part of a word. I have seen the solution mentioned here, but it does not work for me, even on codepad.org (link). I am working in PtokaX, in case you are wondering. I have tried using
print( split( 'foo/bar/baz/test','/' ) )
too, but that doesn't work either. :(
Is there an easier way to create the table?
Why not match runs of non-space characters directly, instead of listing every other character you want to allow?
function tokenize( str )
    local ret = {}
    string.gsub( str, "(%S+)", function( s ) table.insert( ret, s ) end )
    return ret
end
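The same idea can also be written with string.gmatch, which avoids calling gsub only for its side effects. This is a minimal sketch that assumes Lua 5.1 or later (where string.gmatch and the # operator exist); the name tokenize_ws is just an illustrative choice.

```lua
-- Sketch: split a string on whitespace only, keeping every other character.
-- Assumes Lua 5.1+; tokenize_ws is a hypothetical name for illustration.
local function tokenize_ws( str )
    local ret = {}
    -- %S+ matches each maximal run of non-whitespace characters
    for word in string.gmatch( str, "%S+" ) do
        ret[#ret + 1] = word
    end
    return ret
end

local words = tokenize_ws( "foo-bar  [baz]\t(qux)+1" )
print( table.concat( words, "|" ) )  --> foo-bar|[baz]|(qux)+1
```

Note that calling print directly on the returned table shows only something like "table: 0x...", not its contents, so joining with table.concat (or looping with ipairs) is a simple way to check the result.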
If you want to split on other characters, a negated character set ([^...]) is also useful:
s='foo#bar!baz*'
s:gsub('([^#!%*]+)',function(s) print(s) end)
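The negated-set approach can be wrapped in a small helper that takes the delimiter characters as an argument, which covers the 'foo/bar/baz/test' case from the question. This is only a sketch: split_on is a hypothetical name, it assumes Lua 5.1 or later and a non-empty delimiter string, and it treats every character of that string as a separate delimiter.

```lua
-- Sketch: split str on any single character listed in delims.
-- split_on is a hypothetical helper; delims is assumed to be non-empty.
local function split_on( str, delims )
    -- Prefix each non-alphanumeric character with % so that it is taken
    -- literally inside the character set (e.g. "*" becomes "%*").
    local set = delims:gsub( "%W", "%%%0" )
    local parts = {}
    for piece in str:gmatch( "[^" .. set .. "]+" ) do
        parts[#parts + 1] = piece
    end
    return parts
end

print( table.concat( split_on( "foo/bar/baz/test", "/" ), "," ) )  --> foo,bar,baz,test
print( table.concat( split_on( "foo#bar!baz*", "#!*" ), "," ) )    --> foo,bar,baz
```

Because the pattern matches runs of non-delimiter characters, consecutive delimiters are collapsed and empty fields are dropped; a different approach would be needed if empty fields have to be preserved.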
See also: Patterns in the Lua Manual. Keep in mind that Lua patterns are not the same as regular expressions: they are more lightweight, but they have their limitations.