In Ruby, what's the easiest way to split a string in the following manner?
'abc+def'
should split to ['abc', '+', 'def']
'abc*def+eee'
should split to ['abc', '*', 'def', '+', 'eee']
'ab/cd*de+df'
should split to ['ab', '/', 'cd', '*', 'de', '+', 'df']
The idea is to split the string about these symbols: ['-', '+', '*', '/']
and also save those symbols in the result at appropriate locations.
Option 1
/\b/ is a word boundary. It is a zero-width match, so it does not consume any characters:
'abc+def'.split(/\b/)
# => ["abc", "+", "def"]
'abc*def+eee'.split(/\b/)
# => ["abc", "*", "def", "+", "eee"]
'ab/cd*de+df'.split(/\b/)
# => ["ab", "/", "cd", "*", "de", "+", "df"]
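A quick sketch of where /\b/ stops being enough, since it splits at every word boundary rather than only around the four operators. The inputs here are made up for illustration and are not from the question:

```ruby
# \b sits between a word character and a non-word character, so
# whitespace and runs of operators stay glued together in one piece.
spaced  = 'abc + def'.split(/\b/)
doubled = 'a+-b'.split(/\b/)

p spaced   # => ["abc", " + ", "def"]
p doubled  # => ["a", "+-", "b"]
```

If inputs like these are possible, a pattern that names the operators explicitly is probably the safer choice.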
Option 2
If your string contains other word boundary characters and you only want to split on -, +, *, and /, then you can use capture groups. If a capture group is used, String#split will also include the captured strings in the result. (Thanks for pointing this out @Jordan) (@Cary Swoveland sorry, I didn't see your answer when I made this edit)
'abc+def'.split /([+*\/-])/
# => ["abc", "+", "def"]
'abc*def+eee'.split /([+*\/-])/
# => ["abc", "*", "def", "+", "eee"]
'ab/cd*de+df'.split /([+*\/-])/
# => ["ab", "/", "cd", "*", "de", "+", "df"]
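If the separators already live in an array, the capturing pattern can be built from it rather than typed by hand. This is a minimal sketch; the variable names operators and pattern are my own, and Regexp.union takes care of escaping each symbol:

```ruby
operators = %w(- + * /)

# Wrapping the union in a capture group makes split keep the separators.
pattern = /(#{Regexp.union(operators)})/

with_group    = 'ab/cd*de+df'.split(pattern)
without_group = 'ab/cd*de+df'.split(Regexp.union(operators))
leading       = '-a+b'.split(pattern)

p with_group     # => ["ab", "/", "cd", "*", "de", "+", "df"]
p without_group  # => ["ab", "cd", "de", "df"]
p leading        # => ["", "-", "a", "+", "b"]
```

Note the empty string at the front when the input starts with a separator; drop it with reject(&:empty?) if it is not wanted.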
Option 3
Lastly, for those using a language that might not support string splitting with a capture group, you can use two lookarounds. Lookarounds are also zero-width matches, so they will not consume any characters:
'abc+def'.split /(?=[+*\/-])|(?<=[+*\/-])/
# => ["abc", "+", "def"]
'abc*def+eee'.split /(?=[+*\/-])|(?<=[+*\/-])/
# => ["abc", "*", "def", "+", "eee"]
'ab/cd*de+df'.split /(?=[+*\/-])|(?<=[+*\/-])/
# => ["ab", "/", "cd", "*", "de", "+", "df"]
The idea here is to split at every position that is immediately followed by one of your separators (the lookahead) or immediately preceded by one (the lookbehind). Let's do a little visual:
ab ⍿ / ⍿ cd ⍿ * ⍿ de ⍿ + ⍿ df
Each little ⍿ marks a position that is either preceded or followed by one of the separators, so this is where the string will get cut.
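To see why both lookarounds are needed, it may help to run each half on its own. This is only an illustrative sketch, and the variable names are mine:

```ruby
str = 'ab/cd*de+df'

# Lookahead only: cut just before each separator.
before_only = str.split(/(?=[+*\/-])/)

# Lookbehind only: cut just after each separator.
after_only = str.split(/(?<=[+*\/-])/)

# Both together: each separator ends up isolated.
both = str.split(/(?=[+*\/-])|(?<=[+*\/-])/)

p before_only  # => ["ab", "/cd", "*de", "+df"]
p after_only   # => ["ab/", "cd*", "de+", "df"]
p both         # => ["ab", "/", "cd", "*", "de", "+", "df"]
```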
Option 4
Maybe your language doesn't have a string split function or sensible ways to interact with regular expressions. It's nice to know you don't have to sit around guessing whether there are clever built-in procedures that magically solve your problems. There's almost always a way to solve your problem using basic instructions:
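class String
  # First character of the string.
  def head
    self[0]
  end

  # Everything after the first character.
  def tail
    self[1..-1]
  end

  # Left fold over the characters. Recursive: one call per character,
  # so a very long string can exhaust the stack.
  def reduce(acc, &f)
    if empty?
      acc
    else
      tail.reduce(yield(acc, head), &f)
    end
  end

  # Splits on any character in chars and keeps the separators.
  def separate(chars)
    # The accumulator is [finished pieces, piece being built].
    res, acc = reduce([[], '']) do |(res, acc), char|
      if chars.include? char
        [res + [acc, char], '']
      else
        [res, acc + char]
      end
    end
    res + [acc]
  end
end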
'abc+def'.separate %w(- + / *)
# => ["abc", "+", "def"]
'abc*def+eee'.separate %w(- + / *)
# => ["abc", "*", "def", "+", "eee"]
'ab/cd*de+df'.separate %w(- + / *)
# => ["ab", "/", "cd", "*", "de", "+", "df"]
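For comparison, the same accumulate-and-flush idea can be written as a plain loop without reopening String. This is a sketch of an alternative, not the code above; the method name separate_iteratively is hypothetical, and it uses String#each_char instead of recursion:

```ruby
# Hypothetical helper: walks the string once, collecting characters into
# the current piece and flushing it whenever a separator is seen.
def separate_iteratively(string, separators)
  parts = []
  current = ''.dup # unfrozen buffer for the piece being built

  string.each_char do |char|
    if separators.include?(char)
      parts << current << char
      current = ''.dup
    else
      current << char
    end
  end

  parts << current
end

result = separate_iteratively('ab/cd*de+df', %w(- + / *))
p result # => ["ab", "/", "cd", "*", "de", "+", "df"]
```

Because it loops rather than recursing, this version does not grow the call stack with the length of the input.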