I am running some regex performance tests (I heard rumors that Erlang is slow):
> Fun = fun F(X) -> case X > 1000000 of true -> ok; false -> Y = X + 1, re:run(<<"1ab1jgjggghjgjgjhhhhhhhhhhhhhjgdfgfdgdfgdfgdfgdfgdfgdfgdfgdfgfgv">>, "^[a-zA-Z0-9_]+$"), F(Y) end end.
#Fun<erl_eval.30.128620087>
> timer:tc(Fun, [0]).
{17233982,ok}
> timer:tc(Fun, [0]).
{17155982,ok}
And the same test after compiling the regex:
> {ok, MP} = re:compile("^[a-zA-Z0-9_]+$").
{ok,{re_pattern,0,0,0,
                <<69,82,67,80,107,0,0,0,16,0,0,0,1,0,0,0,255,255,255,
                  255,255,255,...>>}}
> Fun = fun F(X) -> case X > 1000000 of true -> ok; false -> Y = X + 1, re:run(<<"1ab1jgjggghjgjgjhhhhhhhhhhhhhjgdfgfdgdfgdfgdfgdfgdfgdfgdfgdfgfgv">>, MP), F(Y) end end.
#Fun<erl_eval.30.128620087>
> timer:tc(Fun, [0]).
{15796985,ok}
> timer:tc(Fun, [0]).
{15921984,ok}
http://erlang.org/doc/man/timer.html :
Unless otherwise stated, time is always measured in milliseconds.
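Note that timer:tc is one of the "otherwise stated" cases: its documentation says the elapsed time is returned in microseconds, so a result like {17233982,ok} is roughly 17.2 seconds, not 17233 seconds. A minimal shell sketch of the conversion, assuming a throwaway fun and illustrative variable names (Micros and Seconds are not from the transcripts above):

```erlang
%% timer:tc/2 returns {Time, Value}, where Time is in microseconds.
{Micros, ok} = timer:tc(fun(_) -> ok end, [0]),
%% Convert microseconds to seconds for easier reading.
Seconds = Micros / 1000000.
```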
http://erlang.org/doc/man/re.html#compile-1 :
Compiling the regular expression before matching is useful if the same expression is to be used in matching against multiple subjects during the lifetime of the program. Compiling once and executing many times is far more efficient than compiling each time one wants to match.
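As a rough illustration of "compile once, execute many times", the shell sketch below compiles the pattern a single time and reuses it for several subjects. The subjects and the variable names (Compiled, Subjects, Results) are made up for the example; the {capture, none} option only makes re:run return match or nomatch instead of match positions.

```erlang
%% Compile the pattern once...
{ok, Compiled} = re:compile("^[a-zA-Z0-9_]+$"),
%% ...then reuse the compiled pattern for every subject (made-up examples).
Subjects = [<<"abc_123">>, <<"has space">>, <<"OK">>],
Results = [re:run(S, Compiled, [{capture, none}]) || S <- Subjects].
%% Results is [match,nomatch,match]
```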
Questions
Yes, you should compile the code before trying to measure performance. When you type the code into the shell, the code will be interpreted, not compiled into byte code. I saw a big improvement when putting the code into a module:
7> timer:tc(Fun, [0]).
{6253194,ok}
8> timer:tc(fun foo:run/1, [0]).
{1768831,ok}
(Both of those use the precompiled regexp.)
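-module(foo).
-export([run/1]).

%% Compile the regex once, then start the loop.
run(X) ->
    {ok, MP} = re:compile("^[a-zA-Z0-9_]+$"),
    run(X, MP).

%% Stop once the counter passes one million.
run(X, _MP) when X > 1000000 ->
    ok;
run(X, MP) ->
    Y = X + 1,
    re:run(<<"1ab1jgjggghjgjgjhhhhhhhhhhhhhjgdfgfdgdfgdfgdfgdfgdfgdfgdfgdfgfgv">>, MP),
    %% Recurse into run/2 so the pattern is not recompiled on every iteration.
    run(Y, MP).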
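
To compare the uncompiled and precompiled variants under the same conditions (both running as compiled module code rather than in the shell), something along these lines might work. This is only a sketch: the module name re_bench, the function names, and the iteration-count argument are hypothetical, and the numbers it produces will depend on the machine and the OTP release.

```erlang
-module(re_bench).
-export([compare/1]).

-define(SUBJECT, <<"1ab1jgjggghjgjgjhhhhhhhhhhhhhjgdfgfdgdfgdfgdfgdfgdfgdfgdfgdfgfgv">>).
-define(REGEX, "^[a-zA-Z0-9_]+$").

%% Run both loops N times and return the elapsed times in microseconds.
compare(N) when is_integer(N), N >= 0 ->
    {ok, MP} = re:compile(?REGEX),
    %% timer:tc/1 times a zero-argument fun.
    {RawUs, ok} = timer:tc(fun() -> loop(N, ?REGEX) end),
    {CompiledUs, ok} = timer:tc(fun() -> loop(N, MP) end),
    [{uncompiled_us, RawUs}, {compiled_us, CompiledUs}].

%% re:run/2 accepts either a source pattern or a compiled one.
loop(0, _RE) ->
    ok;
loop(N, RE) ->
    re:run(?SUBJECT, RE),
    loop(N - 1, RE).
```

After saving this as re_bench.erl, it can be compiled and run from the shell with c(re_bench). followed by re_bench:compare(1000000).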