This regex should match lists just like in Markdown:
/((?:(?:(?:^[\+\*\-] )(?:[^\r\n]+))(?:\r|\n?))+)/m
It works in JavaScript (with the g flag added), but I have problems porting it to PHP: it does not behave greedily. Here's my example code:
$string = preg_replace_callback('`((?:(?:(?:^\* )(?:[^\r\n]+))(?:\r|\n?))+)`m', array(&$this, 'bullet_list'), $string);
function bullet_list($matches) { var_dump($matches); }
When I feed it a list of three lines, it displays this:
array(2) { [0]=> string(6) "* one " [1]=> string(6) "* one " }
array(2) { [0]=> string(6) "* two " [1]=> string(6) "* two " }
array(2) { [0]=> string(8) "* three " [1]=> string(8) "* three " }
Apparently var_dump is being called three times instead of just once, as I would expect, since the regex is greedy and should match as many lines as possible. I have tested it on regex101.com.
How do I make it work properly?
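This regex won't work correctly if you have \r\n newlines in your input text. The part (?:\r|\n?) matches either an \r or an \n, but not both. Once the \r has been consumed, the next character is the \n, and ^ cannot match at that position, so the outer repetition stops after a single line. That also explains the lengths in your dump: "* one" is five characters, and string(6) includes the trailing \r. (regex101 treats newlines as \n only, so it works there.)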
Does the following work?
/(?:(?:(?:^[+*-] )(?:[^\r\n]+))[\r\n]*)+/m
(or, after removal of all the unnecessary non-capturing groups - thanks @M42!)
/(?:^[+*-] [^\r\n]+[\r\n]*)+/m
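To check this yourself, here is a minimal sketch that runs both patterns on the same text. It assumes your input uses \r\n line endings (the question doesn't say so explicitly, but the string lengths in the dump suggest it). The variable names and the pass-through callback are only for illustration, not your actual bullet_list method.

```php
<?php
// Sample input with Windows-style \r\n line endings (an assumption about the real data).
$input = "* one\r\n* two\r\n* three\r\n";

// Pattern from the question: (?:\r|\n?) consumes only the \r of each \r\n pair.
$original = '`((?:(?:(?:^\* )(?:[^\r\n]+))(?:\r|\n?))+)`m';

// Simplified pattern from this answer: [\r\n]* consumes the whole line break.
$fixed = '/(?:^[+*-] [^\r\n]+[\r\n]*)+/m';

// Count how many separate matches each pattern produces.
$originalCount = preg_match_all($original, $input, $originalMatches);
$fixedCount = preg_match_all($fixed, $input, $fixedMatches);

// Count how often the callback fires with the fixed pattern.
$calls = 0;
$result = preg_replace_callback($fixed, function (array $matches) use (&$calls) {
    $calls++;
    return $matches[0]; // hypothetical callback: returns the list unchanged
}, $input);

var_dump($originalCount); // one match per line with the original pattern
var_dump($fixedCount);    // the whole list as a single match
var_dump($calls);
```

Note that the simplified pattern has no capturing group, so the callback only receives $matches[0]. Also, [\r\n]* swallows blank lines too, so two lists separated only by empty lines would be matched as one.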