php, regex, regex-lookarounds

regex not matching two delimiter chars


I have a long string that contains a URL, a title, and a description, like this:

||url:foo||title:Books|Pencils||description:my description||

The data always starts and ends with ||.

How can I match the title (Books|Pencils), which may contain a single | (as in the sample above) but never two consecutive ||?

I tried patterns like this one (using PHP's preg_match):

#\|\|title:(.*)([\|\|])(.*)#


Solution

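  • A pair of lookarounds with a lazy quantifier,

    (?<=\|\|title:).*?(?=\|\|)
    

    should do that: the match starts right after ||title: and stops at the first ||, so a single | inside the title is kept.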

    RegEx Demo 1
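
    As a minimal sketch of how the pattern above could be used with preg_match, assuming the sample string from the question (the variable names $str, $re and $title are illustrative, not from the original post). For comparison, the [\|\|] in the attempted pattern is a character class that matches one |, and the greedy .* before it runs as far as it can, which is why that attempt does not stop at the first ||.

    ```php
    <?php
    // Sample input taken from the question.
    $str = '||url:foo||title:Books|Pencils||description:my description||';

    // The lookbehind anchors the match right after "||title:"; the lazy .*?
    // then stops at the first "||", so a single "|" inside the title is kept.
    $re = '/(?<=\|\|title:).*?(?=\|\|)/';

    if (preg_match($re, $str, $m)) {
        $title = $m[0];          // whole match; the lookarounds are not part of it
        echo $title, PHP_EOL;    // Books|Pencils
    } else {
        $title = null;
        echo 'no title found', PHP_EOL;
    }
    ```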

    If you also want the other two fields,

    (?<=\|\|\burl:|\btitle:|\bdescription:).*?(?=\|\|)
    

    RegEx Demo 2

    Test

    $re = '/(?<=\|\|\burl:|\btitle:|\bdescription:).*?(?=\|\|)/m';
    $str = '||url:foo||title:Books|Pencils||description:my description||';
    
    preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
    
    var_dump($matches);
    

    Output

    array(3) {
      [0]=>
      array(1) {
        [0]=>
        string(3) "foo"
      }
      [1]=>
      array(1) {
        [0]=>
        string(13) "Books|Pencils"
      }
      [2]=>
      array(1) {
        [0]=>
        string(14) "my description"
      }
    }
    

    If you wish to simplify, update, or explore the expression, it is explained in the top-right panel of regex101.com. If you are interested, you can also watch or modify the matching steps in the regex101 debugger, which shows step by step how a regex engine consumes a sample input string and performs the match.
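
    One detail of the combined pattern above: the \|\| prefix belongs only to the url: branch of the alternation, so title: and description: are accepted after any word boundary. A possibly stricter variant, sketched here on the assumption that every field must directly follow ||, requires the delimiter for all three keys and collects the results into an associative array. The group names key and value and the variable $fields are hypothetical names chosen for illustration.

    ```php
    <?php
    // Sample input taken from the question.
    $str = '||url:foo||title:Books|Pencils||description:my description||';

    // Every field must start right after "||". The key is captured by name and
    // the value runs lazily up to, but not including, the next "||".
    $re = '/(?<=\|\|)(?<key>url|title|description):(?<value>.*?)(?=\|\|)/';

    preg_match_all($re, $str, $matches, PREG_SET_ORDER);

    // Build a key => value map from the named groups of each match.
    $fields = [];
    foreach ($matches as $m) {
        $fields[$m['key']] = $m['value'];
    }

    print_r($fields);
    ```

    With the sample string, $fields should end up with the keys url, title and description, so the title is available as $fields['title'].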