PHP preg_match_all - extract content from pattern in different order


I'm cleaning up some WordPress shortcodes in my code and I'm looking for a solution that extracts the right values regardless of the order of the attributes.

Example:

[Links label="my_label" url="my_url" external="other_value"]

To extract my_label, my_url and other_value, I use the following pattern:

preg_match_all('/\[Links label=\"(.*?)\" url=\"(.*?)\" external=\"(.*?)\"\]/', $content, $output_array);

The problem is that I sometimes have a different order like this:

[Links url="my_url" external="other_value" label="my_label"]

My previous preg_match_all doesn't work with this. I have tried wrapping each attribute pattern in (...) and combining them with |, but I don't get the expected result. I have seen solutions here that detect whether certain strings are present, but I need to extract the values, not just detect them.

It's probably something trivial for a regex expert.

Thanks


Solution

  • If the number of attributes can also vary, they can appear in any order, and each shortcode should start with [Links, you can make use of the \G anchor. Each match then holds the attribute name in capture group 1 and its value in capture group 2.

    (?:\[Links|\G(?!^))(?=[^][]*])\h+([^\s=]+)="([^\s"]+)"
    

    Explanation

    • (?: Non capture group
      • \[Links Match [Links
      • | Or
      • \G(?!^) Assert the position at the end of the previous match, but not at the start of the string (where \G would also match)
    • ) Close non capture group
    • (?=[^][]*]) Positive lookahead, assert that a closing ] follows with no other [ or ] before it
    • \h+ Match 1+ horizontal whitespace chars
    • ( Capture group 1
      • [^\s=]+ Match 1+ times any char except = or a whitespace char
    • ) Close group 1
    • =" Match literally
    • ( Capture group 2
      • [^\s"]+ Match 1+ times any char except " or a whitespace char
    • )" Close group 2 and match "
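
One consequence of the breakdown above: because group 2 is [^\s"]+, a value that contains a space (for example label="My label") is not matched. The following is a minimal sketch of a variant, assuming quoted values may contain spaces but never an escaped quote. It swaps group 2 for [^"]* and is a modification, not the pattern from the answer itself. The input string and the variable names are made up for illustration.

```php
<?php
// Variant of the pattern: group 2 is [^"]* instead of [^\s"]+,
// so the value may contain spaces (assumption: no escaped quotes inside values).
$re = '/(?:\[Links|\G(?!^))(?=[^][]*])\h+([^\s=]+)="([^"]*)"/';

// Hypothetical input with a space inside one value.
$str = '[Links label="My label" url="my_url"]';

preg_match_all($re, $str, $matches, PREG_SET_ORDER);

// Build name => value pairs: column 2 holds the values, column 1 the names.
$attrs = array_column($matches, 2, 1);

print_r($attrs); // label => "My label", url => "my_url"
```

Note that [^"]* also accepts an empty value such as label="", which the original [^\s"]+ would reject.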

    Example

    $re = '/(?:\[Links|\G(?!^))(?=[^][]*])\h+([^\s=]+)="([^\s"]+)"/m';
    $str = '[Links label="my_label" url="my_url" external="other_value"]';
    
    preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
    print_r($matches);
    

    Output

    Array
    (
        [0] => Array
            (
                [0] => [Links label="my_label"
                [1] => label
                [2] => my_label
            )
    
        [1] => Array
            (
                [0] =>  url="my_url"
                [1] => url
                [2] => my_url
            )
    
        [2] => Array
            (
                [0] =>  external="other_value"
                [1] => external
                [2] => other_value
            )
    
    )
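
To get from these match sets to values that can be looked up by name regardless of attribute order, the pairs can be folded into one associative array per shortcode. This is a rough sketch rather than part of the original answer. The $content string and the $shortcodes variable are hypothetical, and it relies on the fact that the first match of each shortcode begins with the literal [Links.

```php
<?php
// Same pattern as in the example above (the /m flag is not needed here).
$re = '/(?:\[Links|\G(?!^))(?=[^][]*])\h+([^\s=]+)="([^\s"]+)"/';

// Hypothetical content: two shortcodes with their attributes in different orders.
$content = '[Links label="my_label" url="my_url" external="other_value"] '
         . '[Links url="my_url" external="other_value" label="my_label"]';

preg_match_all($re, $content, $matches, PREG_SET_ORDER);

$shortcodes = [];
foreach ($matches as $m) {
    // A full match that begins with "[Links" marks the start of a new shortcode.
    if (strncmp($m[0], '[Links', 6) === 0) {
        $shortcodes[] = [];
    }
    // Store the pair as name => value in the current shortcode.
    $shortcodes[count($shortcodes) - 1][$m[1]] = $m[2];
}

print_r($shortcodes);
```

Both entries of $shortcodes then contain the keys label, url and external, so $shortcodes[1]['label'] gives my_label even though label comes last in the second shortcode.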
    
