php regex preg-match

PHP preg_match returning character count of string instead of expected result


I am trying to search for variables being defined in a shell script.

<?php
$code = '
#!/bin/bash
foo = "Hello world!"
bar="123"
echo -e "The value of foo is $foo\n"
echo -e "The value of bar is $bar"
';
$var_pattern = "/(^[a-zA-Z0-9_]+[\= ]+([\"\']?)+(.)+([\"\']?))*$/";
preg_match($var_pattern, $code, $matches, PREG_OFFSET_CAPTURE);
print_r($matches);

There are two variables being defined in the above example (foo & bar). I have checked the regex using regex101.com.

The result I am getting is...

Array
(
    [0] => Array
        (
            [0] => 
            [1] => 121
        )

)

121 appears to be the number of characters in $code. The result I am expecting is something more like...

Array
(
    [0] => Array
        (
            [0] => 
            [1] => foo = "Hello world!"
        ),
    [1] => Array
        (
            [0] =>
            [1] => bar="123"
        )
)

Or similar! What am I doing wrong?


Solution

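  • Your original approach had a few problems:

    • It was missing the /m multiline flag, so ^ and $ anchored only at the start and end of the whole string, not at each line.
    • The outer (…)* capture group is entirely optional because of the trailing *, so the pattern can succeed as an empty match at the end of the input. That is the empty string and end-of-string offset you are seeing.
    • (.)+ repeats a one-character group, so it captures just the last character of the value rather than the whole value.
    • PREG_OFFSET_CAPTURE is redundant unless you actually want match positions.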
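
The first three points can be checked in isolation. The snippet below is only a minimal sketch with made-up sample strings and variable names ($text, $withoutM, $withM, $repeated, $whole, $empty); it is not the pattern from the question.

```php
<?php
// 1) Without the m flag, ^ only matches at the very start of the subject.
$text = "#!/bin/bash\nbar=\"123\"";
$withoutM = preg_match('/^bar/', $text);   // no match: "bar" is on line 2
$withM    = preg_match('/^bar/m', $text);  // match: ^ now anchors each line

// 2) A repeated one-character group keeps only its last iteration.
preg_match('/(.)+/', 'abc', $repeated);    // $repeated[1] is 'c'
preg_match('/(.+)/', 'abc', $whole);       // $whole[1] is 'abc'

// 3) A fully optional group lets the pattern match the empty string
//    at the end of the subject, which is what the question observed.
preg_match('/(x)*$/', 'abc', $empty, PREG_OFFSET_CAPTURE);

var_dump($withoutM, $withM, $repeated[1], $whole[1], $empty[0]);
```

The last call returns an empty match at offset 3, the length of 'abc', which mirrors the empty string and large offset in the question's output.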

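    Since this is more or less a classic ini-style format, you can simply use preg_match_all (plain preg_match stops after the first match), with $str holding the script text ($code in your example):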

                         key                  value        multiline
                          ↑                     ↑             ↑
     preg_match_all("/^ (\w+) \s*=\s* [\"\']? (.+?) [\"\']? $/mix", $str, $m);
                                 ↓       ↓             ↓
                               equal   quote         quote
    

    After that you can even reconstruct an associative key→value array from the $m matches array with array_combine($m[1], $m[2]).
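
Put together, the approach might look like the sketch below. It reuses the $code sample from the question and the pattern from above; the names $pattern and $vars are only illustrative, and it assumes the script uses plain \n line endings.

```php
<?php
// Sample input from the question. Single quotes keep $foo and \n literal.
$code = '
#!/bin/bash
foo = "Hello world!"
bar="123"
echo -e "The value of foo is $foo\n"
echo -e "The value of bar is $bar"
';

// m: ^ and $ anchor at every line
// i: case-insensitive (harmless here, kept to mirror the pattern above)
// x: whitespace inside the pattern is ignored, so it can be spaced out
$pattern = '/^ (\w+) \s*=\s* ["\']? (.+?) ["\']? $/mix';

// preg_match_all collects every assignment, not just the first one.
preg_match_all($pattern, $code, $m);

// $m[1] holds the variable names, $m[2] the values without quotes.
$vars = array_combine($m[1], $m[2]);

print_r($vars);
```

For the sample input this yields foo => Hello world! and bar => 123. The echo lines are skipped because nothing of the form name = value starts at the beginning of those lines.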