php regex preg-match

preg_match only returning the first match


I've written a regular expression that's supposed to parse the given content and return an array of matches. The preg_match call is as follows:

<?php preg_match("/\[tab *title=(\”|\"|\’|\'|).*(\”|\"|\’|\'|)\]/i", $content, $tabs); ?>

This matches any of the following (or variations) just fine, according to RegExr:

[tab title="example"]
[tab  title="example"]
[tab title='example']
[tab title=example]
[TAB TITLE="example"]
[tab title=”example”]
[tab title=’example’]

I can get my preg_match to return an array, but it only shows the first match:

Array
(
    [0] => [tab title=’Admission’]
    [1] => 
    [2] => 
)

I'm very new to regex, and this is my first time trying to do it on my own. I'm sure I'm missing something obvious. Why does this array only show the first match?

The example data I'm trying to parse is below:

[tab-group]

[tab title='Admission']

Tab Content Here

[/tab]

[tab title="Amenities"]

Tab Content Here

[/tab]

[tab title="Season Passes"]

Tab Content Here

[/tab]

[tab title="Hours"]

Tab Content Here

[/tab]

[/tab-group]

UPDATE: I just found preg_match_all and that appears to be matching correctly, except it's adding two additional arrays to the end for some reason:

Array
(
    [0] => Array
        (
            [0] => [tab title=’Admission’]
            [1] => [tab title=”Amenities”]
            [2] => [tab title=”Season Passes”]
            [3] => [tab title=”Hours”]
        )

    [1] => Array
        (
            [0] => 
            [1] => 
            [2] => 
            [3] => 
        )

    [2] => Array
        (
            [0] => 
            [1] => 
            [2] => 
            [3] => 
        )

)

From my sample data, is it clear why those arrays are getting added?


Solution

  • preg_match stops after the first match by design; preg_match_all is the function that returns every match. As for the extra arrays: parentheses in a regexp serve two purposes. They group parts of the expression, and they also cause the part of the string matched by each group to be "captured" and returned in the match data. Those extra arrays are the captured strings, one array per group. You can make a group non-capturing by putting ?: at the beginning:

    (?:\”|\"|\’|\'|)
    

    But your regexp doesn't need groups at all; you can use character sets in square brackets instead:

    preg_match_all("/\[tab *title=[”\"’']?.*[”\"’']?\]/i", $content, $tabs);
    

    Putting ? after the character set makes it optional, so it can also match the empty string, just like the empty last alternative in your group.

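    But there's not much point in having an optional quote on either side of .*. Because . also matches quote characters, the pattern matches exactly the same text as title=.* would. Note too that .* is greedy, so if two tabs appear on the same line, the match runs to the last ] on that line.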
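
If the goal is to pull out the titles themselves, one option is to keep a single capturing group around the title and leave the quotes uncaptured. The following is only a sketch under some assumptions: the sample $content is made up to resemble the data in the question, the file is saved as UTF-8, and the u modifier is added so the curly quotes are treated as whole characters rather than individual bytes. The lazy [^\]]*? is my substitution for .* so that a match can never run past the closing bracket.

```php
<?php
// Hypothetical sample input, modeled on the shortcode data in the question.
$content = <<<'TXT'
[tab-group]
[tab title='Admission']
Tab Content Here
[/tab]
[tab title="Season Passes"]
Tab Content Here
[/tab]
[tab title=Hours]
Tab Content Here
[/tab]
[/tab-group]
TXT;

// [”"’']?     optional opening/closing quote, not captured
// ([^\]]*?)   the only capturing group: the title, matched lazily and
//             never extending past the closing "]"
// i           case-insensitive, u  treat pattern and subject as UTF-8
$pattern = '/\[tab *title=[”"’\']?([^\]]*?)[”"’\']?\]/iu';

// preg_match stops at the first match.
preg_match($pattern, $content, $first);
// $first[1] is 'Admission'

// preg_match_all keeps going and returns the number of full matches.
$count = preg_match_all($pattern, $content, $matches);

// $matches[0] holds the full "[tab title=...]" strings,
// $matches[1] holds the captured titles. There is no $matches[2],
// because the pattern has only one capturing group.
$titles = $matches[1];
print_r($titles);
// $titles is ['Admission', 'Season Passes', 'Hours']
```

If you would rather have one array per tab instead of one array per group, preg_match_all also accepts PREG_SET_ORDER as its fourth argument.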