Tags: php, regex, wordpress, text-parsing, wordpress-shortcode

Parse and replace WordPress shortcode containing additional attributes


I need to search WordPress posts for those that have a gallery in their content whose photos were inserted from the media library rather than attached to the post. (That is the WordPress part; if you don't follow it, that's fine, it only explains what I'm doing.)

Anyway, the $subject looks like this:

"some text or no text at all [gallery order="DESC" orderby="title" link="file" include="1089,1090,1091,1099,1105,1109,1113"]some text or no text at all"

What I need is the list of numbers from the include attribute. Note that the other attributes (such as order and link) are optional.

I tried it myself but can't make it work. This is what I have so far:

$pattern = '/^.\[gallery.include="(?)".\]$/';
$subject = '[gallery order="DESC" orderby="title" link="file" include="1089,1090,1091,1099,1105,1109,1113"]';
var_dump(preg_match($pattern, $subject, $matches));
print_r($matches);

Prints:

int(0) Array ( )

Solution

  • Try with:

    $pattern = '/^\[gallery.*?\sinclude="([^"]*)".*\]$/';
    

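    A single . in a regex matches exactly one character (any character except a newline, by default). .* matches any number of characters greedily, i.e. it consumes as much as it can and only gives characters back if the rest of the pattern would otherwise fail. .*? does the same thing non-greedily, i.e. it consumes as little as possible and stops as soon as the rest of the pattern can match.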

    The capture group is ([^"]*). [^"] is a negated character class meaning "any character except a double quote", and the * repeats it zero or more times.

    You also need to escape [ because an unescaped [ opens a character class. Escaping ] outside a class is not strictly required in PCRE, but it is harmless and keeps the pair readable.

    If you want to allow text before or after the shortcode, remove the anchors (the leading ^ and the trailing $).
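
Putting the pieces together, a minimal sketch of extracting the IDs might look like the following. The function name gallery_include_ids is hypothetical, and the pattern is a variation on the one above rather than the exact pattern from the answer: it drops the anchors so surrounding text is allowed, and it uses [^\]]* instead of .* so the match cannot run past the shortcode's closing bracket. It assumes the include attribute is written with double quotes, and it only looks at the first gallery shortcode in the content.

```php
<?php
// Hypothetical helper (not part of the original answer): returns the IDs from
// the include attribute of the first [gallery] shortcode, or an empty array.
function gallery_include_ids(string $content): array
{
    // No ^ / $ anchors, so text may appear before and after the shortcode.
    // [^\]]*? skips other attributes lazily without leaving the brackets.
    $pattern = '/\[gallery\b[^\]]*?\sinclude="([^"]*)"[^\]]*\]/';

    if (preg_match($pattern, $content, $matches) !== 1) {
        return [];
    }

    // Split on commas, trim whitespace, and drop empty pieces.
    $parts = array_filter(array_map('trim', explode(',', $matches[1])), 'strlen');

    // Reindex and convert the remaining strings to integers.
    return array_map('intval', array_values($parts));
}

$subject = 'some text [gallery order="DESC" orderby="title" link="file" '
         . 'include="1089,1090,1091,1099,1105,1109,1113"] more text';

print_r(gallery_include_ids($subject));
```

For the sample subject this yields the seven IDs from 1089 to 1113 as integers. When the content has no gallery shortcode, or the shortcode has no include attribute, the function returns an empty array.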