There is a string consisting of at most three parts: Writer, Director, and Producer. Let's call them "categories". Each category consists of two parts separated by a colon, Label : Names, where Label is one of the category names above and Names is a list of names separated by slashes. E.g.:
Writer : Jeffrey Schenck / Peter Sullivan / Director : Brian Trenchard-Smith / jack / Producer : smith
I want to split the string into the category names and their name lists with the preg_match function. Here is what I have so far:
$pattern = '/Writer : (?P<Writer>[\s\S]+?)Director : (?P<Director>[\s\S]+?)Producer : (?P<Producer>[\s\S]+)/';
$sentence = 'Writer : Jeffrey Schenck / Peter Sullivan / Director : Brian Trenchard-Smith / jack / Producer : smith';
preg_match($pattern, $sentence, $matches);
foreach ($matches as $cat => $match) {
    // Do more
    // echo "<b>" . $cat . "</b>" . $match . "<br />";
}
The script works well when all three categories are present in the string. It fails if any one of them is missing.
One way is to make each group optional with the ? quantifier:
$pattern = '/^' .
    '(?:Writer *: *(?P<Writer>[^:]+))?' .
    '(?:Director *: *(?P<Director>[^:]+))?' .
    '(?:Producer *: *(?P<Producer>[^:]+))?' .
    '$/';
preg_match($pattern, $sentence, $matches);
where (?:...) creates a non-capturing group. Note that the output array is indexed by both numeric positions and names, e.g.:
Array
(
[0] => Writer : Jeffrey Schenck / Peter Sullivan / Director : Brian Trenchard-Smith / jack / Producer : smith
[Writer] => Jeffrey Schenck / Peter Sullivan /
[1] => Jeffrey Schenck / Peter Sullivan /
[Director] => Brian Trenchard-Smith / jack /
[2] => Brian Trenchard-Smith / jack /
[Producer] => smith
[3] => smith
)
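As a rough sketch of how the optional-group pattern might be used when a category is missing: the helper below (parseCredits is a hypothetical name, not part of the original answer) keeps only the named keys, skips groups that did not take part in the match, and splits each list on slashes. It assumes that names never contain a colon.

```php
<?php
// Hypothetical helper for illustration; the name parseCredits is an assumption.
function parseCredits(string $sentence): array
{
    $pattern = '/^' .
        '(?:Writer *: *(?P<Writer>[^:]+))?' .
        '(?:Director *: *(?P<Director>[^:]+))?' .
        '(?:Producer *: *(?P<Producer>[^:]+))?' .
        '$/';

    if (!preg_match($pattern, $sentence, $matches)) {
        return [];
    }

    $result = [];
    foreach ($matches as $key => $value) {
        // Skip the numeric indexes and any group that matched nothing.
        if (is_int($key) || $value === '') {
            continue;
        }
        // Split on slashes, trim each name, and drop the empty trailing piece.
        $names = array_map('trim', explode('/', $value));
        $result[$key] = array_values(array_filter($names, 'strlen'));
    }
    return $result;
}

// Director is missing here; the other two categories are still extracted.
print_r(parseCredits('Writer : Jeffrey Schenck / Peter Sullivan / Producer : smith'));
```

A group that is skipped in the middle of the pattern shows up in $matches as an empty string, which is why the loop filters on '' rather than on isset().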
Another way is to use preg_match_all with extra processing:
$pattern = '/(?<=:)[^:]+/';
if (preg_match_all($pattern, $sentence, $matches)) {
    $keys = ['Writer', 'Director', 'Producer'];
    $a = [];
    // The isset() checks are skipped for clarity's sake
    for ($i = 0; $i < count($matches[0]); ++$i) {
        $a[$keys[$i]] = $matches[0][$i];
    }
    print_r($a);
}
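where (?<=:) is a positive lookbehind assertion for the : character. In this case, the resulting array is indexed by name only. Note, however, that each value still ends with the label of the following category, and that the keys are assigned by position, so the names line up only when the categories appear in the order Writer, Director, Producer with none missing: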
Array
(
[Writer] => Jeffrey Schenck / Peter Sullivan / Director
[Director] => Brian Trenchard-Smith / jack / Producer
[Producer] => smith
)
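One possible way to avoid relying on position is to capture the label together with its names, so that each key comes from the match itself. The following is only a sketch under assumptions: parseCreditsByLabel and its regex are hypothetical and not part of the original answer, and names are assumed not to contain a colon.

```php
<?php
// Hypothetical helper for illustration; name and pattern are assumptions.
function parseCreditsByLabel(string $sentence): array
{
    $labels = 'Writer|Director|Producer';
    // Group 1: the label. Group 2: every following character that is not a
    // colon and does not start another "Label :" sequence.
    $pattern = '/(' . $labels . ')\s*:\s*((?:(?!(?:' . $labels . ')\s*:)[^:])+)/';

    $result = [];
    if (preg_match_all($pattern, $sentence, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            // Split on slashes, trim each name, and drop empty pieces.
            $names = array_map('trim', explode('/', $m[2]));
            $result[$m[1]] = array_values(array_filter($names, 'strlen'));
        }
    }
    return $result;
}

// Writer is missing here; the keys still match the labels found in the string.
print_r(parseCreditsByLabel('Director : Brian Trenchard-Smith / jack / Producer : smith'));
```

Because the label is part of each match, the categories may be missing or appear in any order.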