I have this XMP string $xml:
<x:xmpmeta xmlns:x="adobe:ns:meta/">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about="uuid:faf5bdd5-ba3d-11da-ad31-d33d75182f1b" xmlns:xmp="http://ns.adobe.com/xap/1.0/">
<xmp:CreatorTool>Microsoft Photo Gallery 16.4.3528.331</xmp:CreatorTool>
<xmp:Rating>2</xmp:Rating>
</rdf:Description>
<rdf:Description rdf:about="uuid:faf5bdd5-ba3d-11da-ad31-d33d75182f1b" xmlns:MP="http://ns.microsoft.com/photo/1.2/">
<MP:RegionInfo>
<rdf:Description xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<MPRI:Regions xmlns:MPRI="http://ns.microsoft.com/photo/1.2/t/RegionInfo#">
<rdf:Bag xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:li>
<rdf:Description xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<MPReg:Rectangle xmlns:MPReg="http://ns.microsoft.com/photo/1.2/t/Region#">0.144259, 0.358824, 0.065751, 0.098529</MPReg:Rectangle>
</rdf:Description>
</rdf:li>
<rdf:li>
<rdf:Description xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<MPReg:Rectangle xmlns:MPReg="http://ns.microsoft.com/photo/1.2/t/Region#">0.211973, 0.294118, 0.023553, 0.035294</MPReg:Rectangle>
</rdf:Description>
</rdf:li>
<rdf:li>
<rdf:Description xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<MPReg:Rectangle xmlns:MPReg="http://ns.microsoft.com/photo/1.2/t/Region#">0.350343, 0.423529, 0.056919, 0.085294</MPReg:Rectangle>
<MPReg:PersonDisplayName xmlns:MPReg="http://ns.microsoft.com/photo/1.2/t/Region#">xc</MPReg:PersonDisplayName>
</rdf:Description>
</rdf:li>
<rdf:li>
<rdf:Description xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<MPReg:Rectangle xmlns:MPReg="http://ns.microsoft.com/photo/1.2/t/Region#">0.352306, 0.300000, 0.023553, 0.035294</MPReg:Rectangle>
</rdf:Description>
</rdf:li>
<rdf:li>
<rdf:Description xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<MPReg:Rectangle xmlns:MPReg="http://ns.microsoft.com/photo/1.2/t/Region#">0.395486, 0.304412, 0.047105, 0.070588</MPReg:Rectangle>
</rdf:Description>
</rdf:li>
<rdf:li>
<rdf:Description xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<MPReg:Rectangle xmlns:MPReg="http://ns.microsoft.com/photo/1.2/t/Region#">0.823356, 0.560294, 0.095191, 0.142647</MPReg:Rectangle>
</rdf:Description>
</rdf:li>
</rdf:Bag>
</MPRI:Regions>
</rdf:Description>
</MP:RegionInfo>
</rdf:Description>
<rdf:Description xmlns:MicrosoftPhoto="http://ns.microsoft.com/photo/1.0/">
<MicrosoftPhoto:Rating>25</MicrosoftPhoto:Rating>
</rdf:Description>
<rdf:Description xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:title>
<rdf:Alt xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:li xml:lang="x-default">edgf</rdf:li>
</rdf:Alt>
</dc:title>
</rdf:Description>
<rdf:Description xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:description>
<rdf:Alt xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:li xml:lang="x-default">edgf</rdf:li>
</rdf:Alt>
</dc:description>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
I am interested in two tags: MPReg:Rectangle and MPReg:PersonDisplayName. I want to read the Rectangle value only if there is a PersonDisplayName in the same rdf:Description element.
I tried converting the XMP to an array using this code:
function get_xmp_array( &$xmp_raw ) {
$xmp_arr = array();
foreach ( array(
'RectangleCoords' => '<MPReg:Rectangle[^>]+?xmlns:MPReg="([^"]*)"',
'attempt2' => '<MPReg:Rectangle>\s*(.*?)\s*<\/MPReg:Rectangle>'
) as $key => $regex ) {
// get a single text string
$xmp_arr[$key] = preg_match( "/$regex/is", $xmp_raw, $match ) ? $match[1] : '';
// if string contains a list, then re-assign the variable as an array with the list elements
$xmp_arr[$key] = preg_match_all( "/<rdf:li[^>]*>([^>]*)<\/rdf:li>/is", $xmp_arr[$key], $match ) ? $match[1] : $xmp_arr[$key];
// hierarchical keywords need to be split into a third dimension
if ( ! empty( $xmp_arr[$key] ) && $key == 'Hierarchical Keywords' ) {
foreach ( $xmp_arr[$key] as $li => $val ) $xmp_arr[$key][$li] = explode( '|', $val );
unset ( $li, $val );
}
}
return $xmp_arr;
}
But it did not work; it returned this:
'RectangleCoords' => string 'http://ns.microsoft.com/photo/1.2/t/Region#'
'attempt2' => string ''
I tried multiple functions like:
function getTextBetweenTags($string, $tagname) {
$pattern = "/<$tagname ?.*>(.*)<\/$tagname>/";
preg_match($pattern, $string, $matches);
return $matches[1];
}
This function only returned the first match; I don't know how to get all the matches.
I also tried this:
$doc = new DOMDocument();
$doc->loadXML($xml);
$result = $doc->getElementsByTagName('MPReg:Rectangle');
var_dump( $result );
But it returned nothing:
object(DOMNodeList)[3]
I would really appreciate your help on this one.
Thank you
Do not use regular expressions to parse XML. Use an XML parser (DOM) and Xpath. Xpath is an expression language for selecting nodes in a DOM.
First create a DOM document, load the XML and create an Xpath instance for the document.
$document = new DOMDocument();
$document->loadXml($xml);
$xpath = new DOMXpath($document);
The XML uses namespaces, so you now have to register prefixes (aliases) for them. The aliases in the XML are only valid inside that document. At parse time the DOM resolves the namespaces, so you can read the root node as {adobe:ns:meta/}:xmpmeta.
$xpath->registerNamespace('x', 'adobe:ns:meta/');
$xpath->registerNamespace('xmp', 'http://ns.adobe.com/xap/1.0/');
$xpath->registerNamespace('rdf', 'http://www.w3.org/1999/02/22-rdf-syntax-ns#');
$xpath->registerNamespace('MP', 'http://ns.microsoft.com/photo/1.2/');
$xpath->registerNamespace('MPRI', 'http://ns.microsoft.com/photo/1.2/t/RegionInfo#');
$xpath->registerNamespace('MPReg', 'http://ns.microsoft.com/photo/1.2/t/Region#');
This allows the Xpath instance to resolve namespaces. The expression /x:xmpmeta can be resolved to /{adobe:ns:meta/}:xmpmeta and match the root node, even if the namespace prefix/alias was different.
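To illustrate that only the namespace URI matters, here is a minimal sketch that registers a different alias than the one the document uses. The alias meta and the one-element sample document are made up for this example.

```php
<?php
// Minimal sample document; the prefix "x" is only the document's own alias.
$xml = '<x:xmpmeta xmlns:x="adobe:ns:meta/"/>';

$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXpath($document);

// "meta" is an arbitrary alias chosen for this sketch; only the URI has to match.
$xpath->registerNamespace('meta', 'adobe:ns:meta/');

$root = $xpath->evaluate('/meta:xmpmeta')->item(0);
$rootName = $root->nodeName;          // qualified name as written in the document
$rootNamespace = $root->namespaceURI; // namespace URI the prefix resolved to

var_dump($rootName, $rootNamespace);
```

The expression uses the alias meta, yet it still matches the element written as x:xmpmeta, because both prefixes resolve to the same namespace URI.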
Now you can use DOMXpath::evaluate() to fetch nodes and values:
foreach ($xpath->evaluate('//MPRI:Regions//rdf:Description') as $description) {
var_dump(
[
'rectangle' => $xpath->evaluate('string(MPReg:Rectangle)', $description),
'person' => $xpath->evaluate('string(MPReg:PersonDisplayName)', $description),
]
);
}
The expression //MPRI:Regions//rdf:Description fetches all rdf:Description elements inside an MPRI:Regions element node. For each description, the two expressions fetch the rectangle (string(MPReg:Rectangle)) and the person display name (string(MPReg:PersonDisplayName)) as strings. If the node does not exist, string() returns an empty string.
Output:
array(2) {
["rectangle"]=>
string(38) "0.144259, 0.358824, 0.065751, 0.098529"
["person"]=>
string(0) ""
}
array(2) {
["rectangle"]=>
string(38) "0.211973, 0.294118, 0.023553, 0.035294"
["person"]=>
string(0) ""
}
array(2) {
["rectangle"]=>
string(38) "0.350343, 0.423529, 0.056919, 0.085294"
["person"]=>
string(2) "xc"
}
array(2) {
["rectangle"]=>
string(38) "0.352306, 0.300000, 0.023553, 0.035294"
["person"]=>
string(0) ""
}
array(2) {
["rectangle"]=>
string(38) "0.395486, 0.304412, 0.047105, 0.070588"
["person"]=>
string(0) ""
}
array(2) {
["rectangle"]=>
string(38) "0.823356, 0.560294, 0.095191, 0.142647"
["person"]=>
string(0) ""
}
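The output above still lists regions without a name. To read the rectangle only where a PersonDisplayName exists, one option is to add a predicate to the Xpath expression. The following is a minimal sketch, assuming a shortened sample document with the namespaces declared once on the root; the $named variable is just an illustrative name.

```php
<?php
// Shortened sample with two regions; only the second one has a name.
$xml = <<<'XML'
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:MPRI="http://ns.microsoft.com/photo/1.2/t/RegionInfo#"
  xmlns:MPReg="http://ns.microsoft.com/photo/1.2/t/Region#">
  <MPRI:Regions>
    <rdf:Bag>
      <rdf:li>
        <rdf:Description>
          <MPReg:Rectangle>0.144259, 0.358824, 0.065751, 0.098529</MPReg:Rectangle>
        </rdf:Description>
      </rdf:li>
      <rdf:li>
        <rdf:Description>
          <MPReg:Rectangle>0.350343, 0.423529, 0.056919, 0.085294</MPReg:Rectangle>
          <MPReg:PersonDisplayName>xc</MPReg:PersonDisplayName>
        </rdf:Description>
      </rdf:li>
    </rdf:Bag>
  </MPRI:Regions>
</rdf:RDF>
XML;

$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXpath($document);
$xpath->registerNamespace('rdf', 'http://www.w3.org/1999/02/22-rdf-syntax-ns#');
$xpath->registerNamespace('MPRI', 'http://ns.microsoft.com/photo/1.2/t/RegionInfo#');
$xpath->registerNamespace('MPReg', 'http://ns.microsoft.com/photo/1.2/t/Region#');

// The predicate [MPReg:PersonDisplayName] keeps only descriptions
// that have a PersonDisplayName child element.
$expression = '//MPRI:Regions//rdf:Description[MPReg:PersonDisplayName]';

$named = [];
foreach ($xpath->evaluate($expression) as $description) {
    $named[] = [
        'rectangle' => $xpath->evaluate('string(MPReg:Rectangle)', $description),
        'person' => $xpath->evaluate('string(MPReg:PersonDisplayName)', $description),
    ];
}
var_dump($named);
```

The predicate only tests that the element exists. If empty PersonDisplayName elements should be skipped as well, a condition such as [normalize-space(MPReg:PersonDisplayName) != ''] could be used instead.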