
Replace the text between <loc> tags and limit it to a specific number of characters using regex in PHP


I want to replace the text between <loc> and </loc>, limiting it to a specific number of characters.

<?php
    $string = '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
    <url>
    <loc>https://subdomain.example.com</loc>
    <priority>1.0</priority>
    <changefreq>always</changefreq>
    </url>
    <url>
    <loc>https://subdomain.example.com/s/queen-katwe-2016-720p-hd-480p-hd/</loc>
    <priority>1.0</priority>
    <changefreq>always</changefreq>
    </url><url>
    <loc>https://subdomain.example.com/s/justice-league-dark-2017-720p-hd-480p-hd/</loc>
    <priority>1.0</priority>
    <changefreq>always</changefreq>
    </url><url>
    <loc>https://subdomain.example.com/s/edge-seventeen-2016-720p-hd-480p-hd/</loc>
    <priority>1.0</priority>
    <changefreq>always</changefreq>
    </url></urlset>';
    
    $search = "/(<loc>)(.*?)(<\/loc>)/";
    $replace =  mb_strimwidth('$2', 0, 15);
    $total = preg_replace($search,$replace,$string);
    echo $total;
?>

I have tried the code above and it's not working. Please help me out; thank you in advance.


Solution

  • You have XML, which is more than just a string, so I would recommend using a tool that understands XML itself, such as DOMDocument. I don't know exactly what logic you are after (and I didn't even know that mb_strimwidth existed), but this could be written as:

    $xml = <<<EOT
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
    <url>
    <loc>https://subdomain.example.com</loc>
    <priority>1.0</priority>
    <changefreq>always</changefreq>
    </url>
    <url>
    <loc>https://subdomain.example.com/s/queen-katwe-2016-720p-hd-480p-hd/</loc>
    <priority>1.0</priority>
    <changefreq>always</changefreq>
    </url><url>
    <loc>https://subdomain.example.com/s/justice-league-dark-2017-720p-hd-480p-hd/</loc>
    <priority>1.0</priority>
    <changefreq>always</changefreq>
    </url><url>
    <loc>https://subdomain.example.com/s/edge-seventeen-2016-720p-hd-480p-hd/</loc>
    <priority>1.0</priority>
    <changefreq>always</changefreq>
    </url></urlset>
    EOT;
    
    $dom = new DOMDocument;
    $dom->loadXML($xml);
    
    // getElementsByTagName() only returns <loc> elements, so no extra type or name check is needed.
    foreach ($dom->getElementsByTagName('loc') as $node) {
        $node->nodeValue = mb_strimwidth($node->nodeValue, 0, 15);
    }
    
    // The document was loaded as XML, so serialize it as XML too.
    echo $dom->saveXML();
    

    Demo here: https://3v4l.org/fvS02

    Note: You appear to be manipulating URLs. As with XML, a URL is more than just a string; PHP has parse_url for parsing URLs, and I'd encourage you to use it if that is indeed what you are doing.
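
    As a rough sketch of that idea, assuming the goal is to shorten only the path and keep the scheme and host intact (the question does not say so), something like the following could work. The helper name trimUrlPath is made up for illustration, and it deliberately ignores ports, query strings and fragments.

```php
<?php
// Hypothetical helper (not from the question): trims only the path of an
// absolute URL and keeps the scheme and host as they are.
// Assumption: port, query string and fragment are not needed and are dropped.
function trimUrlPath(string $url, int $width): string
{
    $parts = parse_url($url);

    // Leave anything that is not an absolute URL untouched.
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return $url;
    }

    // A URL such as https://subdomain.example.com has no path component at all.
    $path = isset($parts['path']) ? mb_strimwidth($parts['path'], 0, $width) : '';

    return $parts['scheme'] . '://' . $parts['host'] . $path;
}

echo trimUrlPath('https://subdomain.example.com/s/queen-katwe-2016-720p-hd-480p-hd/', 15), PHP_EOL;
// prints https://subdomain.example.com/s/queen-katwe-
```

    Whether you trim the path, the whole URL, or something else depends on what you actually need; the point is only that parse_url gives you the pieces to decide with.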

    EDIT

    If your source data isn't XML, I'd still use a parser if possible. DOMDocument supports HTML, too; you just need to suppress some warnings, because HTML usually isn't as strict.
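
    A minimal sketch of the HTML variant, assuming a made-up snippet of loose HTML as input. libxml_use_internal_errors is one common way to keep the parser warnings quiet; it collects them instead of printing them.

```php
<?php
// Sample input invented for illustration: loose HTML rather than strict XML.
$html = '<div><loc>https://subdomain.example.com/s/queen-katwe-2016-720p-hd-480p-hd/</loc></div>';

// Collect libxml warnings (for example about the unknown <loc> tag)
// instead of emitting them as PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTML($html);

// Discard the collected warnings and restore the previous setting.
libxml_clear_errors();
libxml_use_internal_errors($previous);

foreach ($dom->getElementsByTagName('loc') as $node) {
    $node->nodeValue = mb_strimwidth($node->nodeValue, 0, 15);
}

// loadHTML() wraps fragments in a doctype, <html> and <body>, so expect those in the output.
echo $dom->saveHTML();
```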

    But if no parser exists for your data format, a regular expression may be the better option. In that case I'd use a callback function, via preg_replace_callback, to decide what each match is replaced with.

    $xml = <<<EOT
    <loc>https://subdomain.example.com</loc>
    <loc>https://subdomain.example.com/s/queen-katwe-2016-720p-hd-480p-hd/</loc>
    <loc>https://subdomain.example.com/s/justice-league-dark-2017-720p-hd-480p-hd/</loc>
    <loc>https://subdomain.example.com/s/edge-seventeen-2016-720p-hd-480p-hd/</loc>
    EOT;
    
    var_dump(
        preg_replace_callback(
            '/<loc>(?<value>[^<]+)<\/loc>/',
            static function($matches) {
                return sprintf('<loc>%1$s</loc>', mb_strimwidth($matches['value'], 0, 15));
            },
            $xml
        )
    );
    

    Demo: https://3v4l.org/OhmtZ
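
    As for why the original attempt leaves the URLs untrimmed: mb_strimwidth is called once, on the literal two-character string '$2', before preg_replace ever runs. Since '$2' is already shorter than 15 characters it comes back unchanged, so the replacement is simply "capture group 2", which drops the <loc> tags and keeps the full URL. A small sketch to illustrate, using one URL from the question:

```php
<?php
// mb_strimwidth() sees only the literal '$2', not the text matched later.
$replace = mb_strimwidth('$2', 0, 15);
var_dump($replace); // string(2) "$2"

$result = preg_replace(
    '/(<loc>)(.*?)(<\/loc>)/',
    $replace,
    '<loc>https://subdomain.example.com/s/edge-seventeen-2016-720p-hd-480p-hd/</loc>'
);

// The tags are gone and the URL is still at full length.
echo $result, PHP_EOL;
```

    A callback, as in the example above, is what lets the trimming run on each matched value.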