Tags: php, regex, preg-replace, word-boundary

PHP preg_replace word boundary and dot


<?php
$name = 'malines featuring. malines2 featuring malines3 feat. malines4 feat malines5';
$name = preg_replace('/\b(prod|by|ft|feat(uring)?)(\.)?\b/i', '', $name);
?>

This code removes "prod", "by", "ft", "feat" and "featuring", and it should also remove them when they are followed by a dot. It removes the words but leaves the dots behind. Please help. Any idea what the correct pattern should look like?


Solution

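  • The (\.)?\b at the end is causing the problem. After a period, \b matches only if the next character is a word character, so in "feat. " the engine backtracks, leaves the dot unmatched, and takes the boundary right after "feat" instead.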

    Try this regex instead: /\b(prod|by|ft|feat(uring)?)(\.|\b)/i

    UPDATE: So, as I understand it, you only want to remove an abbreviation when its dot is not immediately followed by more text.

    Example:

    Prod. whatever -> whatever, but

    Prod.whatever -> Prod.whatever (no change).

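    Is this correct? If so, remove the word only when it is not followed by a dot and then a letter. The negative lookahead (?!\.[a-z]) checks that, and the trailing \.? * also consumes the optional dot and any spaces after it. You could use: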

    /\b(prod|by|ft|feat(uring)?)\b(?!\.[a-z])\.? */i

    I ran the following tests and got the results indicated in comments. Please let me know if this doesn't work for you.

    $regex = '/\b(prod|by|ft|feat(uring)?)\b(?!\.[a-z])\.? */i';
    $rep = '';
    
    echo preg_replace($regex, $rep, 'prod. bla')." <br/>"; // bla
    echo preg_replace($regex, $rep, 'prod.bla')." <br/>"; // prod.bla
    echo preg_replace($regex, $rep, 'feat bla')." <br/>"; // bla
    echo preg_replace($regex, $rep, 'feat. bla')." <br/>"; // bla
    echo preg_replace($regex, $rep, 'feat.bla')." <br/>"; // feat.bla
    echo preg_replace($regex, $rep, 'featuring bla')." <br/>"; // bla
    echo preg_replace($regex, $rep, 'featuring.bla')." <br/>"; // featuring.bla
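
To compare the three patterns discussed above on the exact string from the question, a minimal sketch could look like the following. The helper name strip_credit_words and the array labels are made up for illustration and are not part of the original answer.

```php
<?php
// Input string taken from the question.
$name = 'malines featuring. malines2 featuring malines3 feat. malines4 feat malines5';

// The three patterns from this thread, in the order they were discussed.
$patterns = [
    'original'  => '/\b(prod|by|ft|feat(uring)?)(\.)?\b/i',
    'first fix' => '/\b(prod|by|ft|feat(uring)?)(\.|\b)/i',
    'lookahead' => '/\b(prod|by|ft|feat(uring)?)\b(?!\.[a-z])\.? */i',
];

// Hypothetical helper: apply one pattern and return the cleaned string.
// preg_replace() returns null on a regex error, so fall back to the input.
function strip_credit_words(string $pattern, string $subject): string
{
    return preg_replace($pattern, '', $subject) ?? $subject;
}

$results = [];
foreach ($patterns as $label => $pattern) {
    $results[$label] = strip_credit_words($pattern, $name);
    echo $label, ': ', $results[$label], PHP_EOL;
}
```

With this input, the original pattern leaves the stray dots ("malines . malines2 ..."), the first fix removes the dots but leaves the spaces on both sides of each removed word, and the lookahead version gives "malines malines2 malines3 malines4 malines5". If a removed word can appear at the very end of the string, wrapping the result in trim() would be a reasonable extra step.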