PHP replace patterns in string with URLs


I have a few hundred HTML files that I want to show on a website. They all contain links in the following format:

[My Test URL|https://www.mywebsite.com/test?param=123]

The problem is that some URLs are broken up by stray whitespace, like so:

[My Test URL|https://www.mywebsite.c om/test?param=123]

I want to replace all of those with their HTML counterpart, like so:

<a href="https://www.mywebsite.com/test?param=123">My Test URL</a>

I know that with the regex "/\[(.*?)\]/" I can match the brackets, but how can I split on the pipe, remove the whitespace in the URL, and convert everything to a string?


Solution

  • If you only want to remove whitespace in the URL part of these bracketed links, a plain preg_replace is enough:

    preg_replace('~(?:\G(?!\A)|\[[^][|]*\|)[^][\s]*\K\s+(?=[^][]*])~', '', $text)
    

    Details:

    • (?:\G(?!\A)|\[[^][|]*\|) - either the end of the previous match, or a [ followed by zero or more chars other than [, ] and | and then a | char
    • [^][\s]* - zero or more chars other than [, ] and whitespace
    • \K - discard all text matched so far, so that only what follows is replaced
    • \s+ - one or more whitespace chars
    • (?=[^][]*]) - there must be zero or more chars other than [ and ] and then a ] immediately to the right of the current location.
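
    To see the pattern above on a slightly larger input, here is a minimal sketch. The function name strip_url_whitespace and the second sample link are made up for illustration; only the pattern itself comes from the answer. It should leave whitespace in the link text and in the surrounding prose alone, because \G only continues from a match that started inside a bracketed link.

```php
<?php
// Hypothetical helper name; only the pattern is taken from the answer above.
function strip_url_whitespace(string $text): string
{
    // Remove whitespace only after the "|" of a [label|url] link.
    $pattern = '~(?:\G(?!\A)|\[[^][|]*\|)[^][\s]*\K\s+(?=[^][]*])~';
    return preg_replace($pattern, '', $text);
}

// Assumed sample input: two links with broken URLs plus ordinary prose.
$sample = 'Intro [My Test URL|https://www.mywebsite.c om/test?param=123] and '
        . '[Other|https://ww w.example.org/a  b] end.';

echo strip_url_whitespace($sample), PHP_EOL;
// Intro [My Test URL|https://www.mywebsite.com/test?param=123] and [Other|https://www.example.org/ab] end.
```

    The spaces in "My Test URL" and around "and" survive, since they are never preceded by the [label| prefix or by a previous match.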

    If you want to remove whitespace inside the URL part and also convert the bracketed links to HTML, preg_replace_callback is the better tool:

    $text = '[My Test URL|https://ww w.mywebsite.c om/t  est?param=123]';
    
    echo preg_replace_callback('/\[([^][|]*)\|([^][]+)]/', function($m) {
        // $m[1] is the link text, $m[2] is the URL; strip all whitespace from the URL
        return '<a href="' . preg_replace('/\s+/', '', $m[2]) . '">' . $m[1] . '</a>';
    }, $text);
    

    Details:

    • \[ - a [ char
    • ([^][|]*) - Group 1: any zero or more chars other than [, ] and |
    • \| - a | char
    • ([^][]+) - Group 2: any one or more chars other than ] and [
    • ] - a ] char.
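
    A slightly more defensive variant of the callback approach might look like the sketch below. It is not part of the original answer: the function name bracket_links_to_html and the extra &x=2 query parameter are invented for illustration. It assumes the link text and URL are plain text that has not already been HTML-escaped; if your files already contain entities such as &amp;, the htmlspecialchars calls would escape them a second time.

```php
<?php
// Hypothetical helper name; a sketch under the assumptions stated above.
function bracket_links_to_html(string $text): string
{
    return preg_replace_callback(
        '/\[([^][|]*)\|([^][]+)]/',
        function (array $m): string {
            // Strip every whitespace character (spaces, tabs, newlines) from the URL.
            $url = preg_replace('/\s+/', '', $m[2]);
            // Escape both parts so characters like & or " cannot break the markup.
            return '<a href="' . htmlspecialchars($url, ENT_QUOTES) . '">'
                 . htmlspecialchars($m[1], ENT_QUOTES) . '</a>';
        },
        $text
    );
}

echo bracket_links_to_html('[My Test URL|https://ww w.mywebsite.c om/t  est?param=1&x=2]'), PHP_EOL;
// <a href="https://www.mywebsite.com/test?param=1&amp;x=2">My Test URL</a>
```

    Text outside the brackets is passed through unchanged, so the function can be applied to a whole file's contents at once.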