preg_match not working when trying to detect multiple URLs

I want to automatically detect every link in a string and replace each one with its index in brackets, i.e. [index of link]. For example, the string test https://www.google.com/ mmh http://stackoverflow.com/ should become test [0] mmh [1].

So far I have tried this:

$reg_exUrl = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';
if(preg_match($reg_exUrl, $_POST['commento'], $url)) {
    for ($i = 0; $i < count($url); $i++) { 
        $_POST['commento'] = preg_replace($reg_exUrl, "[" . $i . "]", $_POST['commento']);
    }
}

but I keep getting test [0] mmh [0]. If I run var_dump(count($url)), I always get 1. How do I fix this?

Solution

A cleaner solution than patching the original loop (the fixes for that loop are explained further down) is to split the incoming string on the URL pattern, which yields the array of non-URL segments, and then insert [$i] between consecutive segments.

# better solution, perform a split.
function process_line2($input) {
    $regex_url = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';
    # split the incoming string into an array of non-url segments
    # preg_split does not trim leading or trailing empty segments
    $non_url_segments = preg_split($regex_url, $input, -1);

    # inside the array, combine each successive non-url segment
    # with the next index
    $out = [];
    $count = count($non_url_segments);
    for ($i = 0; $i < $count; $i++) {
        # add the segment
        array_push($out, $non_url_segments[$i]);
        # add its index surrounded by brackets on all segments but the last one
        if ($i < $count -1) {
            array_push($out, '[' . $i . ']');
        }
    }
    # join strings with no whitespace
    return implode('', $out);
}
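
To make the split step concrete, here is a small sketch of what preg_split produces for the example string from the question. The variable names are only illustrative. Note the trailing empty segment, which appears because the string ends with a URL; it is the reason the loop above adds an index after every segment except the last one.

```php
<?php
// Same pattern as in the answer above.
$regex_url = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';
$input = 'test https://www.google.com/ mmh http://stackoverflow.com/';

// Each URL acts as a delimiter, so only the text around the URLs remains.
$segments = preg_split($regex_url, $input);

// Three segments: "test ", " mmh " and a trailing empty string.
var_dump($segments);
```

Joining these three segments with [0] and [1] between them gives test [0] mmh [1].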

preg_match stops after the first match, so $url contains only that match (plus any capture groups, and this pattern has none). That is why count($url) is always 1; it does not tell you how many URLs the string contains. Use preg_match_all instead and take the first element ([0]) of the array it fills in, which lists every full match.

The second error is that you are not passing the limit argument to preg_replace, so a single call replaces every URL at once, all with the same index.
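
The difference between the two functions can be checked with a short sketch like the following, using the example string from the question. The variable names $first, $all, $found and $total are only illustrative.

```php
<?php
$regex_url = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';
$input = 'test https://www.google.com/ mmh http://stackoverflow.com/';

// preg_match stops at the first match; $first holds that match plus any capture groups.
// The pattern uses only non-capturing groups (?:...), so $first has exactly one element.
$found = preg_match($regex_url, $input, $first);
echo count($first), "\n";   // prints 1

// preg_match_all keeps scanning and returns the number of full matches.
// $all[0] is the list of full matches.
$total = preg_match_all($regex_url, $input, $all);
echo $total, "\n";          // prints 2
echo count($all[0]), "\n";  // prints 2
```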

From the documentation for preg_replace: http://php.net/manual/en/function.preg-replace.php

The parameters are

mixed preg_replace ( mixed $pattern , mixed $replacement , mixed $subject [, int $limit = -1 [, int &$count ]] )

In particular, the limit parameter defaults to -1 (no limit):

limit: The maximum possible replacements for each pattern in each subject string. Defaults to -1 (no limit).

You need to set an explicit limit of 1, so that each call replaces only the next remaining URL.
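
As a quick illustration of the limit argument, the sketch below runs the same replacement with and without it on the example string. The variable names are only illustrative; the fifth argument, $count, is the optional by-reference counter from the signature quoted above.

```php
<?php
$regex_url = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';
$input = 'test https://www.google.com/ mmh http://stackoverflow.com/';

// Default limit (-1): every URL is replaced in one call.
$all_at_once = preg_replace($regex_url, '[0]', $input);
echo $all_at_once, "\n";    // prints: test [0] mmh [0]

// Limit of 1: only the first URL is replaced; $count reports how many replacements were made.
$one_at_a_time = preg_replace($regex_url, '[0]', $input, 1, $count);
echo $one_at_a_time, "\n";  // prints: test [0] mmh http://stackoverflow.com/
echo $count, "\n";          // prints: 1
```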

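Elaborating a bit on replacing preg_match with preg_match_all: you need to extract the [0] component of the matches array, because preg_match_all fills it with an array of arrays (index 0 holds the full matches; further indices would hold capture groups, if the pattern had any). For the example string, var_dump($url) gives: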

array(1) {
  [0]=>
  array(2) {
    [0]=>
    string(23) "https://www.google.com/"
    [1]=>
    string(25) "http://stackoverflow.com/"
  }
}

Here is a complete example containing the original function and a version with both fixes incorporated.

<?php 

# original function
function process_line($input) {

    $reg_exUrl = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';
    if(preg_match($reg_exUrl, $input, $url)) {
        for ($i = 0; $i < count($url); $i++) { 
            $input = preg_replace($reg_exUrl, "[" . $i . "]", $input);
        }
    }

    return $input;

}

# function with fixes incorporated
function process_line1($input) {

    $reg_exUrl = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';
    if(preg_match_all($reg_exUrl, $input, $url)) {
        $url_matches = $url[0];
        for ($i = 0; $i < count($url_matches); $i++) {
            # add explicit limit of 1 so each pass replaces only the next remaining url
            $input = preg_replace($reg_exUrl, "[" . $i . "]", $input, 1);
        }
    }

    return $input;

}

$input = "test https://www.google.com/ mmh http://stackoverflow.com/";

$input = process_line1($input);

echo $input;

?>
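
For comparison, the same numbering can also be done in a single pass with preg_replace_callback. This technique is not part of the answer above; it is a minimal sketch, and the function name number_urls is hypothetical. The callback receives each match in turn and returns its replacement, so a counter captured by reference supplies the index.

```php
<?php
// Hypothetical helper, shown only as an alternative sketch.
function number_urls($input) {
    $regex_url = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';
    $i = 0;
    // The closure runs once per URL, left to right, and bumps the shared counter.
    return preg_replace_callback(
        $regex_url,
        function ($match) use (&$i) {
            return '[' . $i++ . ']';
        },
        $input
    );
}

echo number_urls('test https://www.google.com/ mmh http://stackoverflow.com/'), "\n";
// prints: test [0] mmh [1]
```

Because the string is scanned only once, this avoids running the pattern again for every URL, at the cost of a slightly less obvious control flow.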