php regex preg-replace regex-group regex-greedy

How to get part of a URL?


How can I remove every part of a URL except the base URL and the first path segment? The number of segments is not fixed, and the base URL varies. I tried a few regular expressions, but without success.

$url = 'http://www.example.com/part1/part2/part3/part4';
$base_url = parse_url($url, PHP_URL_HOST); // Returns www.example.com

$desired_output = 'http://www.example.com/part1';

Solution

  • Here we can use preg_replace with a simple expression, perhaps similar to:

    (.+\.com\/.+?\/).+
    

    where we are capturing our desired output using this capturing group:

    (.+\.com\/.+?\/)
    

    The trailing .+ then matches through to the end of the string, and the whole match is replaced with $1.

    Test

    $re = '/(.+\.com\/.+?\/).+/m';
    $str = 'http://www.example.com/part1/part2/part3/part4';
    $subst = '$1';
    
    $result = preg_replace($re, $subst, $str);
    
    echo $result; // http://www.example.com/part1/
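
    The capturing group above ends with \/, so the result keeps a slash after the first part, while the desired output in the question has none. One possible adjustment, offered as a sketch rather than the definitive fix, is to move the slash outside the group. The function name keepFirstPart is hypothetical and only used for illustration:

    ```php
    <?php
    // Hypothetical helper; the pattern is the .com expression above
    // with the closing "\/" moved out of the capturing group.
    function keepFirstPart(string $url): string
    {
        $re = '/(.+\.com\/.+?)\/.+/';

        // preg_replace returns null on error; fall back to the input in that case.
        return preg_replace($re, '$1', $url) ?? $url;
    }

    echo keepFirstPart('http://www.example.com/part1/part2/part3/part4'), PHP_EOL;
    // http://www.example.com/part1
    ```

    A URL that has only one part does not match the pattern at all, so it is returned unchanged.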
    



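    For any domain, .com or not, we might be able to solve it with the expression below. Because the leading .+ is greedy, a dot further along the path (for example /part1/v1.2/a/b) can shift what is captured, so treat it as a starting point: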

    (.+\..+?\/.+?\/).+
    

    Test

    $re = '/(.+\..+?\/.+?\/).+/m';
    $str = 'http://www.example.com/part1/part2/part3/part4';
    $subst = '$1';
    
    $result = preg_replace($re, $subst, $str);
    
    echo $result; // http://www.example.com/part1/
    

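
    Since the question already calls parse_url, a regex-free alternative may also be worth considering. The sketch below is not part of the original answer. It assumes an absolute URL with a scheme and host, and the function name firstPathSegmentUrl is hypothetical. It rebuilds the base URL from the parsed components and appends only the first non-empty path segment:

    ```php
    <?php
    // Hypothetical helper: keep scheme, host (and port, if present)
    // plus the first path segment. Requires PHP 7.4+ for the arrow function.
    function firstPathSegmentUrl(string $url): ?string
    {
        $parts = parse_url($url);
        if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
            return null; // not an absolute URL this sketch can handle
        }

        $base = $parts['scheme'] . '://' . $parts['host'];
        if (isset($parts['port'])) {
            $base .= ':' . $parts['port'];
        }

        // Split the path and drop empty pieces caused by leading or doubled slashes.
        $segments = array_values(array_filter(
            explode('/', $parts['path'] ?? ''),
            static fn(string $s): bool => $s !== ''
        ));

        return isset($segments[0]) ? $base . '/' . $segments[0] : $base;
    }

    echo firstPathSegmentUrl('http://www.example.com/part1/part2/part3/part4'), PHP_EOL;
    // http://www.example.com/part1
    ```

    This version does not depend on the top-level domain or on dots in the path, and it returns the base URL alone when there is no path segment to keep.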