How to check the URL's structure using PHP preg_match?


All my site's URLs have the following structure:

https://www.example.com/section/item

where section is a word and item is a number.

So, possible URLs are:

https://www.example.com

https://www.example.com/section

https://www.example.com/section/item

Via .htaccess, all requests are routed to index.php.

I want to show a 404 error message if the user types:

https://www.example.com/section/item/somethingelse

In order to check the URL's structure, how can I change the pattern properly in the following function?

function isValidURL($url) {
    return preg_match('|^http(s)?://[a-z0-9-]+(.[a-z0-9-]+)*(:[0-9]+)?(/.*)?$|i', $url);
}

Thanks.


Solution

  • If section is a word (and cannot contain digits) and item is a number, you could use [^\W\d]+ to match 1+ word characters except digits, and \d+ to match 1+ digits. Note that [^\W\d] also allows the underscore, because _ is a word character.

    As the section and item parts are optional in the example data, you could replace (/.*)?$ with (?:/[^\W\d]+(?:/\d+)?)?$.

    Explanation

    • (?: Non capturing group
      • /[^\W\d]+ For section, match 1+ times a word char except a digit
      • (?:/\d+)? For item, optionally match / and 1+ digits
    • )? Close non capturing group and make it optional
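
    To see the optional groups in isolation, here is a minimal sketch that applies only this path fragment to a few paths. The helper name isValidPath and the ~ delimiter are assumptions made for illustration; they are not part of the original function.

    ```php
    <?php
    // Hypothetical helper: checks only the path part of a URL.
    function isValidPath(string $path): bool
    {
        // ^ and $ anchor the match, so nothing may follow /section/item.
        return preg_match('~^(?:/[^\W\d]+(?:/\d+)?)?$~', $path) === 1;
    }

    var_dump(isValidPath(''));                  // bool(true)  no path at all
    var_dump(isValidPath('/section'));          // bool(true)  section only
    var_dump(isValidPath('/section/12'));       // bool(true)  section and item
    var_dump(isValidPath('/section/12/extra')); // bool(false) extra segment
    var_dump(isValidPath('/123'));              // bool(false) section has digits
    ```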

    If section may also contain digits (or consist of digits only), you could use \w+ instead of [^\W\d]+.

    The pattern might look like

    ^https?://[a-z0-9-]+(?:\.[a-z0-9-]+)*(?::[0-9]+)?(?:/[^\W\d]+(?:/\d+)?)?$
    

    Note that the dot is escaped (\.) so that it matches a literal dot; an unescaped dot matches any character.
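
Putting it together, the function from the question might be rewritten along these lines. This is a sketch rather than a definitive implementation: the ~ delimiter, the type declarations, the === 1 comparison, and the sample URLs (including the localhost one) are assumptions added for illustration.

```php
<?php
function isValidURL(string $url): bool
{
    // scheme, host labels separated by literal dots, optional port,
    // then an optional /section followed by an optional /item
    $pattern = '~^https?://[a-z0-9-]+(?:\.[a-z0-9-]+)*(?::[0-9]+)?'
             . '(?:/[^\W\d]+(?:/\d+)?)?$~i';

    // preg_match returns 1 on a match, 0 on no match, false on error
    return preg_match($pattern, $url) === 1;
}

var_dump(isValidURL('https://www.example.com'));                 // bool(true)
var_dump(isValidURL('https://www.example.com/section'));         // bool(true)
var_dump(isValidURL('https://www.example.com/section/42'));      // bool(true)
var_dump(isValidURL('http://localhost:8080/news/7'));            // bool(true)
var_dump(isValidURL('https://www.example.com/section/42/more')); // bool(false)
var_dump(isValidURL('https://www.example.com/'));                // bool(false)
```

As the last line shows, a URL with a trailing slash is rejected by this pattern; if such URLs should be accepted, the pattern would need an optional /? before the $. In index.php, a false result could then be answered with http_response_code(404) before printing the error message.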