
Get domain name (not subdomain) in php


I have a URL which can be any of the following formats:

http://example.com
https://example.com
http://example.com/foo
http://example.com/foo/bar
www.example.com
example.com
foo.example.com
www.foo.example.com
foo.bar.example.com
http://foo.bar.example.com/foo/bar
example.net/foo/bar

Essentially, I need to match any normal URL. How can I extract example.com (or example.net, or whatever the TLD happens to be; this needs to work with any TLD) from all of these with a single regex?


Solution

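  • You can use parse_url to get the host. Note that parse_url only fills in the host when the URL has a scheme, so add one first for inputs such as www.example.com:

    // Prepend a scheme when none is present; otherwise 'host' is not set.
    $url  = preg_match('#^[a-z][a-z0-9+.-]*://#i', $url) ? $url : 'http://' . $url;
    $info = parse_url($url);
    $host = $info['host'];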
    

    Then split the host on the dots and keep only the last two parts (the domain name and the TLD):

    $host_names = explode(".", $host);
    $bottom_host_name = $host_names[count($host_names)-2] . "." . $host_names[count($host_names)-1];
    

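    Not very elegant, but it works for single-label TLDs such as .com or .net. Because it only keeps the last two labels, a host like www.example.co.uk yields co.uk rather than example.co.uk.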
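
    The two steps above can be wrapped into one function. This is only a minimal sketch: the name base_domain is made up for illustration, and the scheme check is an assumption added so that scheme-less inputs from the question (www.example.com, example.net/foo/bar) still produce a host, since parse_url only reports one when a scheme is present.

```php
<?php
// Hypothetical helper; the name base_domain is not from the original answer.
function base_domain(string $url): ?string
{
    // parse_url() only reports a host when the URL has a scheme,
    // so prepend one for inputs like "www.example.com".
    if (!preg_match('#^[a-z][a-z0-9+.-]*://#i', $url)) {
        $url = 'http://' . $url;
    }

    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null; // input could not be parsed into a host
    }

    // Keep only the last two dot-separated labels.
    $labels = explode('.', $host);
    if (count($labels) < 2) {
        return null; // e.g. "localhost" has no separate TLD part
    }

    return implode('.', array_slice($labels, -2));
}

// A few of the formats listed in the question.
$urls = [
    'http://example.com',
    'www.foo.example.com',
    'http://foo.bar.example.com/foo/bar',
    'example.net/foo/bar',
];

foreach ($urls as $url) {
    echo $url, ' => ', base_domain($url), PHP_EOL;
}
```

    Returning null for inputs without at least two labels is a design choice of this sketch; the original two-liner would instead read a negative array index.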


    If you want an explanation, here it goes:

    First, we grab the host, i.e. everything between the scheme (http://, etc.) and the path, using parse_url.

    Then we take the host name and split it into an array at the periods, so test.world.hello.myname would become:

    array("test", "world", "hello", "myname");
    

    After that, we take the number of elements in the array (4).

    Subtracting 2 from that count gives the index of the second-to-last string (the domain name, or example in your case).

    Subtracting 1 from the count gives the index of the last string (because array keys start at 0), which is the TLD.

    Finally, we join those two parts with a period, and you have your base host name.
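
    To make the index arithmetic concrete, here is the same walk-through traced on the sample host test.world.hello.myname. The intermediate variables ($count, $second_to_last, $last) are introduced only for illustration and do not appear in the original snippet.

```php
<?php
$host = 'test.world.hello.myname';   // sample host from the explanation

$host_names = explode('.', $host);   // ["test", "world", "hello", "myname"]
$count      = count($host_names);    // 4

$second_to_last = $host_names[$count - 2]; // index 2: "hello"
$last           = $host_names[$count - 1]; // index 3: "myname"

// Join the two parts with a period to get the base host name.
$bottom_host_name = $second_to_last . '.' . $last;

echo $bottom_host_name, PHP_EOL; // hello.myname
```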