php, http-referer

How to group hosts from the HTTP referer


How can I group hosts that share the same name but have different TLDs?

For example: google.com, google.co.id, google.co.jp, etc.

I want to group all google.* hosts as "google".

Here is my code to get the host from the HTTP referer:

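if (isset($_SERVER["HTTP_REFERER"])) {
    $referal = $_SERVER["HTTP_REFERER"];
    // parse_url() returns null when the URL has no host component
    $host_referal = parse_url($referal, PHP_URL_HOST);
}
else {
    $referal = "Unknown";
    $host_referal = "Unknown";  // keep the variable defined when no referer was sent
}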

Solution

  • Just spitballing here; I haven't considered any fringe cases.

    Code:

    // sample values standing in for $_SERVER['HTTP_REFERER']
    $referers = ['https://google.com', 'https://www.google.co.id', 'http://www.google.co.jp'];
    foreach ($referers as $referer) {
        // sometimes $_SERVER['HTTP_REFERER'] is not delivered, so guard against empty or unparseable values
        if (!$referer || !($host = parse_url($referer, PHP_URL_HOST))) {
            echo "couldn't parse missing/malformed url";
        } else {
            // $host has no scheme, so only skip an optional "www." and take everything up to the first dot
            echo preg_match('~^(?:www\.)?\K[^.]+~', $host, $out) ? $out[0] : '';
            echo "\n";
        }
    }
    

    Output:

    google
    google
    google
    

    If this breaks, please offer me the breaking input string so that I can adjust my method.


    p.s. The truth is, you can probably get away with just calling:

    $referal = preg_match('~^(?:https?://)?(?:www\.)?\K[^.]+~', $_SERVER["HTTP_REFERER"] ?? '', $out) ? $out[0] : 'Unknown';
    

    However, many posts on Stack Overflow point out that this value is not trustworthy (the client sends the Referer header, so it can be missing or forged), so using parse_url() offers a bit more peace of mind.
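
To go from printing each name to actually grouping, the parse_url() approach above could be wrapped in a function and the results tallied. This is only a minimal sketch, not the answerer's code: `referer_group()`, the `$counts` array, and the sample `$referers` list (including the bing.com entry) are hypothetical names and values added for illustration, and it assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper (not from the original answer): map a referer URL
// to a group label such as "google", or "Unknown" when it can't be parsed.
function referer_group(?string $referer): string
{
    if (!$referer) {
        return 'Unknown';               // header missing or empty
    }
    $host = parse_url($referer, PHP_URL_HOST);
    if (!$host) {
        return 'Unknown';               // malformed URL or no host component
    }
    // Skip an optional "www." prefix, then keep everything up to the first dot.
    if (preg_match('~^(?:www\.)?\K[^.]+~', strtolower($host), $out)) {
        return $out[0];
    }
    return 'Unknown';
}

// Sample values standing in for $_SERVER['HTTP_REFERER'] across requests.
$referers = [
    'https://google.com',
    'https://www.google.co.id',
    'http://www.google.co.jp',
    'https://www.bing.com/search?q=php',
    '',
    null,
];

// Tally how many referers fall into each group.
$counts = [];
foreach ($referers as $referer) {
    $group = referer_group($referer);
    $counts[$group] = ($counts[$group] ?? 0) + 1;
}

print_r($counts);  // $counts is ['google' => 3, 'bing' => 1, 'Unknown' => 2]
```

Because the label is simply the first host segment after an optional "www.", a subdomain such as mail.google.com would land in a "mail" group with this sketch. That is one of the fringe cases the pattern does not cover.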