To build a clean canonical URL that always returns a single base URL, I'm stuck on the following case:
<?php
# every page
$extensions = $_SERVER['REQUEST_URI']; # path plus query string, e.g. /de/home.ext?ln=de
$qsIndex = strpos($extensions, '?'); # position of the query string (the ?ln=de part), if any
$pageclean = $qsIndex !== FALSE ? substr($extensions, 0, $qsIndex) : $extensions;
$canonical = "http://website.com" . $pageclean; # basic canonical url
?>
<html><head><link rel="canonical" href="<?=$canonical?>"></head>
When the URL is: http://website.com/de/home.ext?ln=de
the canonical is: http://website.com/de/home.ext
BUT I want to remove the file extension as well, whether it's .php, .ext, .inc or any other two- or three-character extension (.[xx] or .[xxx]),
so the base URL becomes: http://website.com/de/home
Aaah, much nicer! But how do I achieve that in the current code? Any hints are much appreciated!
I think this should do it: just strip off the end if there is an extension, the same way you did for the query string:
$pageclean = $qsIndex !== FALSE ? substr($extensions, 0, $qsIndex) : $extensions;
$dotIndex = strrpos($pageclean, '.');
$pagecleanNoExt = $dotIndex !== FALSE ? substr($pageclean, 0, $dotIndex) : $pageclean;
$canonical = "http://website.com" . $pagecleanNoExt; # basic canonical url
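One caveat with the snippet above: strrpos() finds the last dot anywhere in the path, so a dot inside a directory name (say /v1.2/home) would be cut off too, and extensions of any length are removed. If only two- or three-character extensions at the very end of the path should go, as the question asks, a regex is one option. The following is a minimal sketch, not a definitive implementation; the function name cleanCanonicalPath and the exact pattern are assumptions made for illustration.

```php
<?php
// Hypothetical helper (name chosen for illustration only).
// Takes a request URI such as "/de/home.ext?ln=de" and returns the path
// without the query string and without a trailing 2-3 character extension.
function cleanCanonicalPath(string $requestUri): string
{
    // Drop the query string, if there is one.
    $qsIndex = strpos($requestUri, '?');
    $path = $qsIndex !== false ? substr($requestUri, 0, $qsIndex) : $requestUri;

    // Remove ".xx" or ".xxx" only when it sits at the very end of the path.
    // Assumption: extensions consist of ASCII letters and digits.
    return preg_replace('/\.[A-Za-z0-9]{2,3}$/', '', $path);
}

// Same base URL as in the question.
$canonical = "http://website.com" . cleanCanonicalPath('/de/home.ext?ln=de');
```

With this, cleanCanonicalPath('/de/home.ext?ln=de') gives /de/home, while /v1.2/home and /de/home are returned unchanged. In the page itself you would pass $_SERVER['REQUEST_URI'] instead of the hard-coded string.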