
Remove language-specifier from URL path string


I need help with a regex for the following.

I have this code (updated):

<?php
$sitename = "http://" .$_SERVER["SERVER_NAME"];
$sitename = mysql_real_escape_string($sitename);
$language = "da";
$language = mysql_real_escape_string($language);
$pagename = $_SERVER["PHP_SELF"];
$pagename = mysql_real_escape_string($pagename);
$language1 = preg_replace("/$language/", "$1", "$pagename");
?>

I need a regex to strip the language from the URL (sitename). This part now works.

How do I escape special characters?

The result from the above example leaves me with //index.asp instead of /index.asp, because the pattern removes only da and leaves both surrounding slashes in place.

Basically, what I want to do is strip a constant (/da) from a URL.

The URL will look like http://www.example.com/da/ or http://www.example.com/da/folder/folder/folder/page.asp

I only need to take the da out of the URL.

How do I do this in PHP?

OK, I seem to have figured it out:

<?php
$sitename = "http://" .$_SERVER["SERVER_NAME"];
$sitename = mysql_real_escape_string($sitename);
$language = "\/da";
$pagename = $_SERVER["PHP_SELF"];
$pagename = mysql_real_escape_string($pagename);
$language1 = preg_replace("/$language/", "$1", "$pagename");
?>

I only needed to remove this line:

 $language = mysql_real_escape_string($language);
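On the question of escaping special characters: mysql_real_escape_string() escapes for SQL, not for regular expressions, so it is not the right tool here. PHP's preg_quote() is the function meant for this. Below is a minimal sketch, assuming the language code always appears as the first path segment; the helper name stripLanguagePrefix is hypothetical and not part of the original code.

```php
<?php
// Hypothetical helper: remove a leading "/<language>" segment from a URL path.
function stripLanguagePrefix(string $path, string $language): string
{
    // preg_quote() escapes regex metacharacters; passing '#' also escapes the delimiter.
    $quoted = preg_quote($language, '#');

    // Match "/da" only at the start of the path, and only when it is followed
    // by "/" or the end of the string, so "/data/page.asp" is left untouched.
    return preg_replace('#^/' . $quoted . '(?=/|$)#', '', $path);
}

echo stripLanguagePrefix('/da/index.asp', 'da'), "\n";   // /index.asp
echo stripLanguagePrefix('/data/index.asp', 'da'), "\n"; // /data/index.asp
```

Using # as the delimiter avoids having to write \/ for every slash. Note that a bare "/da" becomes an empty string with this sketch, so a caller may want to fall back to "/" in that case.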

Solution

  • Replace:

    $re = '/(\w+:\/\/[\w][\w.]+\/)(\w+)\//ui';
    or this: $re = '/^(.*\/)(\w{2})\//ui';
    
    $text = 'http://www.domain.com/da/';
    
    echo preg_replace($re, '$1ru/', $text);
    
    --> http://www.domain.com/ru/
    

  • Search:

    $re = '/(?<domain>\w+:\/\/[\w][\w.]+\/)(?<lang>\w+)\//ui';
    or this: $re = '/^(?:.*)\/(?<lang>\w{2})\//ui';
    
    $text = 'http://www.domain.com/da/';
    
    preg_match($re, $text, $aMatches);
    print_r($aMatches);
    
    --> (with the first pattern)
    Array
    (
        [0] => http://www.domain.com/da/
        [domain] => http://www.domain.com/
        [1] => http://www.domain.com/
        [lang] => da
        [2] => da
    )
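
Since the question asks to remove the language rather than swap it, the same idea can be turned into a removal. This is only a sketch under a few assumptions: the language code is exactly two letters, it sits directly after the host, and the host contains only word characters and dots. The function name removeLanguageFromUrl is hypothetical.

```php
<?php
// Hypothetical helper: drop the language segment that follows the host in a full URL.
function removeLanguageFromUrl(string $url): string
{
    // Same shape as the pattern above, written with '#' delimiters so the
    // slashes need no escaping. Assumption: the language code is two letters.
    $re = '#^(?<domain>\w+://[\w][\w.]+/)(?<lang>[a-z]{2})(?:/|$)#i';

    // Keep only the scheme-and-host part; the language segment and its
    // trailing slash are discarded. Non-matching URLs are returned unchanged.
    return preg_replace($re, '$1', $url);
}

echo removeLanguageFromUrl('http://www.example.com/da/'), "\n";
echo removeLanguageFromUrl('http://www.example.com/da/folder/folder/folder/page.asp'), "\n";
```

Requiring "/" or the end of the string after the two letters keeps a path such as /folder/page.asp from being mistaken for a language code.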