php arrays url trim sanitize

Sanitize or trim URL and return array in PHP


I want to parse a URL. This is the code I got from a tutorial, so I am assuming it is correct:

public function parseUrl()
{
    if(isset($_GET['url']))
    {
        return $url = explode('/', filter_var(rtrim($_GET['url'], '/'), FILTER_SANITIZE_URL));
    }
}

I am using explode to return an array of every section of the URL string, like this:

home/index/page

into

Array ( [0] => home [1] => index [2] => page )

I have read about rtrim, FILTER_SANITIZE_URL and explode, but I don't know why the code above includes all three of them. Wouldn't sanitizing the string automatically trim the '/' character? If so, it would not be necessary to trim the '/', only the whitespace, right?

In addition, if the explode delimiter is also '/', how does it still identify where the array "starts and ends" if '/' is supposedly deleted by both trim and sanitize_url?

Is it fine if I simply omit the rtrim charlist '/'? It still works as far as I have tried.

I'm confused and might be mixing up concepts, so a small explanation would be appreciated.


Solution

  • ...if the explode delimiter is also '/', how does it still identify where the array "starts and ends" if '/' is supposedly deleted by both trim and sanitize_url?

    explode can still use '/' as a delimiter because FILTER_SANITIZE_URL does not remove '/', and rtrim only removes slashes at the very end of your URL. FILTER_SANITIZE_URL has a fixed list of characters it keeps: ASCII letters, digits, and a set of punctuation that is legal in a URL. Everything else, such as spaces and non-ASCII characters, is stripped. The forward slash belongs in a URL, where it separates path segments, so it is not removed here.

    rtrim does remove the forward slash, as specified, but trimming is different from removing every matching character. rtrim is strictly a right trim: it starts at the rightmost character of the input string and keeps removing characters for as long as they match the character list, stopping at the first one that does not. As such, explode still works as expected, since all internal '/' characters are left intact after this filtering.
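
To make the three steps concrete, here is a minimal sketch of the same pipeline. The function name splitPath and the sample strings are made up for illustration, and the function takes the raw string as an argument instead of reading $_GET['url'] so that it can be run on its own. The last line also covers the question about omitting the '/' character list: in that case rtrim only strips trailing whitespace, so a trailing slash survives and explode returns an extra empty element at the end.

```php
<?php
// Hypothetical helper that mirrors the question's parseUrl(), but takes the
// raw string as a parameter rather than reading $_GET['url'].
function splitPath(string $raw): array
{
    // 1. Remove trailing slashes only; slashes inside the string are kept.
    $trimmed = rtrim($raw, '/');

    // 2. Drop characters that are not allowed in a URL (spaces, for example).
    //    '/' is an allowed character, so it passes through untouched.
    $clean = filter_var($trimmed, FILTER_SANITIZE_URL);

    // 3. Split on the slashes that are still there.
    return explode('/', $clean);
}

echo json_encode(splitPath('home/index/page/')), "\n";  // ["home","index","page"]
echo json_encode(splitPath('home/index/page//')), "\n"; // ["home","index","page"]
echo json_encode(splitPath('ho me/index')), "\n";       // ["home","index"]

// Without the '/' character list, rtrim strips trailing whitespace only,
// so the final slash stays and explode yields a trailing empty string.
echo json_encode(explode('/', rtrim('home/index/page/'))), "\n"; // ["home","index","page",""]
```

So omitting the character list appears to work whenever the incoming URL has no trailing slash, but it is not equivalent to the tutorial's version when one is present.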