How can I test if a string is URL encoded?
Which of the following approaches is better?
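function is_urlEncoded($string){
    // Decode repeatedly until the value stops changing.
    $test_string = $string;
    while(urldecode($test_string) !== $test_string){
        $test_string = urldecode($test_string);
    }
    // Treat the input as encoded if re-encoding the decoded value reproduces it.
    return urlencode($test_string) === $string;
}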
$t = "Hello World > how are you?";
if(is_urlEncoded($t)){
    print "Was Encoded.\n";
}else{
    print "Not Encoded.\n";
    print "Should be ".urlencode($t)."\n";
}
The above code works, but not in instances where the string has been doubly encoded, as in these examples:
$t = "Hello%2BWorld%2B%253E%2Bhow%2Bare%2Byou%253F";
$t = "Hello+World%2B%253E%2Bhow%2Bare%2Byou%253F";
You'll never know for sure whether a string is URL-encoded or whether it was simply meant to contain the sequence %2B. It probably depends on where the string came from, i.e. whether it was hand-crafted or produced by some application.
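To make the ambiguity concrete, here is a small sketch. The value "a%2Bb" is just an illustrative input, not something from the question; it shows that the same bytes fit two different histories, and nothing in the string itself tells you which one is right.

```php
<?php
// Illustrative input (an assumption): the same bytes fit two histories.
$received = "a%2Bb";

// Reading 1: the sender encoded the text "a+b" once.
$asEncoded = urldecode($received);        // "a+b"

// Reading 2: the sender meant the literal text "a%2Bb" and has not
// encoded it yet, so encoding it now escapes the "%" itself.
$literalEncoded = urlencode($received);   // "a%252Bb"

print $asEncoded . "\n";
print $literalEncoded . "\n";
```

Both readings are valid, which is why any test can only be a heuristic unless you know where the string came from.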
Is it better to search the string for characters that should have been encoded but aren't, and treat the string as not encoded if any exist?
I think this is a better approach, since it would take care of things that have been done programmatically (assuming the application would not have left a non-encoded character behind).
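A minimal sketch of that idea, assuming the string was produced by PHP's urlencode(), which leaves only letters, digits, "-", "_" and "." untouched, turns spaces into "+", and escapes everything else as "%XX". The function name has_unencoded_chars is made up for this example.

```php
<?php
// Hypothetical helper: true when $s contains a character that
// urlencode() would never have left as-is.
function has_unencoded_chars(string $s): bool
{
    // Allowed in urlencode() output: letters, digits, "-", "_", ".",
    // "+" (an encoded space) and "%" (the start of an escape).
    return preg_match('/[^A-Za-z0-9\-_.+%]/', $s) === 1;
}

var_dump(has_unencoded_chars("Hello World > how are you?"));     // bool(true)
var_dump(has_unencoded_chars("Hello+World+%3E+how+are+you%3F")); // bool(false)
```

Note that a result of false only means nothing obviously unencoded was found. A plain word like "Hello" passes too, and so does a doubly encoded string.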
One thing that will be confusing here: technically, the % "should be" encoded if it will be present in the final value, since it is a special character. You might have to combine your approaches: look for should-be-encoded characters, and if none are found, also validate that the string decodes successfully.
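One way the combined check might look, as a rough sketch rather than a definitive answer. It assumes urlencode()-style output, and the name looks_url_encoded is hypothetical. Since urldecode() does not report malformed escapes, "decodes successfully" is approximated here by requiring every "%" to be followed by two hex digits.

```php
<?php
// Hypothetical helper combining both checks. A heuristic, not a proof.
function looks_url_encoded(string $s): bool
{
    // 1. Any character urlencode() would have escaped means "not encoded".
    if (preg_match('/[^A-Za-z0-9\-_.+%]/', $s) === 1) {
        return false;
    }
    // 2. A "%" that does not start a two-digit hex escape is a literal
    //    percent sign that should itself have been encoded.
    if (preg_match('/%(?![0-9A-Fa-f]{2})/', $s) === 1) {
        return false;
    }
    return true;
}

var_dump(looks_url_encoded("Hello+World+%3E+how+are+you%3F")); // bool(true)
var_dump(looks_url_encoded("Hello World > how are you?"));     // bool(false)
var_dump(looks_url_encoded("100%"));                           // bool(false)
```

This still cannot tell a singly encoded string from a doubly encoded one, and it returns true for plain alphanumeric text, so it only answers "could this be urlencode() output?".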