The URLs on my site will contain international characters such as ş, ğ, ı...
After reading many posts and blogs about URL validation, I decided to supplement filter_var($url, FILTER_VALIDATE_URL) with some additional code. I then settled on an idea from the question "PHP validation/regex for URL", where bažmegakapa suggests using
if (preg_match("#^https?://.+#", $link) and @fopen($link,"r")) echo "OK";
to check whether the link can be opened; if it can, the URL is treated as valid.
My question:
I like this idea and it seems brilliant to me. However, that answer has only +7 votes while the page has many more than 7 answers, so I would like to know what experienced PHP developers think of it, for the benefit of rookies like me.
Are there any weaknesses in bažmegakapa's code? For example, can there be a URL that fopen cannot open even though it is harmless and should be accepted as valid? If so, how would you fix the weaknesses you find?
thank you
best regards
The fact that filter_var($url, FILTER_VALIDATE_URL) considers javascript://test%0Aalert(321) valid is not a weakness. If you think it is, your expectations about what filter_var is for are wrong.
filter_var($url, FILTER_VALIDATE_URL) validates the syntax of a URL against RFC 2396.
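To see what "validates the syntax" means in practice, here is a minimal sketch. The helper name has_valid_url_syntax and the second sample string are made up for illustration; the javascript: URL is the one discussed above.

```php
<?php
// Hypothetical helper: true when FILTER_VALIDATE_URL accepts the string.
// filter_var() returns the URL on success and false on failure, and it
// never contacts the network.
function has_valid_url_syntax(string $url): bool
{
    return filter_var($url, FILTER_VALIDATE_URL) !== false;
}

$samples = [
    'javascript://test%0Aalert(321)', // has a scheme and a host part
    'not a url',                      // no scheme at all
];

foreach ($samples as $url) {
    echo $url, ' => ', has_valid_url_syntax($url) ? 'valid syntax' : 'invalid', PHP_EOL;
}
```

The first sample is accepted purely on syntactic grounds; whether it is safe or useful is a separate question.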
It is not meant to determine whether the resource the URL points to is accessible.
It is not meant to determine whether it is safe to use a user-supplied URL as the value of the href attribute of an a element in an HTML document.
It is not meant to consider the scheme, which may place restrictions on URLs beyond those described in RFC 2396. For example, while ftp://foo:bar@baz is a valid FTP URL according to RFC 1738, section 3.2 (FTP), http://foo:bar@baz is not a valid HTTP URL according to RFC 2616, section 3.2.2 (http URL), even though some browsers can interpret such "URLs".
filter_var does not bake cakes, nor does it brew coffee. If you require cake or coffee, use something else (RFC 2324 is a good start).
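If the scheme matters for your use case (for example, when the URL ends up in an href attribute), one option is to layer a scheme allowlist on top of the syntax check. This is only a sketch under the assumption that you want to link to http and https resources; is_linkable_url is a hypothetical helper, not a PHP built-in.

```php
<?php
// Hypothetical helper: accept a URL only if it is syntactically valid
// AND uses one of the schemes the application is willing to link to.
function is_linkable_url(string $url, array $allowedSchemes = ['http', 'https']): bool
{
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false; // fails the syntax check
    }

    $scheme = parse_url($url, PHP_URL_SCHEME);

    // Schemes are case-insensitive, so normalise before comparing.
    return is_string($scheme)
        && in_array(strtolower($scheme), $allowedSchemes, true);
}

var_dump(is_linkable_url('http://example.com/page'));
var_dump(is_linkable_url('javascript://test%0Aalert(321)'));
```

Which schemes belong in the allowlist is an application decision, so it is passed in as a parameter rather than hard-coded.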
Depending on the circumstances, displaying a URL which points to a resource that your server cannot access might be a good idea or a bad idea. Depending on the circumstances, displaying a URL that does not point to an HTTP or HTTPS resource might be a good idea or a bad idea. One size does not fit all.
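One way to keep the reachability check a per-use-case decision is to make it an explicit option. The sketch below is an illustration only: the name passes_reachability_policy and its parameters are invented for this example, and the default probe reuses the fopen idea from the question, so it assumes allow_url_fopen is enabled.

```php
<?php
// Hypothetical helper: the caller decides whether reachability matters.
function passes_reachability_policy(string $url, bool $mustBeReachable, ?callable $probe = null): bool
{
    // Syntax check first; this never touches the network.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    if (!$mustBeReachable) {
        return true; // e.g. links your server cannot see but visitors can
    }

    // Default probe: try to open the URL for reading (needs allow_url_fopen).
    $probe = $probe ?? static function (string $u): bool {
        $handle = @fopen($u, 'r');
        if ($handle === false) {
            return false;
        }
        fclose($handle);
        return true;
    };

    return (bool) $probe($url);
}
```

Keep in mind that a failed probe only tells you that your server could not open the URL at that moment; the same URL may still work for the visitor.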