ruby-on-rails regex url-validation

(Rails) Validating a URL with a regexp


I'm using the following to check whether a URL is well-formed:

validates_format_of :website, :with => URI::regexp(%w(http https))

However, it rejects URLs that don't start with http:// or https://. Is there a similar way to validate URLs with URI::regexp (or URI) that also accepts valid URLs without the http:// prefix? (For example, www.google.com should be valid, as should http://www.google.com.)


Solution

  • This post on Daring Fireball provides a robust regex:

    \b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))
    

    A more recent post improves on it (N.B. newlines and indentation added here for clarity; see the post for an even more expanded version with explanations):

    (?i)\b((?:[a-z][\w-]+:(?:/{1,3}|[a-z0-9%])|
               www\d{0,3}[.]|
               [a-z0-9.\-]+[.][a-z]{2,4}/)
           (?:[^\s()<>]+|
              \(([^\s()<>]+|
              (\([^\s()<>]+\)))*\))+
           (?:\(([^\s()<>]+|
              (\([^\s()<>]+\)))*\)|
              [^\s`!()\[\]{};:'".,<>?«»“”‘’]))
    

    From my tests, URI::regexp is too loose in its definition of a URI (though it does require http…).

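    You can use a virtual attribute or a before_validation callback to prepend http:// to your URLs if necessary. (A before_save filter runs only after validation has passed, so on its own it would be too late for the format check.)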
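
The "add the scheme, then validate" idea can be sketched in plain Ruby. This is only a rough illustration, not a definitive implementation: the method names normalize_url, http_url_regexp and valid_http_url? are hypothetical, and anchoring the pattern with \A and \z is my own assumption about how to tighten the check. It builds the pattern with URI::RFC2396_Parser#make_regexp, which produces the same kind of pattern as URI::regexp without the obsolescence warning that newer Rubies print.

```ruby
require "uri"

# Hypothetical helper: prepend http:// when the value carries no scheme.
# Values that already start with "something://" are returned unchanged.
def normalize_url(raw)
  url = raw.to_s.strip
  return url if url.empty? || url.match?(%r{\A[a-z][a-z0-9+.-]*://}i)

  "http://#{url}"
end

# Hypothetical helper: the http/https pattern, anchored so that the
# whole string has to match (an assumption, not part of the question).
def http_url_regexp
  /\A#{URI::RFC2396_Parser.new.make_regexp(%w[http https])}\z/
end

# True when the value, after normalization, is an http or https URL.
def valid_http_url?(raw)
  http_url_regexp.match?(normalize_url(raw))
end

# In a Rails model this might be wired up roughly as follows
# (untested sketch; the attribute name comes from the question):
#   before_validation { self.website = normalize_url(website) }
#   validates_format_of :website, :with => http_url_regexp
```

With this in place, valid_http_url?("www.google.com") and valid_http_url?("http://www.google.com") both pass, while a value with another scheme such as ftp://example.com is rejected. Normalizing before validation also means the stored value always carries a scheme.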