ruby parsing uri ruby-1.9.2 curly-braces

Parsing URIs that have curly braces, URI::InvalidURIError: bad URI(is not URI?)


Using Ruby 1.9.2-p290, I ran into a problem when trying to parse a URI like the following:

require 'uri'
my_uri = "http://www.anyserver.com/getdata?anyparameter={330C-B5A2}"
the_uri = URI.parse(my_uri)

This raises the following error:

URI::InvalidURIError: bad URI(is not URI?)

I want a solution other than encoding the curly braces every time, like this:

new_uri = URI.encode("http://www.anyserver.com/getdata?anyparameter={330C-B5A2}")
=> "http://www.anyserver.com/getdata?anyparameter=%7B330C-B5A2%7D"

Now I can parse new_uri as usual, but I have to do this every time I need it. What is the simplest way to achieve this without doing it every time?
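For comparison, the encoding approach can at least be confined to one place by wrapping it in a small helper. This is only a sketch: parse_with_braces is a hypothetical name, not part of the uri library, and it assumes that '{' and '}' are the only offending characters. It escapes them by hand rather than calling URI.encode, which was removed in Ruby 3.0.

```ruby
require 'uri'

# Hypothetical helper: percent-encodes only '{' and '}' and then parses.
# Assumes no other characters in the string need escaping.
def parse_with_braces(str)
  escaped = str.gsub('{', '%7B').gsub('}', '%7D')
  URI.parse(escaped)
end

uri = parse_with_braces("http://www.anyserver.com/getdata?anyparameter={330C-B5A2}")
puts uri.query
```

Note that the resulting URI object carries the encoded form of the query, so the braces are not preserved literally.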

I am posting my own solution because I hadn't seen one that solves it quite this way.


# Accepts URIs that contain curly braces.
# This overrides URI.parse to use a Parser whose UNRESERVED set also includes '{' and '}'
module URI
  def self.parse(uri)
    URI::Parser.new(:UNRESERVED => URI::REGEXP::PATTERN::UNRESERVED + "\{\}").parse(uri)
  end
end

Now I can call URI.parse(uri) with a uri containing curly braces and no error is raised.


Solution

    # Must not fail when the URI contains curly braces
    # This overrides the DEFAULT_PARSER with the UNRESERVED key, including '{' and '}'
    # DEFAULT_PARSER is used everywhere, so it's better to override it once
    module URI
      remove_const :DEFAULT_PARSER
      unreserved = REGEXP::PATTERN::UNRESERVED
      DEFAULT_PARSER = Parser.new(:UNRESERVED => unreserved + "\{\}")
    end
    

    Following up on the same issue: since DEFAULT_PARSER is used everywhere, it's better to replace it completely instead of overriding only the URI.parse method. Additionally, this avoids allocating a new Parser object on every call.
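The same "build the parser once" idea can also be sketched without reopening the URI module. The example below is an assumption-laden variant, not the answer's method: LENIENT_PARSER is a hypothetical constant name, and it refers to the parser class by the name URI::RFC2396_Parser (with its patterns under URI::RFC2396_REGEXP), which is what later Ruby versions call the class used above, so it will not load on 1.9.2 as written.

```ruby
require 'uri'

# Hypothetical constant: a single parser instance, created once and reused.
# Assumes URI::RFC2396_Parser and URI::RFC2396_REGEXP are available.
LENIENT_PARSER = URI::RFC2396_Parser.new(
  :UNRESERVED => URI::RFC2396_REGEXP::PATTERN::UNRESERVED + "{}"
)

uri = LENIENT_PARSER.parse("http://www.anyserver.com/getdata?anyparameter={330C-B5A2}")
puts uri.host
puts uri.query
```

Callers then use LENIENT_PARSER.parse explicitly, so the behaviour of URI.parse stays unchanged for the rest of the program; the trade-off is that code which calls URI.parse directly does not benefit.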