Ruby 3.1.2, Kubuntu 22.04
The method to extract the base domain name (excluding sub-domain) from a URI:
def host_name(a)
  URI(a).host.sub(/\Awww\./, '')
end
Use:
uri_array = ['https://www.example.org', 'http://www.example.net/posts?a=1', 'www.example.com']
uri_array.map!(&method(:host_name))
Expected output:
['example.org', 'example.net', 'example.com']
Instead, it raises:
(irb):14:in `host_name': undefined method `sub' for nil:NilClass (NoMethodError)
and leaves the array partially modified:
[
[0] "example.org",
[1] "example.net",
[2] "www.example.com"
]
Why does it fail on the third element of the array?
The problem here is not the use of &method, but the string being passed to URI. When the string to be parsed doesn't contain a scheme (such as http or https), URI cannot extract the host; it treats the whole string as a path and returns nil for host:
irb(main):001> require 'uri'
=> true
irb(main):002> uri = URI('www.example.com')
=> #<URI::Generic www.example.com>
irb(main):003> uri.host
=> nil
irb(main):004> uri.path
=> "www.example.com"
irb(main):005> uri.host.sub(/\Awww\./, '')
(irb):5:in `<main>': undefined method `sub' for nil (NoMethodError)
uri.host.sub(/\Awww\./, '')
^^^^
from <internal:kernel>:187:in `loop'
from /Users/jason/.gem/ruby/3.3.0/gems/irb-1.12.0/exe/irb:9:in `<top (required)>'
from /Users/jason/.gem/ruby/3.3.0/bin/irb:25:in `load'
from /Users/jason/.gem/ruby/3.3.0/bin/irb:25:in `<main>'
irb(main):006>
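If adding a gem is not an option, one possible workaround with only the standard library is to prepend a scheme when the input lacks one. The sketch below assumes that a scheme-less string is meant as an http URL; that default, the regular expression used to detect a scheme, and the variable name hosts are illustrative choices, not part of the original question.

```ruby
require 'uri'

def host_name(str)
  # Assumption: input without a leading "scheme://" is treated as an http URL.
  str = "http://#{str}" unless str.match?(%r{\A[a-z][a-z0-9+.\-]*://}i)
  # host can still be nil for unusual input, so use safe navigation.
  URI(str).host&.sub(/\Awww\./, '')
end

uri_array = ['https://www.example.org', 'http://www.example.net/posts?a=1', 'www.example.com']
hosts = uri_array.map(&method(:host_name))
```

Using map rather than map! also avoids the half-modified array seen in the question: map! replaces elements in place as it goes, so an exception on the third element leaves the first two already changed.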
However, you can use the addressable gem, and particularly Addressable::URI.heuristic_parse, for this situation instead.
According to its documentation, heuristic_parse converts an input to a URI. The input does not have to be a valid URI: the method uses heuristics to guess what URI was intended. This is not standards-compliant, merely user-friendly.
irb(main):001> require 'addressable'
=> true
irb(main):002> uri = Addressable::URI.heuristic_parse('www.example.com')
=> #<Addressable::URI:0x2ee0 URI:http://www.example.com>
irb(main):003> uri.host
=> "www.example.com"
irb(main):004>