
Open-uri throwing error => (URI::InvalidURIError)


I have a program that I'm using for testing purposes. It scrapes the web for open proxies and logs information about them. However, it is a very different type of proxy scraper: before it runs, it writes a batch of randomly generated proxies to a file. For example:

def create_possibles
  puts "Creating random possible proxies..".green.bold
  1.times do 
    port = rand(2000..8080)
    1.times do 
      ip = Array.new(4){rand(256)}.join('.')
      possible_proxy = "#{ip}:#{port}"
      File.open("possible_proxies.txt", "a") {|s| s.puts(possible_proxy)}
    end
  end
end
#<= 189.96.49.87:7990

What I want to do with that "possible proxy" is open a connection through it and see if it works. However, when I use the following code, it just throws the error from the title:

def check_possibles
  IO.read("possible_proxies.txt").each_line do |proxy|
    puts open('http://google.com', :proxy => "http://#{proxy}")
  end
end

I have two questions:

  1. Does that mean the proxy is invalid? If so, is there a way to skip over that line in the file, possibly by using a next or skip?
  2. If it doesn't mean the proxy is invalid, what does it mean? Am I doing something wrong in my code that makes it read the URL incorrectly?

Full error:

C:/Ruby22/lib/ruby/2.2.0/uri/rfc3986_parser.rb:66:in `split': bad URI(is not URI?): http://189.96.49.87:7990 (URI::InvalidURIError)

EDIT:

I was told to try URI.parse and I get the same error:

C:/Ruby22/lib/ruby/2.2.0/uri/rfc3986_parser.rb:66:in `split': bad URI(is not URI?): http://195.239.61.210:4365 (URI::InvalidURIError) #<= Different IP

Solution

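  • When you iterate over a string or file in Ruby using #each_line, each line it yields still includes its trailing newline. Ruby's URI library rejects a string that contains a newline, so it raises URI::InvalidURIError even though the proxy address itself is well formed. Simply replace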

    :proxy => "http://#{proxy}"
    

    with

    :proxy => "http://#{proxy.chomp}"
    

    String#chomp removes a trailing line terminator (\n, \r\n or \r) from the end of the string.
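
On the first question (skipping lines that cannot be parsed), one option is to rescue URI::InvalidURIError and use next. The following is only a minimal sketch, not the asker's actual program: the helper name proxy_uri, the temporary file standing in for possible_proxies.txt, and the sample "not a proxy" line are all assumptions made for illustration. It only checks that each line forms a parseable URI; it does not connect through the proxy.

```ruby
require 'uri'
require 'tempfile'

# Hypothetical helper (not from the original post): builds a URI from a
# "host:port" line, or returns nil when the result is not a valid URI.
def proxy_uri(line)
  URI.parse("http://#{line.chomp}") # chomp drops the trailing newline
rescue URI::InvalidURIError
  nil
end

# Stand-in for possible_proxies.txt so the sketch runs on its own.
file = Tempfile.new('possible_proxies')
file.write("189.96.49.87:7990\nnot a proxy\n195.239.61.210:4365\n")
file.close

usable = []
File.foreach(file.path) do |line|
  uri = proxy_uri(line)
  next if uri.nil? # skip lines that do not form a valid URI
  usable << uri.to_s
end
file.unlink

p usable
```

The same rescue/next pattern could wrap the open call in check_possibles, where connection errors from dead proxies would need their own rescue clauses.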