Tags: ruby, selenium, phantomjs, capybara, poltergeist

Prevent phantomjs from raising Capybara::Poltergeist::StatusFailError when requesting never ending assets


I am having some issues with Capybara::Poltergeist::Driver.

When I visit the following URL with Poltergeist, an asset that seemingly doesn't exist takes forever to load, and eventually an error is raised: https://www.feinstein.senate.gov/public/index.cfm/e-mail-me

$ brew install phantomjs
$ gem install capybara -v 2.17.0
$ gem install poltergeist -v 1.7.0
$ gem install selenium-webdriver -v 2.53.4

Then in irb:

require 'capybara/poltergeist'

module Drivers
  class Poltergeist < Capybara::Poltergeist::Driver
    def needs_server?
      false
    end
  end
end

Capybara.register_driver :poltergeist_errorless do |app|
  options = ['--load-images=no', '--ignore-ssl-errors=yes', '--ssl-protocol=any', '--disk-cache=true', '--max-disk-cache-size=500000']
  Drivers::Poltergeist.new(app, js_errors: false, phantomjs_options: options)
end

session = Capybara::Session.new(:poltergeist_errorless)
session.visit('https://www.feinstein.senate.gov/public/index.cfm/e-mail-me')

After 10-20 seconds, the request fails, and I get back a Capybara::Poltergeist::StatusFailError exception with a message that says:

Request to 'https://www.feinstein.senate.gov/public/index.cfm/e-mail-me' failed to reach server, check DNS and/or server status - Timed out with the following resources still waiting https://sdc1.senate.gov/NEED_VALUE/wtid.js

But if I then call:

session.save_screenshot('/tmp/sc.png', full: true)

the resulting screenshot shows that the rest of the page loaded just fine. Any other browser would simply carry on working without worrying about an asset that is taking forever to load.

Is there any way to configure phantomjs so that it does not wait for this asset and does not raise this exception?


Solution

  • The easiest way to deal with that is to use Poltergeist's URL blacklist to block the URL. See https://github.com/teampoltergeist/poltergeist#customization and https://github.com/teampoltergeist/poltergeist#url-blacklisting--whitelisting

    If your situation is more dynamic, you could rescue the exception, parse the stalled URL out of its message, add it to the blacklist, and then retry the visit.

    Additionally, there is no need to override needs_server?. If you don't pass a second parameter (the app to run) to Session#new (which you aren't doing) then needs_server? is irrelevant.
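
The rescue-and-retry idea could be sketched roughly as follows. This is only an illustration, not part of Poltergeist itself: stalled_resource_urls and visit_blocking_stalled_assets are hypothetical helper names, the parsing assumes the "resources still waiting" wording shown in the question's error message, and it assumes the installed Poltergeist version exposes url_blacklist= on session.driver.browser (and accepts a :url_blacklist driver option for the static case).

```ruby
# Marker text taken from the error message quoted in the question.
WAITING_MARKER = 'resources still waiting'

# Hypothetical helper: return the URLs listed after the marker, or [] if
# the message does not mention any stalled resources.
def stalled_resource_urls(message)
  _before, marker, after = message.partition(WAITING_MARKER)
  return [] if marker.empty?

  # Assumes the stalled URLs are separated by commas and/or whitespace.
  after.scan(%r{https?://[^\s,]+}).uniq
end

# Hypothetical helper: visit `url`; when the visit fails because of stalled
# resources, blacklist them and retry. Re-raises when nothing new can be
# blocked or the retry budget is used up.
def visit_blocking_stalled_assets(session, url, error_class:, max_retries: 2)
  blocked = []
  begin
    session.visit(url)
  rescue error_class => e
    stalled = stalled_resource_urls(e.message) - blocked
    raise if stalled.empty? || max_retries <= 0

    max_retries -= 1
    blocked.concat(stalled)
    # Assumption: the Poltergeist browser object accepts a blacklist array.
    session.driver.browser.url_blacklist = blocked
    retry
  end
  blocked
end

# Possible usage with the session from the question:
#   visit_blocking_stalled_assets(
#     session,
#     'https://www.feinstein.senate.gov/public/index.cfm/e-mail-me',
#     error_class: Capybara::Poltergeist::StatusFailError
#   )
#
# Static alternative when the offending URL is known up front (assumes the
# driver accepts the :url_blacklist option):
#   Capybara::Poltergeist::Driver.new(app, js_errors: false,
#     url_blacklist: ['https://sdc1.senate.gov'])
```

The error class is passed in rather than hard-coded so the helpers can be exercised without PhantomJS installed. The return value is the list of URLs that ended up blocked, which may be worth logging so a static blacklist can be set up later.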