Search code examples
rubyajaxhttpsphantomjsvcr

VCRProxy: Record PhantomJS ajax calls with VCR inside Capybara


I already did some research in this field, but didn't find any solution. I have a site, where asynchron ajax calls are made to facebook (using JSONP). I'm recording all my HTTP requests on the Ruby side with VCR, so I thought it would be cool, to use this feature for AJAX calls as well.

So I played a little bit around, and came up with a proxy attempt. I'm using PhantomJS as a headless browser and poltergeist for the integration inside Capybara. Poltergeist is now configured to use a proxy like this:

  Capybara.register_driver :poltergeist_vcr do |app|
    options = {
      :phantomjs_options => [
        "--proxy=127.0.0.1:9100",
        "--proxy-type=http",
        "--ignore-ssl-errors=yes",
        "--web-security=no"
      ],
      :inspector => true
    }
    Capybara::Poltergeist::Driver.new(app, options)
  end
  Capybara.javascript_driver = :poltergeist_vcr

For testing purposes, I wrote a proxy server based on WEbrick, that integrates VCR:

require 'io/wait'
require 'webrick'
require 'webrick/httpproxy'

require 'rubygems'
require 'vcr'

module WEBrick
  class VCRProxyServer < HTTPProxyServer
    def service(*args)
      VCR.use_cassette('proxied') { super(*args) }
    end
  end
end

VCR.configure do |c|
  c.stub_with :webmock
  c.cassette_library_dir = '.'
  c.default_cassette_options = { :record => :new_episodes }
  c.ignore_localhost = true
end

IP   = '127.0.0.1'
PORT = 9100

reader, writer = IO.pipe

@pid = fork do
  reader.close
  $stderr = writer
  server = WEBrick::VCRProxyServer.new(:BindAddress => IP, :Port => PORT)
  trap('INT') { server.shutdown }
  server.start
end

raise 'VCR Proxy did not start in 10 seconds' unless reader.wait(10)

This works well with every localhost call, and they get well recorded. The HTML, JS and CSS files are recorded by VCR. Then I enabled the c.ignore_localhost = true option, cause it's useless (in my opinion) to record localhost calls.

Then I tried again, but I had to figure out, that the AJAX calls that are made on the page aren't recorded. Even worse, they doesn't work inside the tests anymore.

So to come to the point, my question is: Why are all calls to JS files on the localhost recorded, and JSONP calls to external ressources not? It can't be the jsonP thing, cause it's a "normal" ajax request. Or is there a bug inside phantomjs, that AJAX calls aren't proxied? If so, how could we fix that?

If it's running, I want to integrate the start and stop procedure inside

------- UPDATE -------

I did some research and came to the following point: the proxy has some problems with HTTPS calls and binary data through HTTPS calls.

I started the server, and made some curl calls:

curl --proxy 127.0.0.1:9100 http://d3jgo56a5b0my0.cloudfront.net/images/v7/application/stories_view/icons/bug.png

This call gets recorded as it should. The request and response output from the proxy is

GET http://d3jgo56a5b0my0.cloudfront.net/images/v7/application/stories_view/icons/bug.png HTTP/1.1
User-Agent: curl/7.24.0 (x86_64-apple-darwin12.0) libcurl/7.24.0 OpenSSL/0.9.8r zlib/1.2.5
Host: d3jgo56a5b0my0.cloudfront.net
Accept: */*
Proxy-Connection: Keep-Alive

HTTP/1.1 200 OK 
Server: WEBrick/1.3.1 (Ruby/1.9.3/2012-10-12)
Date: Tue, 20 Nov 2012 10:13:10 GMT
Content-Length: 0
Connection: Keep-Alive

But this call doesn't gets recorded, there must be some problem with HTTPS:

curl --proxy 127.0.0.1:9100 https://d3jgo56a5b0my0.cloudfront.net/images/v7/application/stories_view/icons/bug.png

The header output is:

CONNECT d3jgo56a5b0my0.cloudfront.net:443 HTTP/1.1
Host: d3jgo56a5b0my0.cloudfront.net:443
User-Agent: curl/7.24.0 (x86_64-apple-darwin12.0) libcurl/7.24.0 OpenSSL/0.9.8r zlib/1.2.5
Proxy-Connection: Keep-Alive

HTTP/1.1 200 OK 
Server: WEBrick/1.3.1 (Ruby/1.9.3/2012-10-12)
Date: Tue, 20 Nov 2012 10:15:48 GMT
Content-Length: 0
Connection: close

So, I thought maybe the proxy can't handle HTTPS, but it can (as long as I'm getting the output on the console after the cURL call). Then I thought, maybe VCR can't mock HTTPS requests. But using this script, VCR mocks out HTTPS requests, when I don't use it inside the proxy:

require 'vcr'

VCR.configure do |c|
  c.hook_into :webmock
  c.cassette_library_dir = 'cassettes'
end

uri = URI("https://d3jgo56a5b0my0.cloudfront.net/images/v7/application/stories_view/icons/bug.png")

VCR.use_cassette('https', :record => :new_episodes) do
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = true
  http.verify_mode = OpenSSL::SSL::VERIFY_NONE
  response = http.request_get(uri.path)
  puts response.body
end

So what is the problem? VCR handles HTTPS and the proxy handles HTTPS. Why they don't play together?


Solution

  • So I did some research and now I have a very basic example of a working VCR proxy server, that handles HTTPS calls as a MITM proxyserver (if you deactivate the security check in your client). I would be very happy if someone could contribute and help me to bring this thing to life.

    Here is the github repo: https://github.com/23tux/vcr_proxy