Tags: ruby, basic-authentication, eventmachine, http-streaming, gnip

em-http stream with basic auth and gzip hangs


I'm attempting to consume the Gnip PowerTrack API which requires me to connect to an HTTPS stream of JSON with basic auth. I feel like this should be fairly trivial so I'm hoping some rubyist who is smarter than me can point out my obvious mistake.

Here are the relevant parts of my Ruby 1.9.3 code:

require 'eventmachine'
require 'em-http'
require 'json'

usage = "#{$0} <user> <password>"
abort usage unless user = ARGV.shift
abort usage unless password = ARGV.shift
GNIP_STREAMING_URL = 'https://stream.gnip.com:443/foo/bar/prod.json'

http = EM::HttpRequest.new(GNIP_STREAMING_URL)
EventMachine.run do
  s = http.get(:head => { 'Authorization' => [user, password], 'accept' => 'application/json', 'Accept-Encoding' => 'gzip,deflate' }, :keepalive => true, :connect_timeout => 0, :inactivity_timeout => 0)

  buffer = ""
  s.stream do |chunk|
    buffer << chunk
    while line = buffer.slice!(/.+\r?\n/)
      puts JSON.parse(line)
    end
  end
end

The stream connects (my Gnip dashboard reports a connection) but then just buffers and never outputs anything. In fact, it seems like it never enters the s.stream do ... end block. Note that this is a gzip-encoded stream.

Note that this works:

curl --compressed -uusername $GNIP_STREAMING_URL

EDIT: I'm sure this is kinda implicit, but I can't give out any login creds or the actual URL, so don't ask ;)

EDIT #2: yajl-ruby would probably work if I could figure out how to encode credentials for the URL (simple URL encoding doesn't seem to work as I fail authentication with Gnip).
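For reference, here is a minimal sketch of the two usual ways to pass Basic credentials by hand, using only the Ruby standard library. The helper names basic_auth_header and url_with_credentials are hypothetical, and whether a given client (yajl-ruby included) accepts credentials embedded in the URL is an assumption I have not verified.

```ruby
require 'base64'
require 'erb'

# Hypothetical helper: builds the value of an HTTP Basic "Authorization" header.
def basic_auth_header(user, password)
  "Basic " + Base64.strict_encode64("#{user}:#{password}")
end

# Hypothetical helper: embeds percent-encoded credentials in the URL's
# userinfo part. ERB::Util.url_encode escapes characters such as "@" and ":"
# that would otherwise be mistaken for URL delimiters.
def url_with_credentials(url, user, password)
  scheme, rest = url.split('://', 2)
  "#{scheme}://#{ERB::Util.url_encode(user)}:#{ERB::Util.url_encode(password)}@#{rest}"
end
```

A user name that is an email address is a common reason for plain concatenation to fail, because the unescaped "@" ends the userinfo part early.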

EDIT #3: @rweald found that em-http does not support streaming gzip. I've created a GitHub issue here.

EDIT #4: I have forked em-http-request and fixed this; you can point at my fork if you want to use em-http this way. The patch has been merged into the maintainer's repo and will be included in the next release.

EDIT #5: My fixes have been published in em-http-request 1.0.3, so this should no longer be an issue.


Solution

  • The problem lies within em-http-request. If you look at https://github.com/igrigorik/em-http-request/blob/master/lib/em-http/decoders.rb

    you will notice that the GZip decoder cannot do streaming decompression :( https://github.com/igrigorik/em-http-request/blob/master/lib/em-http/decoders.rb#L100

    To read a gzipped stream with em-http-request, you would first need to fix that underlying streaming gzip problem.
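To illustrate what streaming decompression means here, below is a minimal sketch using Ruby's standard Zlib::Inflate, which can be fed compressed chunks as they arrive and returns whatever output is available so far. This is not em-http-request's actual code or patch; the StreamingGunzip class and its feed method are hypothetical names made up for this example.

```ruby
require 'zlib'

# Hypothetical helper: inflates a gzip stream chunk by chunk and yields
# each complete line as soon as it has been decoded.
class StreamingGunzip
  def initialize
    # MAX_WBITS + 32 asks zlib to auto-detect a gzip or zlib header.
    @inflater = Zlib::Inflate.new(Zlib::MAX_WBITS + 32)
    @buffer = String.new
  end

  # Feed one compressed chunk; the block receives every complete line
  # available so far. Partial lines stay buffered until the next chunk.
  def feed(chunk)
    @buffer << @inflater.inflate(chunk)
    while (line = @buffer.slice!(/.*\r?\n/))
      yield line.chomp
    end
  end
end
```

Usage would look roughly like decoder = StreamingGunzip.new followed by decoder.feed(chunk) { |line| puts line } for each chunk received. The key point is that the inflater keeps its state between calls, so it never has to wait for the end of the response, which a never-ending stream does not have.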