Tags: file-upload, curl, nginx, tornado, http-compression

Nginx + Tornado ( + curl): Inflate gzipped POST request


I have set up a server (well... two servers, but I don't think that is too relevant for this question) running Tornado (version 2.4.1) and being proxied by Nginx (version 1.4.4).

I need to periodically upload JSON (basically text) files to one of them through a POST request. These files would greatly benefit from gzip compression (I get compression ratios of 90% when I compress the files manually), but I don't know how to inflate them in a nice way.

Ideally, Nginx would inflate it and pass it clean and neat to Tornado... but that's not what's happening now, as you'll have probably guessed, otherwise I wouldn't be asking this question :-)

These are the relevant parts of my nginx.conf file (or the parts that I think are relevant, because I'm pretty new to Nginx and Tornado):

user  borrajax;
worker_processes  1;

pid    /tmp/nginx.pid;

events {
    worker_connections  1024;
}


http {
    include       mime.types;
    default_type  application/octet-stream;
    access_log  /tmp/access.log  main;
    error_log   /tmp/error.log;

    # Basic Settings

    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;    
    gzip  on;
    gzip_disable "msie6";
    gzip_types        application/json text/plain text/css application/x-javascript text/xml application/xml application/xml+rss text/javascript image/x-icon image/bmp;
    gzip_http_version 1.1;
    gzip_proxied expired no-cache no-store private auth;

    upstream web {
        server 127.0.0.1:8000;
    }

    upstream input {
        server 127.0.0.1:8200;
    }

    server {
        listen       80 default_server;
        server_name  localhost;
        location / {
            proxy_set_header Host $http_host;
            proxy_set_header X-Real-IP $remote_addr;

            proxy_pass http://web;
        }
    }

    server {
        listen 81 default_server;
        server_name input.localhost;

        location / {
            proxy_set_header Host $http_host;
            proxy_set_header X-Real-IP $remote_addr;

            proxy_pass http://input;
        } 
    }
}

As I mentioned before, there are two Tornado servers. The main one is running on localhost:8000 for the web pages and that kind of stuff. The one running on localhost:8200 is the one intended to receive those JSON files. This setup is working fine, except for the gzip part.

I'd like Nginx to inflate the gzipped requests that come to localhost:81 and forward them, inflated, to the Tornado instance I have running on localhost:8200.

With the configuration like this, the data reaches Tornado, but the body is still compressed, and Tornado throws an exception:

[E 140108 15:33:42 input:1085] Uncaught exception POST 
  /input/log?ts=1389213222 (127.0.0.1)
  HTTPRequest(
      protocol='http', host='192.168.0.140:81', 
      method='POST', uri='/input/log?&ts=1389213222', 
      version='HTTP/1.0', remote_ip='127.0.0.1', body='\x1f\x8b\x08\x00\x00', 
      headers={'Content-Length': '1325', 'Accept-Encoding': 'deflate, gzip', 
      'Content-Encoding': 'gzip', 'Host': '192.168.0.140:81', 'Accept': '*/*', 
      'User-Agent': 'curl/7.23.1 libcurl/7.23.1 OpenSSL/1.0.1c zlib/1.2.7', 
      'Connection': 'close', 'X-Real-Ip': '192.168.0.94', 
      'Content-Type': 'application/json'}
   )

I understand I can always get the request's body within the post() Tornado handler and inflate it manually, but that just sounds... dirty.

Finally, this is the curl call I use to upload the gzipped file:

curl --max-time 60 --silent --location --insecure \
    --write-out "%{http_code}" --request POST \
    --compressed \
    --header "Content-Encoding:gzip" \
    --header "Content-Type:application/json" \
    --data-binary @"$log_file_path.gz" \
    "/input/log?ts=1389216192" \
    --output /dev/null \
    --trace-ascii "/tmp/curl_trace.log" \
    --connect-timeout 30

The file in $log_file_path.gz is generated using gzip $log_file_path (that is, it's a regular gzip-compressed file).

Is this something doable? It sounds like something that should be pretty straightforward, but nope...

If this is something not doable through Nginx, an automated method in Tornado would work too (something more reliable and elegant than having me uncompress files in the middle of a POST request's handler). Something like Django middlewares, maybe?

Thank you in advance!!


Solution

  • You're already calling json.loads() somewhere: Tornado doesn't decode JSON for you, so the exception you're seeing (but did not quote) must be coming from your own code. Why not just replace that call with a method that examines the Content-Encoding and Content-Type headers and decodes the body appropriately?
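
A minimal sketch of what such a method could look like, using only the Python standard library. The name decode_json_body and the choice to reject unknown encodings and content types are assumptions made for illustration; they are not part of Tornado or of the original handler.

```python
import json
import zlib


def decode_json_body(body, headers):
    """Parse a JSON request body, inflating it first if the headers say so.

    Hypothetical helper: `body` is the raw request body (bytes) and
    `headers` is any mapping with a dict-style .get() method.
    """
    encoding = headers.get("Content-Encoding", "").strip().lower()
    if encoding == "gzip":
        # 16 + MAX_WBITS makes zlib expect a gzip header and trailer.
        body = zlib.decompress(body, 16 + zlib.MAX_WBITS)
    elif encoding == "deflate":
        # HTTP "deflate" is normally a zlib-wrapped stream.
        body = zlib.decompress(body)
    elif encoding not in ("", "identity"):
        raise ValueError("unsupported Content-Encoding: %s" % encoding)

    # Ignore parameters such as "; charset=utf-8" when checking the type.
    content_type = headers.get("Content-Type", "").split(";")[0].strip().lower()
    if content_type != "application/json":
        raise ValueError("unsupported Content-Type: %s" % content_type)

    return json.loads(body.decode("utf-8"))
```

Inside the post() handler this would be called as decode_json_body(self.request.body, self.request.headers) in place of the existing json.loads() call, so uncompressed uploads keep working and gzipped ones are inflated transparently.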