javascript, node.js, file-upload, amazon-s3, knox-amazon-s3-client

knox S3 upload corrupted or truncated file


This is a brain teaser question where I actually know the answer. I'm throwing a bounty on it because it represents a valuable Node programming safety tip (and that's the first hint).

  • Hint 2: In an HTTP request, what are the units of the "Content-Length" header field?

I'm using knox:

var knox = require('knox');
var s3 = knox.createClient({
    key: ...,
    secret: ...,
    bucket: ...
});

// The bug is below:

var stringVal = JSON.stringify(<2d javascript array from a large spreadsheet>)

var req = s3.put(path + filename, {
    'Content-Length': stringVal.length,
    'Content-Type': 'application/json'
});
req.end(stringVal);

The resulting upload is either truncated or otherwise corrupted. We have stringVal.length === 322889, and the resulting S3 item size matches that. But downloading and reloading the file gives a string of length 322140. No errors show up along the way until I try to JSON.parse the string, which (predictably) throws a syntax error.

What's up?


Solution

  • From the source of the knox module (https://github.com/LearnBoost/knox/blob/master/lib/client.js) you can see that it uses standard HTTP requests.

    req.write and req.end encode strings as 'utf8' by default (http://nodejs.org/api/http.html#http_request_end_data_encoding).

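    So what really happens is that you accidentally cut off the end of the data by putting the string length (a count of characters) in the 'Content-Length' field, which is measured in bytes. Every non-ASCII character takes more than one byte in UTF-8, so the encoded body is longer than the declared length. The server throws away everything past that length, so when you parse the downloaded string you get an error.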

    Quickest fix would be:

    'Content-Length': Buffer.byteLength(stringVal),
    

    Or even quicker: just remove the 'Content-Length' line.
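
To see the mismatch concretely, here is a minimal sketch. The sample data is made up (any string containing non-ASCII characters will do), the helper isValidJson is hypothetical, and the truncation is simulated locally rather than performed by S3:

```javascript
// Hypothetical sample data standing in for the spreadsheet contents.
var stringVal = JSON.stringify([['café', 'naïve'], ['€ 10', '日本語']]);

// .length counts UTF-16 code units, not bytes.
var charCount = stringVal.length;

// Buffer.byteLength counts the bytes the string occupies once encoded as UTF-8.
var byteCount = Buffer.byteLength(stringVal, 'utf8');

// Simulate a receiver that keeps only the first `charCount` bytes of the body,
// which is what a too-small Content-Length asks for.
var truncated = Buffer.from(stringVal, 'utf8')
    .subarray(0, charCount)
    .toString('utf8');

// Small helper (not part of knox or Node) to check whether a string parses.
function isValidJson(s) {
    try {
        JSON.parse(s);
        return true;
    } catch (e) {
        return false;
    }
}

console.log('characters:', charCount, 'bytes:', byteCount);
console.log('original parses:', isValidJson(stringVal));
console.log('truncated parses:', isValidJson(truncated));
```

The byte count is larger than the character count, and the truncated copy no longer parses, which matches the symptom in the question.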