I have a Node.js application using Node REST Client to make an HTTP GET request to a server, targeting a file in JSON format. Everything goes well when this file is encoded in UTF-8 without BOM.
However, the app crashes during the client.get call when the target file encoding is UTF-8 with BOM. Even when I wrap that call in a try / catch in an attempt to prevent the crash and get the error, I get this stacktrace:
events.js:188
throw err;
^
Error: Unhandled "error" event. (Error parsing response. response: [{}], error: [SyntaxError: Unexpected token in JSON at position 0])
at exports.Client.emit (events.js:186:19)
at C:\PFD\workspace\web_adherent\dev\eamnh-front\node_modules\node-rest-client\lib\node-rest-client.js:457:57
at Object.parse (C:\PFD\workspace\web_adherent\dev\eamnh-front\node_modules\node-rest-client\lib\nrc-parser-manager.js:140:17)
at ConnectManager.handleResponse (C:\PFD\workspace\web_adherent\dev\eamnh-front\node_modules\node-rest-client\lib\node-rest-client.js:538:32)
at ConnectManager.handleEnd (C:\PFD\workspace\web_adherent\dev\eamnh-front\node_modules\node-rest-client\lib\node-rest-client.js:531:18)
at IncomingMessage.<anonymous> (C:\PFD\workspace\web_adherent\dev\eamnh-front\node_modules\node-rest-client\lib\node-rest-client.js:678:34)
at emitNone (events.js:111:20)
at IncomingMessage.emit (events.js:208:7)
at endReadableNT (_stream_readable.js:1064:12)
at _combinedTickCallback (internal/process/next_tick.js:139:11)
What the code block above does not show, but IntelliJ does, is the U+FEFF (zero width no-break space) Unicode code point, marked by < X > in the following stack trace line: Error: Unhandled "error" event. (Error parsing response. response: [< X >{}], error: [SyntaxError: Unexpected token < X > in JSON at position 0])
So what seems to happen is that the Client decodes the file content into a Unicode string without removing the BOM first, instead of treating it as UTF-8 JSON with no BOM. The BOM then ends up at the start of the string as the U+FEFF character, which the JSON parser rejects.
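The behaviour described above can be reproduced without any HTTP request. This is only a minimal sketch using Node's built-in Buffer and JSON.parse; the variable names are made up for illustration and nothing here comes from node-rest-client itself.

```javascript
// The three bytes EF BB BF are the UTF-8 encoding of U+FEFF (the BOM).
const bytesWithBom = Buffer.concat([
  Buffer.from([0xef, 0xbb, 0xbf]),
  Buffer.from('{}', 'utf8'),
]);

// Buffer#toString('utf8') keeps the BOM as a leading U+FEFF character.
const text = bytesWithBom.toString('utf8');
const startsWithBom = text.charCodeAt(0) === 0xfeff;

// JSON.parse rejects that character, which matches the SyntaxError in the trace.
let parseError = null;
try {
  JSON.parse(text);
} catch (err) {
  parseError = err;
}

console.log(startsWithBom, parseError instanceof SyntaxError);
```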
I have scoured SO and found quite a few questions about setting mimetypes for the Client but I still get the error. I have also read the node-rest-client docs and it seems that setting a response parser would be the way to go but scrolling to JSON parser shows that it is the same thing as setting mimetypes.
So I ended up with this:
const options ={
mimetypes:{
json:["application/json","application/json; charset=utf-8","application/json;charset=utf-8"]
}
};
const client = new Client(options);
This tries to set the charset to UTF-8, but the error is the same.
Does someone know what I am doing wrong or is this an issue with Node REST Client?
Thank you for your help.
-- Edit: This is my code for the GET request function:
let Client = require('node-rest-client').Client;
const options ={
mimetypes:{
json:["application/json","application/json; charset=utf-8","application/json;charset=utf-8"]
}
};
const client = new Client(options);
// Reads file contents and calls callback function with data
exports.readFromUrl = (req, fileUrl, callback) => {
client.get(fileUrl, (data, resp) => {
if (resp.statusCode === 200) {
callback(data);
} else {
callback("");
}
}).on('error', (err) => {
callback("");
});
};
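A possible explanation for why neither the try / catch nor the request-level .on('error') helps: the stack trace shows the error being emitted on the Client itself (exports.Client.emit), from the response's end handler, so it fires after the try block has already exited. Node's EventEmitter throws when "error" is emitted and no listener is registered on that emitter. The sketch below shows that rule with a plain EventEmitter standing in for the client (fakeClient is a hypothetical name, and the emit is synchronous here only so the throw can be observed).

```javascript
const { EventEmitter } = require('events');

// Stand-in for the REST client; a hypothetical emitter, not node-rest-client itself.
const fakeClient = new EventEmitter();

// With no 'error' listener on the emitter, emit('error') throws at the emit site.
let thrownWithoutListener = false;
try {
  fakeClient.emit('error', new Error('Error parsing response'));
} catch (err) {
  thrownWithoutListener = true;
}

// With a listener registered on the emitter, the same emit is delivered to it instead.
let receivedMessage = null;
fakeClient.on('error', (err) => {
  receivedMessage = err.message;
});
fakeClient.emit('error', new Error('Error parsing response'));

console.log(thrownWithoutListener, receivedMessage);
```

If this is what happens in the app, registering a listener on the client object as well (client.on('error', ...), which the node-rest-client README appears to describe for client-level errors) should turn the crash into a handled error, although the JSON would still fail to parse.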
Just in case someone stumbles here because of a similar issue, I ended up replacing the Node REST Client JSON parser with a custom one which filters out invalid characters to pass a valid JSON to the callback.
Here's how I did it (using the docs mentioned previously).
const Client = require('node-rest-client').Client;
const client = new Client();
// Remove existing regular parsers (otherwise JSON parser still gets called first)
client.parsers.clean();
client.parsers.add({
"name": "cleanInput",
"isDefault": false,
"match": function (response) {
// Match every response to replace the default parser
return true;
},
"parse": function (byteBuffer, nrcEventEmitter, parsedCallback) {
let parsedData = null;
try {
const cleanData = cleanString(byteBuffer.toString());
parsedData = JSON.parse(cleanData);
parsedData.parsed = true;
// Emit custom event
nrcEventEmitter('parsed', 'Data has been parsed ' + parsedData);
// Pass parsed data to client request method callback
parsedCallback(parsedData);
} catch(err) {
nrcEventEmitter('error', err);
}
}
});
// Keeps only characters with char codes below 127. This drops the BOM, but also every other non-ASCII character in the data.
function cleanString(input) {
let output = "";
for (let i=0; i < input.length; i++) {
if (input.charCodeAt(i) < 127) {
output += input.charAt(i);
}
}
return output;
}
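Since cleanString drops every character with a code of 127 or above, it also removes legitimate accented or non-Latin characters inside JSON strings. A narrower variant, sketched here under the assumption that the only problem is a single leading BOM, removes just that character (stripBom is a hypothetical helper name, not part of node-rest-client).

```javascript
// Removes a single leading U+FEFF (BOM) and leaves every other character untouched.
function stripBom(input) {
  return input.charCodeAt(0) === 0xfeff ? input.slice(1) : input;
}

// Non-ASCII content inside the JSON survives, unlike with cleanString.
const parsed = JSON.parse(stripBom('\uFEFF{"name":"café"}'));
console.log(parsed.name);
```

In the custom parser above, this would presumably mean calling stripBom(byteBuffer.toString()) in place of cleanString(byteBuffer.toString()).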
https://stackoverflow.com/a/38036753/7316335
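The JSON specification (RFC 8259) says a byte order mark must not be added to JSON text sent over the network. Parsers may choose to ignore one, but they are not required to accept it.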
Hence, your server is crashing due to a 'malformed' client GET request.
The issue should be resolved in your server's processing of the GET request, and not through a change in the JSON parser specification.
I would advise filtering byte order marks out of all GET requests before parsing at the server.
See: "in express how multiple callback works in app.get"
This shows you how a single middleware can perform your pre-filtering of GET bodies before passing over to the actual callback for that GET path.
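A rough sketch of that idea, written so it runs without Express installed. The (req, res, next) signature is the one Express middleware uses; req.rawBody, stripBomMiddleware and parseJsonHandler are invented names for illustration, and the sketch assumes an earlier step has already stored the raw text on the request.

```javascript
// Hypothetical pre-filter with the (req, res, next) signature Express middleware uses.
// Assumes an earlier step has stored the raw text in req.rawBody (an invented property).
function stripBomMiddleware(req, res, next) {
  if (typeof req.rawBody === 'string' && req.rawBody.charCodeAt(0) === 0xfeff) {
    req.rawBody = req.rawBody.slice(1);
  }
  next();
}

// Hypothetical handler that runs after the pre-filter and parses the cleaned text.
function parseJsonHandler(req, res) {
  res.result = JSON.parse(req.rawBody);
}

// Minimal stand-in for calling the two callbacks in order, so the sketch runs on its own.
const req = { rawBody: '\uFEFF{"ok":true}' };
const res = {};
stripBomMiddleware(req, res, () => parseJsonHandler(req, res));

console.log(res.result);
```

With Express, the same two functions would be passed in order to a route, for example app.get('/some/path', stripBomMiddleware, parseJsonHandler), where the path is a placeholder.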