node.js fetch-api ndjson

Get properties of a JSON object streamed as NDJSON using fetch


I am trying to get NDJSON data from an API using fetch. Since I only need one JSON object, I would like to do this using fetch. The data provided by the API is in the form (formatting by me, actual response is a single line):

{
  "a": "value",
  "b": "value",
  "c": "value",
  "d": "value",
  "e": "value"
}

When I simply log the data, everything works fine and I get the above response as an object:

const obj = await fetch(url, {
  method: "GET"
}).then(res => res.json());
console.log(obj);

But when I try to log one of the properties of that object, nothing gets logged (no errors):

const obj = await fetch(url, {
  method: "GET"
}).then(res => res.json());
console.log(obj.a);

Even logging JSON.stringify(obj) does not work. How can I get around this issue?


Solution

  • I'd like to drill down into what happened and point out a couple of problems you ran into, which probably made the whole thing much harder to solve than it should have been.

    1. I suspect that the API kept sending more than one line of data regardless of what you told it,
    2. NDJSON is not a single standard JSON string, so parsing the whole body as JSON fails,
    3. Promises in Node tend to fail silently if you don't add proper error handling.

    Together, these three issues produced <nothing> as the result, when you should have seen an error explaining that the response cannot be parsed.
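To make issue 2 concrete, here is a minimal sketch, assuming the response body held two NDJSON lines (the sample text and the helper name `tryParseWhole` are made up for illustration). Parsing the whole body as one JSON document throws, and wrapping the call in a try/catch is what turns "nothing logged" into a readable diagnosis.

```javascript
// Hypothetical NDJSON body: two JSON documents separated by a newline.
const body = '{"a": "value"}\n{"a": "other"}\n';

// Try to parse the whole text as a single JSON document.
function tryParseWhole(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    // Keep the error so the caller can report it instead of losing it.
    return { ok: false, error: err };
  }
}

const result = tryParseWhole(body);
console.log(result.ok ? result.value : `parse failed: ${result.error.name}`);
```

The same idea applies to the fetch call itself: adding a `.catch()` or a try/catch around the `await` should surface the parse error instead of hiding it.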

    The solution I offered was to use fetch with scramjet like this:

    const {StringStream} = require("scramjet");
    
    const stream = StringStream
      .from(async () => (await fetch(url)).body)
      .JSONParse()
    ;
    

    StringStream.from accepts a stream or a function that returns one; the magic then happens in JSONParse:

    • the method splits the input into separate lines
    • then parses each line as JSON
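The two steps above can be sketched in plain JavaScript. This is only an illustration of the idea on a complete string, not scramjet's actual implementation, and `parseNdjson` is a hypothetical helper name.

```javascript
// Split NDJSON text into lines and parse each non-empty line on its own.
function parseNdjson(text) {
  return text
    .split(/\r?\n/)                       // step 1: take the lines apart
    .filter(line => line.trim() !== "")   // ignore blank lines (e.g. the trailing one)
    .map(line => JSON.parse(line));       // step 2: parse each line as JSON
}

// Made-up sample input with two lines.
const items = parseNdjson('{"a": 1}\n{"a": 2}\n');
console.log(items[0].a);
```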

    So now stream is a flowing list of objects. In Node >= 12 you can simply iterate over it in a loop:

    for await (const item of stream) {
       console.log(item);
    }
    

    The resulting stream class also has some convenience methods, for example if you just want the data as an Array:

    console.log(await stream.toArray());
    

    You don't have to use scramjet; you can get the same result with just the built-in modules, like this:

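    const { createInterface } = require("readline");
    
    // Assumes a fetch (such as node-fetch) whose body is a Node.js Readable stream.
    const input = (await fetch(url)).body;
    // The readline interface is itself async iterable and yields one line at a time.
    const lines = createInterface({ input, crlfDelay: Infinity });
    
    for await (const line of lines) {
       // Skip blank lines, then parse each remaining line as its own JSON document.
       if (line.trim()) console.log(JSON.parse(line));
    }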
    

    Both solutions will get you there. With scramjet you can add more processing to the stream, such as stream.filter(item => checkValid(item)) or whatever else you may need, but in the end the goal can be reached either way.
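Since the question only needs one object, a third option is to stop reading after the first parsed line. The following is a rough sketch without any library, assuming the body is an async iterable that yields strings, Buffers, or Uint8Arrays; `ndjsonLines`, `firstObject`, and `fakeChunks` are hypothetical names introduced here for illustration.

```javascript
// Yield one parsed object per NDJSON line from an async iterable of chunks.
// A line may be split across chunks, so the unfinished tail is buffered.
async function* ndjsonLines(chunks) {
  const decoder = new TextDecoder();
  let buffer = "";
  for await (const chunk of chunks) {
    buffer += typeof chunk === "string"
      ? chunk
      : decoder.decode(chunk, { stream: true });
    const lines = buffer.split("\n");
    buffer = lines.pop(); // the last piece may be an incomplete line
    for (const line of lines) {
      if (line.trim() !== "") yield JSON.parse(line);
    }
  }
  buffer += decoder.decode(); // flush any bytes held back by the decoder
  if (buffer.trim() !== "") yield JSON.parse(buffer);
}

// Return the first object and stop consuming the source.
async function firstObject(chunks) {
  for await (const item of ndjsonLines(chunks)) {
    return item; // leaving the loop early also closes the generator
  }
  return undefined;
}

// Usage sketch with made-up chunks that split a line in the middle.
async function* fakeChunks() {
  yield '{"a": "val';
  yield 'ue"}\n{"a": "other"}\n';
}

firstObject(fakeChunks()).then(obj => console.log(obj.a));
```

With a real response you would pass the response body instead of `fakeChunks()`, provided that body is async iterable in your environment.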