node.js, http, node-streams

https.request returning incomplete stream


https.request returns the correct data, but when I pipe the response into a file, the file is incomplete: a large part of the beginning is missing.

How I am using https.request:

import client from "https";

const _request = (url, options, timeout) => {
    let data = '';
    const httpOptions = {};
    const parsedUrl = new URL(url);

    const hostname = parsedUrl.hostname;
    const port = parsedUrl.port;
    const path = parsedUrl.pathname;
    const params = parsedUrl.search;

    httpOptions["hostname"] = hostname;
    httpOptions["port"] = port;
    httpOptions["path"] = path + params;
    // Apply the default before upper-casing, so a missing method does not throw.
    httpOptions["method"] = (options.method || "GET").toUpperCase();

    return new Promise((resolve, reject) => {
        const req = client.request(httpOptions, (res) => {
            if (options.stream === true) resolve(res);

            res.on("data", (chunk) => {
                data += chunk;
            });

            res.on("end", () => {
                resolve({
                    status: res.statusCode,
                    statusText: res.statusMessage,
                    data: data,
                });
            });

            res.on("error", (err) => {
                reject(err);
            });
        });

        req.end();
    });
};

How I'm trying to stream the response:

import client from "./index.js";
import { createWriteStream } from "fs";

const response = await client("an.api.that/returns-plain-text", {
    method: "GET",
    stream: true
}, 5000);

const writeStream = createWriteStream("./test.txt");
response.pipe(writeStream);

I've tried moving if (options.stream === true) resolve(res); inside req.on("end"), however that resulted in the same issue.

Thanks!


Solution

  • Attaching a handler with res.on("data", ...) switches the response into flowing mode, so it starts emitting data events almost immediately. If you write something like

    _request(url, {stream: true}, timeout)
    .then(function(res) {
      res.pipe(createWriteStream(...));
    });
    

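    then by the time the res.pipe statement is reached (the .then callback runs asynchronously, after the request callback has returned), res may already have emitted some data events. Those chunks go only to the data handler and never reach the file, so the beginning of the response is missing, as you observed.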

    To prevent data events from being emitted prematurely, you can explicitly pause the stream before resolving:

    if (options.stream === true) {
      res.pause();
      resolve(res);
    }
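
A possible alternative to pausing, sketched here under assumptions: skip the data and end handlers entirely when the caller asked for a stream. With no data listener attached, the response should not enter flowing mode until the caller pipes or reads it. The helper name handleResponse is hypothetical (it is not part of the question's code or of Node's API); it stands in for the body of the client.request callback.

```javascript
// Hypothetical helper: the body of the client.request callback from the
// question, restructured so that stream mode never attaches a "data" listener.
const handleResponse = (res, options, resolve, reject) => {
    if (options.stream === true) {
        // No "data" listener on this path, so nothing is consumed before the
        // caller calls res.pipe(...). Error handling is left to the caller.
        resolve(res);
        return;
    }

    // Buffered mode: collect the whole body and resolve once it has ended.
    let data = "";
    res.on("data", (chunk) => {
        data += chunk;
    });
    res.on("end", () => {
        resolve({
            status: res.statusCode,
            statusText: res.statusMessage,
            data: data,
        });
    });
    res.on("error", reject);
};
```

Inside _request this would be wired up as client.request(httpOptions, (res) => handleResponse(res, options, resolve, reject)). Compared with res.pause(), this also avoids accumulating the streamed body into the data string a second time.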