How to send the fetched URL when piping request?


I've set up a small proxy using Node and Express. The user sends a request to http://localhost:3000?url=https://example.org, and the proxy fetches the specified URL with axios. The user chooses the method, body, and headers by setting them on the request to the proxy.

Here is the code:

const express = require('express');
const axios = require('axios');
import { Request as ExpressRequest, Response as ExpressResponse } from 'express';

const VERBOSE = true;

const FORBIDDEN_REQ_HEADERS = [
    "host", "x-forwarded-for", "x-forwarded-host", "x-forwarded-proto", "x-forwarded-port", "forwarded"
];

const FORBIDDEN_RES_HEADERS = [
    "content-length", "content-encoding", "transfer-encoding", "content-security-policy-report-only", "content-security-policy"
];

const DEFAULT_RES_HEADERS = {
    "access-control-allow-origin": "*",
    "access-control-allow-headers": "*",
    "access-control-expose-headers": "*",
    "access-control-allow-methods": "*",
    "access-control-allow-credentials": "true"
};

const app = express();

app.all('/', async (req: ExpressRequest, res: ExpressResponse) => {
    const url = new URL(req.url ? req.url : '', "https://" + req.headers.host).searchParams.get("url") || "";

    // test whether the URL provided as a query is valid or not
    try {
        new URL(url);
    } catch (err: any) {
        console.error(err.message);
        res.writeHead(400).end(JSON.stringify({ error: 'Invalid URL' }));
        return;
    }

    // construction of the headers object with allowed request headers
    const reqHeaders = Object.fromEntries(
        Object.entries(req.headers)
            .filter(([key]) => !FORBIDDEN_REQ_HEADERS.includes(key.toLowerCase()))
            .map(([key, value]) => [key, Array.isArray(value) ? value.join(', ') : value ?? ''])
    );

    // construction of body as a buffer
    const body = req.method === "GET" || req.method === "HEAD" ? undefined : await new Promise<Buffer>((resolve, reject) => {
        const chunks: Uint8Array[] = [];
        req.on('data', chunk => chunks.push(chunk));
        req.on('end', () => resolve(Buffer.concat(chunks)));
        req.on('error', reject);
    });

    if (VERBOSE) {
        console.log(`\n==> Sending ${req.method} request to ${url}`);
        console.log(`> with headers ${JSON.stringify(reqHeaders)}`);
        if (body) {
            try {
                console.log(`> with body ${JSON.stringify(JSON.parse(body.toString()))}`);
            } catch (err) {
                console.log(`> with body ${body.toString()}`);
            }
        }
    }

    try {
        const proxyResponse = await axios({
            url,
            method: req.method,
            headers: reqHeaders,
            data: body,
            responseType: 'stream'
        });

        // construction of response headers object with allowed response headers
        const resHeaders: Record<string, string> = {};
        for (const [key, value] of Object.entries(proxyResponse.headers)) {
            if (!FORBIDDEN_RES_HEADERS.includes(key.toLowerCase())) {
                resHeaders[key.toLowerCase()] = Array.isArray(value) ? value.join(', ') : value as string;
            }
        }

        // CORS headers such as access-control-allow-origin must come from the proxy, not from the upstream response
        Object.assign(resHeaders, DEFAULT_RES_HEADERS);

        if (VERBOSE) {
            console.log('\n==> Response received:');
            console.log(`> with url ${proxyResponse.request.res.responseUrl}`);
            console.log(`> with status ${proxyResponse.status}`);
            console.log(`> with headers ${JSON.stringify(resHeaders)}`);
        }

        res.writeHead(proxyResponse.status, resHeaders);
        proxyResponse.data.pipe(res);

    } catch (error: any) {
        console.error(error);
        res.writeHead(500).end(JSON.stringify({ error: error.message }));
    }
});

const PORT = 3000;
app.listen(PORT, () => {
    if (VERBOSE) console.log(`Proxy server is running on port ${PORT}!`);
});

module.exports = app;

When the client requests the proxy from an external script using fetch, for example, Response.url is set to http://localhost:3000?url=https://example.org, which is normal behavior, and we could simply use Response.url.replace('http://localhost:3000?url=', '') to retrieve the correct URL.

However, this doesn't work with redirects: i.e., when the user has sent a request that requires one or more redirects, it returns the original URL to the client instead of the final URL that can be accessed at proxyResponse.request.res.responseUrl.

Here's an example to illustrate the problem. Let's say the user wants to fetch https://httpbin.org/redirect-to?url=https://example.org, which redirects to https://example.org. Through the proxy, they have to request the following URL: http://localhost:3000?url=https://httpbin.org/redirect-to?url=https://example.org.

(async function () {
    const proxy = 'http://localhost:3000';

    const res = await fetch(`${proxy}?url=https://httpbin.org/redirect-to?url=https://example.org`, {
        method: 'GET',
        redirect: "follow"
    });

    console.log(res.url);
     // wanted https://example.org
     // got http://localhost:3000?url=https://httpbin.org/redirect-to?url=https://example.org
})();

Is there a way to make my proxy send a response whose URL is the final URL (e.g. https://example.org) instead of the URL requested by the user (e.g. http://localhost:3000?url=https://example.org)?


Solution

  • You cannot influence what goes into res.url where res is the response to a fetch request. This will always be the URL of your proxy.

    But you can write the effective fetched URL (after redirections) into a new header, like in this simplified [1] version of your proxy:

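    app.get("/", async function(request, response) {
      const r = await fetch(request.query.url);
      // writeHead() expects a plain object, so copy the upstream Headers into one
      const headers = Object.fromEntries(r.headers);
      // fetch() has already decoded the body, so these no longer describe it
      delete headers["content-encoding"];
      delete headers["content-length"];
      headers["X-Url"] = r.url; // set effective URL as header
      response.writeHead(r.status, r.statusText, headers);
      response.end(await r.text());
    });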
    

    The response to http://localhost:3000/?url=https://httpbin.org/redirect-to?url=https://example.com then has a header

    X-Url: https://example.com/
    

    which your client can address as res.headers.get("X-Url").
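
On the client side, reading the header could be wrapped in a small helper. This is only a sketch: the function name `effectiveUrl` is hypothetical, and it assumes the proxy sets the `X-Url` header as described above. If the header is missing, it falls back to the `url` query parameter of the proxy URL, which is the originally requested URL and not necessarily the final one.

```javascript
// Hypothetical helper (not part of any library): returns the best-known
// effective URL for a response that came back through the proxy.
function effectiveUrl(res) {
  // Preferred: the header the proxy adds with the URL after redirects.
  const fromHeader = res.headers.get("X-Url");
  if (fromHeader) return fromHeader;

  // Fallback: recover the requested URL from the proxy's query string.
  try {
    return new URL(res.url).searchParams.get("url") ?? res.url;
  } catch {
    // res.url was empty or not parseable; return it unchanged.
    return res.url;
  }
}
```

With the request from the question, `effectiveUrl(res)` would be called in place of `res.url`.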


    [1] I used await r.text() instead of (more efficiently) piping the response, because the difference is irrelevant to your question about the fetched URL.
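
The same idea can be applied to the axios-based proxy from the question. The sketch below is one possible way to do it, not a definitive implementation: the helper name `withFinalUrl` and the header name `x-url` are assumptions. It also lists the header in `access-control-expose-headers` when that header is not already a wildcard, because browsers hide custom response headers from cross-origin scripts unless they are exposed.

```javascript
// Hypothetical helper: returns a copy of the response headers with the
// final URL added. "x-url" is an assumed header name, not a standard one.
function withFinalUrl(resHeaders, finalUrl, requestedUrl) {
  const headers = { ...resHeaders };

  // Fall back to the requested URL if no final URL was reported.
  headers["x-url"] = finalUrl || requestedUrl;

  // Make the custom header readable from cross-origin scripts.
  // A "*" value is left untouched.
  const exposed = headers["access-control-expose-headers"];
  if (exposed !== "*") {
    headers["access-control-expose-headers"] = exposed ? `${exposed}, x-url` : "x-url";
  }
  return headers;
}

// In the question's handler, the writeHead call could then look roughly like:
//   const finalUrl = proxyResponse.request.res.responseUrl;
//   res.writeHead(proxyResponse.status, withFinalUrl(resHeaders, finalUrl, url));
```

Returning a copy rather than mutating `resHeaders` keeps the verbose logging in the question unaffected by the extra header.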