Tags: cloudflare-workers, http-compression

Brotli Decompress Webstreams (in Cloudflare Workers)


I have a bunch of data that I'm storing as files on Cloudflare's R2. I noticed very early on that these data files were approaching terabytes of bucket storage, so I applied Brotli compression, which brought the size down to ~500 MB.

I am now trying to expose the data via Workers (to apply a filter) and have hit a snag. Cloudflare exposes WebStreams, whose DecompressionStream can decompress gzip, but not Brotli.

I did convert the stream to gzip ...

let stm = resp.body
  .pipeThrough(new DecompressionStream("gzip")) // inflate the stored data
  .pipeThrough(ApplyFilter(sDate, eDate))       // filter by date range
  .pipeThrough(new CompressionStream("gzip"));  // re-compress for the client

Gzip does not offer nearly the level of compression I got used to with Brotli.

261M   1158172.data    (100%)
2.8M   1158172.data.gz (  1%)
 78K   1158172.data.br (  0.03%)

So,

  1. Is there a brotli decompress for webstreams products?
    • I've always relied on the browser to just handle this
  2. Is there a way I can trick my Worker or R2 into auto decompressing?
    • All the browsers support it. Can I hook into that somehow?
  3. Should I just pass the whole thing to the browser to do the work?
    • I was hoping to avoid this because I want the server to control the data exposure
  4. Something else I haven't thought of?

UPDATE

I forgot to mention that I also tried converting to Node streams and using Node's zlib.createBrotliDecompress. Unfortunately, Cloudflare does not appear to support zlib in Workers:

Uncaught Error: No such module "node:zlib".

Solution

  • Is there a brotli decompress for webstreams products?

    There is no Brotli support in the CompressionStream/DecompressionStream standard, but you could probably implement it with WebAssembly.
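To sketch what that would look like: the decompressor (whatever WASM build of Brotli you choose) just needs to be wrapped in a TransformStream so it can slot into a pipeThrough chain like DecompressionStream does. The helper name brotliDecompressionStream below is hypothetical, and the demo uses Node's built-in zlib as a stand-in decompressor (available locally, but not in Workers) purely to show the shape:

```javascript
// Stand-in decompressor for the demo only; a Worker would use a WASM
// Brotli build here instead, since node:zlib is unavailable in Workers.
import { brotliCompressSync, brotliDecompressSync } from 'node:zlib';

// Hypothetical helper: wrap a one-shot decompress function in a
// TransformStream. It buffers the compressed chunks and emits the
// decompressed bytes on flush; a streaming WASM decoder could instead
// emit output per chunk.
function brotliDecompressionStream(decompress) {
  const chunks = [];
  return new TransformStream({
    transform(chunk) {
      chunks.push(chunk);
    },
    flush(controller) {
      controller.enqueue(decompress(Buffer.concat(chunks)));
    },
  });
}

// Demo: round-trip a payload through the stream (Node 18+ globals:
// Blob, TransformStream, Response).
const payload = Buffer.from('hello '.repeat(100));
const compressed = brotliCompressSync(payload);

const stream = new Blob([compressed]).stream()
  .pipeThrough(brotliDecompressionStream(brotliDecompressSync));

const out = Buffer.from(await new Response(stream).arrayBuffer());
console.log(out.equals(payload)); // true
```

In a Worker you would pass `resp.body` through the same kind of TransformStream instead of the Blob used in the demo.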

  • Is there a way I can trick my Worker or R2 into auto decompressing?

    Yes — Cloudflare will decompress the response on the fly when the client's Accept-Encoding header does not advertise support for the encoding declared in the response's Content-Encoding header.

    Just return the compressed file as-is, with the appropriate Content-Encoding header:

    export default {
      async fetch(req, env, ctx) {
        // Fetch the pre-compressed object straight from R2.
        const obj = await env.R2.get('result.br');

        return new Response(obj.body, {
          headers: {
            'Content-Encoding': 'br'
          },
          // Tell the runtime the body is already encoded,
          // so it is passed through without re-compression.
          encodeBody: 'manual'
        });
      },
    };
    
    curl https://xxx.xxx.workers.dev/ --header "Accept-Encoding: br" --output - -vvv
    < Content-Length: 15
    < Content-Encoding: br
    <binary content>

    curl https://xxx.xxx.workers.dev/ --header "Accept-Encoding: identity" -vvv
    <no Content-Length header, due to on-the-fly decompression>
    <plain-text content>