Cloudflare's own globally distributed data store, Workers KV, can accept data of three "types": string, ArrayBuffer, and ReadableStream.
While the use cases for the first two are clear enough, I am struggling to figure out how a stored ReadableStream could be useful. I am familiar with the concept: with it you can "stream" different values over time. But what is the point of putting one in a data store? What are the typical scenarios?
The difference between passing a string, ArrayBuffer, or ReadableStream is not what data is stored, but rather how the data gets there. Note that you can store data as a string and then later read it as an ArrayBuffer, or vice versa (strings are converted to/from bytes using UTF-8). When you pass a ReadableStream to put(), the system reads data from the stream and stores that data; it does not store the stream itself. Similarly, when using get(), you can specify "stream" as the second parameter to get a ReadableStream back; when you read from this stream, it will produce the value's content.
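To make this concrete, here is a minimal sketch that uses an in-memory stand-in for a KV namespace. FakeKV and demo are hypothetical names invented for illustration. FakeKV only imitates the behaviour described above (every value ends up as bytes, whatever form it arrived in) and is not Cloudflare's implementation.

```javascript
// Hypothetical in-memory stand-in for a KV namespace, for illustration only.
// Whatever form the value arrives in, only its bytes are kept.
class FakeKV {
  constructor() {
    this.store = new Map(); // key -> Uint8Array
  }

  async put(key, value) {
    let bytes;
    if (typeof value === "string") {
      // Strings are encoded as UTF-8.
      bytes = new TextEncoder().encode(value);
    } else if (value instanceof ArrayBuffer) {
      bytes = new Uint8Array(value.slice(0));
    } else {
      // Assume a ReadableStream: drain it and keep the bytes, not the stream.
      bytes = new Uint8Array(await new Response(value).arrayBuffer());
    }
    this.store.set(key, bytes);
  }

  async get(key, type = "text") {
    const bytes = this.store.get(key);
    if (bytes === undefined) return null;
    if (type === "arrayBuffer") return bytes.slice().buffer;
    if (type === "stream") return new Response(bytes).body;
    return new TextDecoder().decode(bytes); // "text": decode UTF-8
  }
}

async function demo() {
  const kv = new FakeKV();
  await kv.put("greeting", "h\u00e9llo"); // written as a string
  const buf = await kv.get("greeting", "arrayBuffer"); // read back as bytes
  const stream = await kv.get("greeting", "stream"); // or read as a stream
  const text = await new Response(stream).text();
  return { byteLength: buf.byteLength, text };
}
```

The string has five characters but is stored as six bytes, because "é" takes two bytes in UTF-8. Reading the same key as a stream yields those same bytes, which decode back to the original string.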
The main case where you would want to use streams is when you want to directly store the body of an HTTP request into a KV value, or when you want to directly return a KV value as the body of an HTTP response. Using streams in these cases avoids the need to hold the entire value in memory all at once; instead, bytes can stream through as they arrive.
For example, instead of doing:
// BAD
let value = await request.text();
await kv.put(key, value);
You should do this:
// GOOD
await kv.put(key, request.body);
This is especially important when the value is many megabytes in size. The former version reads the entire value into memory to construct one large string (including decoding UTF-8 to UTF-16), only to immediately write that value back out to KV (converting UTF-16 back to UTF-8). The latter version copies bytes straight from the incoming connection into KV without ever holding the whole value in memory at once.
Similarly, for a response, instead of doing:
// BAD
let value = await kv.get(key);
return new Response(value);
You can do:
// GOOD
let value = await kv.get(key, "stream");
return new Response(value);
This way, the response bytes get streamed from KV to the HTTP connection. This not only saves memory and CPU time, but also means that the client starts receiving bytes faster, because your Worker doesn't wait until all bytes are received before it starts forwarding them.
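Putting both directions together, a Worker that stores uploads and serves them back might look roughly like the sketch below. It rests on some assumptions: the binding name MY_KV is hypothetical, the key is simply taken from the URL path, and error handling is reduced to the bare minimum.

```javascript
// Sketch of a Worker that streams in both directions.
// ASSUMPTION: env.MY_KV is a KV namespace binding; the name is hypothetical.
const worker = {
  async fetch(request, env) {
    // Use the URL path (without the leading slash) as the key.
    const key = new URL(request.url).pathname.slice(1);

    if (request.method === "PUT") {
      if (request.body === null) {
        return new Response("Missing body", { status: 400 });
      }
      // Hand the request body stream to KV without buffering it here.
      await env.MY_KV.put(key, request.body);
      return new Response(null, { status: 204 });
    }

    // Ask for a stream so bytes can be forwarded as they arrive.
    const value = await env.MY_KV.get(key, "stream");
    if (value === null) {
      return new Response("Not found", { status: 404 });
    }
    return new Response(value);
  },
};

// In a module-syntax Worker this object would be the default export:
// export default worker;
```

Neither branch ever calls text() or arrayBuffer(), so the Worker itself never materializes the whole value; it only passes the stream along.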