I am currently trying to save a JS object with some binary data and other values. The result should look something like this:
{
  "value":"xyz",
  "file1":"[FileContent]",
  "file2":"[LargeFileContent]"
}
Until now I had no binary data, so I saved everything as JSON. With the binary data I am starting to run into problems with large files (>1GB).
I tried the approach from JSON.stringify or how to serialize binary data as base64 encoded JSON?, which worked for smaller files of around 20MB. However, with these large files the result of the FileReader is always an empty string, so the result looks like this:
{
  "value":"xyz",
  "file1":"[FileContent]",
  "file2":""
}
The code that reads the blobs is pretty similar to the one in the linked post:
const readFiles = async (measurements: FormData) => {
  setFiles([]); // This is where the result is being stored
  let promises: Array<Promise<string>> = [];
  measurements.forEach((value) => {
    let dataBlob = value as Blob;
    console.log(dataBlob); // Everything is fine here
    promises.push(
      new Promise((resolve, reject) => {
        const reader = new FileReader();
        reader.readAsDataURL(dataBlob);
        reader.onloadend = function () {
          resolve(reader.result as string);
        };
        reader.onerror = function (error) {
          reject(error);
        };
      })
    );
  });
  let result = await Promise.all(promises);
  console.log(result); // large file shows empty
  setFiles(result);
};
Is there something else I can try?
Since you have to share the data with other computers, you will have to generate your own binary format.
Obviously you can design it however you wish, but given your simple case of just storing Blob objects alongside a JSON string, we can come up with a very simple schema: first some metadata about the stored Blobs, then the Blobs themselves, and finally the JSON string with each Blob replaced by a unique ID.
This works because the limitation you hit is actually the maximum length a string can be, and we can .slice() our binary file to read only part of it. Since we never read the binary data as a string we're fine; the JSON only holds an ID in the places where we had Blobs, so it shouldn't grow much.
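For reference, you can see that string-length ceiling directly. The exact limit is engine-specific; in V8-based browsers it sits just under 2**29 characters, which is why a 512MiB payload already fails:
// Engine-specific sanity check: V8 (Chrome, Edge, Node.js) caps strings
// just under 2**29 characters, so asking for 2**30 throws instead of
// returning a string.
try {
  "x".repeat(2 ** 30); // ~1G characters, over the limit
} catch (err) {
  console.log(err); // RangeError: Invalid string length (message varies by engine)
}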
Here is one such implementation I made quickly as a proof of concept:
/*
 * Stores JSON data along with Blob objects in a binary file.
 * Schema:
 *   first 4 bytes          = # of blobs stored in the file
 *   next 4 * (# of blobs)  = size of each Blob, in bytes
 *   next sum of the sizes  = the Blobs' binary data, back to back
 *   remaining              = JSON string
 * Note: Uint32Array uses the platform's endianness (little-endian
 * on virtually all current hardware), so both ends must match.
 */
const hopefully_unique_id = "_blob_"; // <-- change that
function generateBinary(JSObject) {
  let blobIndex = 0;
  const blobsMap = new Map();
  // Replace every Blob in the object with a placeholder ID,
  // remembering which index each Blob got (reused for duplicates).
  const JSONString = JSON.stringify(JSObject, (key, value) => {
    if (value instanceof Blob) {
      if (!blobsMap.has(value)) {
        blobsMap.set(value, hopefully_unique_id + blobIndex++);
      }
      return blobsMap.get(value);
    }
    return value;
  });
  const blobsArr = [...blobsMap.keys()];
  const data = [
    new Uint32Array([blobsArr.length]),
    ...blobsArr.map((blob) => new Uint32Array([blob.size])),
    ...blobsArr,
    JSONString
  ];
  // The Blob constructor concatenates all the parts without ever
  // materializing the binary data as a string.
  return new Blob(data);
}
async function readBinary(bin) {
  // First Uint32 = number of Blobs stored in the file.
  const numberOfBlobs = new Uint32Array(await bin.slice(0, 4).arrayBuffer())[0];
  let cursor = 4 * (numberOfBlobs + 1);
  // Next `numberOfBlobs` Uint32s = the size of each Blob.
  const blobSizes = new Uint32Array(await bin.slice(4, cursor).arrayBuffer());
  const blobs = [];
  for (let i = 0; i < numberOfBlobs; i++) {
    const blobSize = blobSizes[i];
    // .slice() doesn't read the data, so this stays cheap even for huge files.
    blobs.push(bin.slice(cursor, cursor += blobSize));
  }
  const pattern = new RegExp(`^${hopefully_unique_id}\\d+$`);
  const JSObject = JSON.parse(
    await bin.slice(cursor).text(),
    (key, value) => {
      // Swap every placeholder ID back to its Blob.
      if (typeof value !== "string" || !pattern.test(value)) {
        return value;
      }
      const index = +value.replace(hopefully_unique_id, "");
      return blobs[index];
    }
  );
  return JSObject;
}
// demo usage
(async () => {
  const obj = {
    foo: "bar",
    file1: new Blob(["Let's pretend I'm actually binary data"]),
    // This one is 512MiB, which is bigger than the max string size in Chrome,
    // i.e. it can't be stored in a JSON string in Chrome
    file2: new Blob([Uint8Array.from({ length: 512 * 1024 * 1024 }, () => 255)]),
  };
  const bin = generateBinary(obj);
  console.log("as binary", bin);
  const back = await readBinary(bin);
  console.log({ back });
  console.log("file1 read as text:", await back.file1.text());
})().catch(console.error);
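To actually move that file between machines, the round-trip is just a download and a file input. A minimal sketch, assuming a browser context (the download helper and the #import input are made up for the example):
// Hypothetical helper: offer a Blob as a file download via an object URL.
function download(blob, filename) {
  const a = document.createElement("a");
  a.href = URL.createObjectURL(blob);
  a.download = filename;
  a.click();
  URL.revokeObjectURL(a.href);
}
// download(generateBinary(obj), "measurements.bin");

// On the receiving end, a File picked from an <input type="file"> is a
// Blob, so it can be passed straight back to readBinary().
document.querySelector("#import").addEventListener("change", async (evt) => {
  const [file] = evt.target.files;
  if (file) {
    console.log(await readBinary(file));
  }
});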