
Realm database is larger than json dataset


I have two JSON files, one 5.4 MB and the other 3.1 MB. Each file contains one (big) string array. I'm adding all these strings to a single Realm database with the following schema:

const Word = {
    name: "Word",
    properties: {
        _id: "objectId",
        content: "string",
        language: "string",
    }
}

Before calling realm.compact(), the database weighs 39.5MB, and 17.9MB after.

The following code is used to insert the data:

Realm.open(config).then(realm => {
    realm.write(() => {
        const dataPath = __dirname + "/../data/";
        const files = fs.readdirSync(dataPath);

        for (const file of files) {
            const json = getJsonFileContent(dataPath + file);
            // Use the file name without its ".json" extension as the language.
            const language = file.substring(0, file.indexOf(".json"));

            for (const entry of json) {
                realm.create("Word", {
                    _id: new Realm.BSON.ObjectId(),
                    content: entry,
                    language: language
                });
            }
        }
    });
    realm.compact();
});

My question is: How come the database is actually bigger than the dataset itself? Should I use another database? Or should I simply keep the json files and parse them directly? Or maybe try and limit the amount of entries?

FYI, this database was meant to be used in a mobile app, and the realm JS version used here is 12.7.0.


Solution

  • The original data is roughly 8.5 MB (5.4 MB + 3.1 MB) and the compacted Realm file is about 17.9 MB.

    A number of things account for the difference: indexes, metadata, etc.

    Also, the Realm object shown has an additional property

    _id: "objectId",
    

    and whenever such an object is created, a new ObjectId is generated for it. That ObjectId takes 12 bytes, on top of the JSON string mapped to the object

    realm.create("Word", {
       _id: new Realm.BSON.ObjectId(),
    

    So the code is adding data on top of the original data, and the resulting file is naturally larger.

    For example: 1,000,000 objects * 12 bytes = 12,000,000 bytes of additional data (12 MB)
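
The arithmetic above can be sketched in a few lines of Node.js. This is only a rough estimate under stated assumptions: the helper names `countEntries` and `estimateObjectIdOverhead` are hypothetical (not part of Realm), each file is assumed to hold a single JSON array as in the question, and Realm's own indexes, metadata and internal layout are ignored.

```javascript
const fs = require("fs");
const path = require("path");

// A BSON ObjectId is 12 bytes.
const OBJECT_ID_BYTES = 12;

// Hypothetical helper: raw bytes added by giving every entry its own ObjectId.
// Does not account for Realm's indexes, metadata or storage layout.
function estimateObjectIdOverhead(entryCount) {
    return entryCount * OBJECT_ID_BYTES;
}

// Hypothetical helper: counts the entries across all *.json files in a directory,
// assuming each file contains one JSON array.
function countEntries(dataPath) {
    return fs.readdirSync(dataPath)
        .filter(file => file.endsWith(".json"))
        .map(file => JSON.parse(fs.readFileSync(path.join(dataPath, file), "utf8")))
        .reduce((total, entries) => total + entries.length, 0);
}

console.log(estimateObjectIdOverhead(1000000)); // 12000000
```

Calling `estimateObjectIdOverhead(countEntries(dataPath))` on the actual data directory would give the share of the size increase that comes from the `_id` values alone; whatever remains is down to the other factors mentioned above.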