I've got 2 JSON files, one being 5.4 MB and the other 3.1 MB. These JSON files each contain one (big) string array. I'm adding all of these strings to a single Realm database with the following schema:
const Word = {
  name: "Word",
  properties: {
    _id: "objectId",
    content: "string",
    language: "string",
  }
}
Before calling realm.compact(), the database weighs 39.5 MB, and 17.9 MB after.
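(For context, the on-disk size can be checked with fs.statSync against realm.path; this is a minimal sketch of such a measurement, not the exact code used here:)

// Sketch: check the Realm file size on disk around compact().
// Assumes an open `realm` instance; realm.path is the database file's path.
const fs = require("fs");

const before = fs.statSync(realm.path).size;
const ok = realm.compact(); // returns true if compaction succeeded
const after = fs.statSync(realm.path).size;
console.log(`compacted: ${ok}, before: ${before} B, after: ${after} B`);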
The following code is used to insert the data:
const fs = require("fs");
const Realm = require("realm");

Realm.open(config).then(realm => {
  realm.write(() => {
    const dataPath = __dirname + "/../data/";
    const files = fs.readdirSync(dataPath);
    for (let file of files) {
      const json = getJsonFileContent(dataPath + file);
      // The file name (minus ".json") doubles as the language tag.
      const language = file.substring(0, file.indexOf(".json"));
      for (let entry of json) {
        realm.create("Word", {
          _id: new Realm.BSON.ObjectId(),
          content: entry,
          language: language
        });
      }
    }
  });
  realm.compact();
});
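For comparison, the raw string payload can be estimated by summing UTF-8 byte lengths; a sketch reusing the same getJsonFileContent helper as above (not part of the original code):

// Sketch: total UTF-8 bytes of the strings alone, with no per-object overhead.
const dataPath = __dirname + "/../data/";
let totalBytes = 0;
for (const file of fs.readdirSync(dataPath)) {
  for (const entry of getJsonFileContent(dataPath + file)) {
    totalBytes += Buffer.byteLength(entry, "utf8");
  }
}
console.log(`raw string payload: ${(totalBytes / 1048576).toFixed(1)} MB`);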
My question is: how come the database is actually bigger than the dataset itself, even after compacting? Should I use another database? Or should I simply keep the JSON files and parse them directly? Or maybe try to limit the number of entries?
FYI, this database was meant to be used in a mobile app, and the Realm JS version used here is 12.7.0.
It seems the original data is roughly 10 MB and the new data is about 17 MB.
There are a number of things that would account for that: indexes, metadata, etc.
Also, the Realm object shown has an additional property:

_id: "objectId",

and whenever one of those objects is instantiated, a new ObjectId is created. That's 12 bytes per object, stored alongside the JSON data mapped to the object:

realm.create("Word", {
  _id: new Realm.BSON.ObjectId(),
So the code is actually adding data on top of the original data, so the database file would naturally be larger.
For example: 1,000,000 objects * 12 bytes = 12,000,000 bytes (roughly 12 MB) of additional data.
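To make that arithmetic concrete, here's a small sketch; entryCount is a placeholder for the real number of inserted strings:

const Realm = require("realm");

// A BSON ObjectId is a fixed 12-byte value (serialized as 24 hex characters).
const oid = new Realm.BSON.ObjectId();
console.log(oid.toHexString().length / 2); // 12

// Overhead from the _id column alone; entryCount is a placeholder value.
const entryCount = 1000000;
const overheadBytes = entryCount * 12;
console.log(overheadBytes, "bytes =", (overheadBytes / 1e6).toFixed(0), "MB");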