I have an application with many messages. Every user can select a message and send it to another user. The message then gets a flag ('message was sent to: user1, user2, ...'). This send information should be stored in MongoDB. I'm considering two different approaches:
1.) Many small documents in one collection
Every document contains the message ID, the name of the user who sent the message, and an array of recipients, like this:
{
_id:'3DA5FC203',
sender:'username1',
recipient:['user1','user2','user3']
},
{
_id:'4AD290FC',
sender:'username1',
recipient:['user1','user2','user3']
},
{
_id:'4AD290FC',
sender:'usernameX',
recipient:['user2']
}
If 1000 users each send 10 messages a day to one or more recipients, I will have roughly 3.65 million documents per year (1000 × 10 × 365).
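As a rough check of that estimate, here is the arithmetic spelled out. It assumes every user sends exactly 10 messages on each of 365 days; the variable names are only illustrative.

```javascript
// Rough yearly document count for option 1 (one document per message and sender).
// Assumption: every user sends exactly 10 messages on each of 365 days.
const users = 1000;
const messagesPerUserPerDay = 10;
const daysPerYear = 365;

const documentsPerYear = users * messagesPerUserPerDay * daysPerYear;
console.log(documentsPerYear); // 3650000
```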
2.) Fewer, bigger documents in one collection
The other approach would be fewer but bigger documents. For example, one document per message, holding all senders and recipients of that message. A MongoDB entry could look like this:
{
_id:'3DA5FC203',
'username1':['user1','user2','user3']
},
{
_id:'4AD290FC',
'username1':['user1','user2','user3'],
'usernameX':['user2']
}
In this case there are only 2 documents instead of 3 (compare the example above), but one document could contain 100 or more senders.
So my question: which case will MongoDB handle better, many small documents or fewer big ones? And which scenario is better for running analyses such as: show all messages and recipients for one sender (username1)?
Using values as keys, like you do in:
'username1':['user1','user2','user3'],
is a bad idea, because you cannot run an indexed query that looks for documents with a specific sender. This works:
db.messages.find( { 'username1' : { $exists: true } } );
But it is not going to be fast.
It is probably wise to keep your first option, with one document per message and sender. Then you can just do:
db.messages.find( { sender: 'username1' } );
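To illustrate the difference without a running server, here is a minimal in-memory sketch in plain JavaScript. It is not MongoDB code: the arrays `optionOne` and `optionTwo` and both helper functions are made up for this example. With a fixed `sender` field every document is tested against the same field; with usernames as keys, the field to test changes with every sender.

```javascript
// Option 1: one document per message and sender, fixed field names.
const optionOne = [
  { msgid: '3DA5FC203', sender: 'username1', recipient: ['user1', 'user2', 'user3'] },
  { msgid: '4AD290FC', sender: 'username1', recipient: ['user1', 'user2', 'user3'] },
  { msgid: '4AD290FC', sender: 'usernameX', recipient: ['user2'] },
];

// Option 2: one document per message, sender names used as keys.
const optionTwo = [
  { _id: '3DA5FC203', username1: ['user1', 'user2', 'user3'] },
  { _id: '4AD290FC', username1: ['user1', 'user2', 'user3'], usernameX: ['user2'] },
];

// Mirrors find({ sender: name }): always compares the same field.
function bySenderField(docs, name) {
  return docs
    .filter((doc) => doc.sender === name)
    .map((doc) => ({ message: doc.msgid, recipients: doc.recipient }));
}

// Mirrors find({ <name>: { $exists: true } }): the field to test depends on the name.
function bySenderKey(docs, name) {
  return docs
    .filter((doc) => Object.prototype.hasOwnProperty.call(doc, name))
    .map((doc) => ({ message: doc._id, recipients: doc[name] }));
}

console.log(bySenderField(optionOne, 'username1'));
console.log(bySenderKey(optionTwo, 'username1'));
```

Both calls list the same two messages for `username1`, so the analysis from the question is possible with either layout; the difference is that only the first layout queries a field name that is known in advance.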
Adding a new recipient to this document can be done with:
db.messages.update(
{ msgid: '867896', sender: 'username1' },
{ $push: { recipient: 'user4' } }
);
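For readers unfamiliar with `$push`, the effect of that update on a single matching document can be mimicked in plain JavaScript. This is only an in-memory illustration; `pushRecipient` is a hypothetical helper, not a MongoDB function.

```javascript
// Hypothetical in-memory stand-in for the update above: find the first
// document matching msgid and sender, then append to its recipient array.
function pushRecipient(docs, msgid, sender, newRecipient) {
  const doc = docs.find((d) => d.msgid === msgid && d.sender === sender);
  if (doc === undefined) {
    return false; // nothing matched, nothing modified
  }
  doc.recipient.push(newRecipient); // like $push, this does not check for duplicates
  return true;
}

const messages = [
  { msgid: '867896', sender: 'username1', recipient: ['user1', 'user2', 'user3'] },
];

pushRecipient(messages, '867896', 'username1', 'user4');
console.log(messages[0].recipient); // [ 'user1', 'user2', 'user3', 'user4' ]
```

If a recipient should appear at most once per document, MongoDB's `$addToSet` operator can be used in place of `$push`.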
You can also make MongoDB use the same index for both queries, by creating:
db.messages.createIndex( { sender: 1, msgid: 1 } );
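The reason one index can serve both queries is that a compound index is ordered by `sender` first and `msgid` second, so a lookup on `sender` alone can use the leading part of the index. The following plain-JavaScript sketch imitates that idea with a sorted array. It is a simplification for illustration, not how MongoDB stores indexes internally, and `buildIndex`, `lowerBound` and `lookup` are made-up names.

```javascript
function compareStrings(a, b) {
  if (a < b) return -1;
  if (a > b) return 1;
  return 0;
}

// Order entries by sender first, then by msgid.
function compareEntries(a, b) {
  return compareStrings(a.sender, b.sender) || compareStrings(a.msgid, b.msgid);
}

// Toy "index": one entry per document, sorted by (sender, msgid).
function buildIndex(docs) {
  return docs
    .map((doc, position) => ({ sender: doc.sender, msgid: doc.msgid, position }))
    .sort(compareEntries);
}

// Binary search: first index whose entry is not smaller than the probe.
function lowerBound(index, probe) {
  let low = 0;
  let high = index.length;
  while (low < high) {
    const mid = (low + high) >> 1;
    if (compareEntries(index[mid], probe) < 0) low = mid + 1;
    else high = mid;
  }
  return low;
}

// Look up by sender alone (index prefix) or by sender and msgid (full key).
function lookup(index, sender, msgid) {
  // The empty string sorts before every other msgid, so it marks
  // the start of this sender's range.
  let i = lowerBound(index, { sender, msgid: msgid === undefined ? '' : msgid });
  const positions = [];
  while (
    i < index.length &&
    index[i].sender === sender &&
    (msgid === undefined || index[i].msgid === msgid)
  ) {
    positions.push(index[i].position);
    i += 1;
  }
  return positions;
}

const docs = [
  { msgid: '4AD290FC', sender: 'usernameX' },
  { msgid: '3DA5FC203', sender: 'username1' },
  { msgid: '4AD290FC', sender: 'username1' },
];
const index = buildIndex(docs);

console.log(lookup(index, 'username1'));             // [ 1, 2 ]
console.log(lookup(index, 'username1', '4AD290FC')); // [ 2 ]
```

A query on `msgid` alone would not benefit in the same way, because `msgid` is not the leading field of the index.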
Other hints
You also need to be aware that you cannot have two documents with the same _id value, as you have in your first example. So you will have to store this ID in a field other than _id. For example:
{
msgid:'3DA5FC203',
sender:'username1',
recipient:['user1','user2','user3']
},
And let MongoDB just create the _id field for you.
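The uniqueness rule can be pictured with a small in-memory stand-in for a collection. This is a hypothetical sketch in plain JavaScript: `ToyCollection` is made up, and the counter-based `_id` only stands in for the ObjectId that MongoDB would generate.

```javascript
class ToyCollection {
  constructor() {
    this.docsById = new Map();
    this.nextId = 1;
  }

  insert(doc) {
    // Generate an _id when the caller did not supply one.
    const _id = doc._id !== undefined ? doc._id : `generated-${this.nextId++}`;
    if (this.docsById.has(_id)) {
      throw new Error(`duplicate _id: ${_id}`);
    }
    this.docsById.set(_id, { ...doc, _id });
    return _id;
  }
}

// Same msgid twice is fine, because each document gets its own generated _id.
const collection = new ToyCollection();
collection.insert({ msgid: '4AD290FC', sender: 'username1', recipient: ['user1', 'user2', 'user3'] });
collection.insert({ msgid: '4AD290FC', sender: 'usernameX', recipient: ['user2'] });
console.log(collection.docsById.size); // 2

// Reusing the message ID as _id fails on the second insert.
let rejected = false;
try {
  const other = new ToyCollection();
  other.insert({ _id: '4AD290FC', sender: 'username1' });
  other.insert({ _id: '4AD290FC', sender: 'usernameX' });
} catch (err) {
  rejected = true;
}
console.log(rejected); // true
```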