Is there a lot of overhead in excluding nearly all of the data in a document when querying a mongo database?
For example, in the case where I only want field1 and field2, for a collection with a document structure of:
{
"field1" : 1,
"field2" : true,
"field3" : ["big","array",...],
"field4" : ["another","big","array",...]
}
would I benefit more from:
Note: The inefficiency of saving the same data twice isn't a concern for me as much as the efficiency of querying the data
Many thanks!
Projection is somewhat similar to naming columns explicitly in SQL, so it seems a little counter-intuitive to ask whether returning a smaller amount of data would incur more overhead than returning a larger amount (the full document).
You still have to find the document (which may be fast or slow depending on your .find() query), but returning only the two fields you need rather than the complete document makes the query faster, not slower, since less data is sent back to the client.
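As a minimal sketch of what such a projection looks like, assuming a hypothetical collection named `items` and a filter on `field1` (neither appears in the question): the helper `applyInclusionProjection` is only a local stand-in for what the server returns, so that the snippet runs without a database.

```javascript
// Hypothetical filter and collection name; the field names come from the question.
const filter = { field1: 1 };
// Inclusion projection: return only field1 and field2, and drop _id explicitly.
const projection = { field1: 1, field2: 1, _id: 0 };

// In the mongo shell the query would be:
//   db.items.find(filter, projection)
// With the Node.js driver:
//   collection.find(filter, { projection })

// Local stand-in for the server-side behaviour of an inclusion projection.
// It only handles top-level fields, which is all this example needs.
function applyInclusionProjection(doc, proj) {
  const out = {};
  for (const [key, include] of Object.entries(proj)) {
    if (include && key in doc) out[key] = doc[key];
  }
  return out;
}

const stored = {
  _id: 42,
  field1: 1,
  field2: true,
  field3: ["big", "array"],
  field4: ["another", "big", "array"],
};

const returned = applyInclusionProjection(stored, projection);
console.log(returned); // { field1: 1, field2: true }
```

The big arrays never leave the server, which is where the saving comes from.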
Having a second collection may only benefit you if you are concerned about your collection fitting into RAM. If the documents in the duplicate collection are much smaller, they can presumably fit into a smaller amount of total RAM, reducing the chance that a page will need to be read in from disk. However, if you are writing to this collection as well as the original collection, then you need a lot more data in RAM than if you had just the original collection.
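The RAM trade-off can be made concrete with some back-of-the-envelope arithmetic. Every number below is a made-up assumption chosen purely for illustration, not a measurement, and the calculation ignores indexes and storage overhead.

```javascript
// All figures are hypothetical; substitute your own document counts and sizes.
const docCount = 1000000;
const fullDocBytes = 20000; // assumed size of a document with both big arrays
const slimDocBytes = 100;   // assumed size of a copy holding only field1 and field2

const MB = 1024 * 1024;

// Data that must stay in RAM if only the original collection is in use.
const originalOnly = docCount * fullDocBytes;
// Data that must stay in RAM if reads only ever touch the slim copy.
const slimOnly = docCount * slimDocBytes;
// Data that must stay in RAM if both collections are being written to.
const bothHot = originalOnly + slimOnly;

console.log("original only:", Math.round(originalOnly / MB), "MB");
console.log("slim copy only:", Math.round(slimOnly / MB), "MB");
console.log("both written to:", Math.round(bothHot / MB), "MB");
```

Under these assumptions the slim copy helps only when the original collection is rarely touched; once both are written to, the total is larger than with the original alone.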
So while the intricate details may depend on your individual set-up, the general answer would probably be option 2: you will benefit more from using projection and returning only the two fields you need.