I am using MongoDB 3.2.5. There are 2 collections, both contain about 2 millions documents
Devices
{
_id: xx
}
Interactions
{
_id: yy
StartDateTime: 2016-10-24 17:21:30.989Z
DeviceId: xx
}
I want to find all devices that has no reference from interactions. I have tried the following code, it works with a small DB but fails on a large DB.
var matches = db.Interactions.find({}, { DeviceId: 1 });
var devicesIds = [];
matches.forEach(function(match) { devicesIds.push(match.DeviceId) });
var count = db.Devices.find({ "_id": { $nin : devicesIds } } ).count();
print(count);
It throws error message:
[thread1] Error: BufBuilder attempted to grow() to 134217728 bytes, past the 64MB limit.
I also tried:
db.Devices.aggregate([
{
$lookup:
{
from: "Interactions",
localField: "_id",
foreignField: "DeviceId",
as: "matched_docs"
}
},
{
$match: { "matched_docs": { $eq: [] } }
},
{
$out: "TempDevicesNoInteraction"
}
]);
The query run 3 hours but still not finish. I have to cancel it. Same for query below:
var count = 0;
db.Devices.find().forEach(function(myDoc) {
var cursor = db.Interactions.find({DeviceId: myDoc._id});
if(!cursor.hasNext()) {
count = count + 1;
}
});
print(count);
I am new to MongoDB, please guide me.
Do you have a Index set on the "DeviceId" field in Interactions collection? If not the Aggregation with the $lookup operator does a collectionscan (on 2million documents) for each of your 2million documents....
So please make sure there is an index on "DeviceId".
Do you plan on running this query often? When the answer is yes, you could store the results of you aggregation in a new collection using the $out operator. This way the initial fill can takes some time but each query on that data performs well. But it's an consideration you have to make.