I am using the following MongoDB aggregation to get number of articles for each day and the number of unique pinned articles :
Collection.aggregate([
{
$match: {
date: { $lt: myDate },
article: { $in: articles }
}
},
{
$group: {
_id: { $dateToString: { format: "%Y-%m-%d", date: "$date" } },
nbArticles: { $sum: 1 },
distinctPinnedArticles: {
$cond: {
if: { $eq: ['$isPinned', 1] },
then: { $addToSet: "$articleID" },
else: null
}
}
}
},
{
},
{
$project: {
_id: 0,
nbArticles: 1,
distinctPinnedArticlesLength: { $size: { $setUnion: ["$distinctPinnedArticles"] } }
}
}
]
I know the condition does not work like this in MongoDB, I just want to illustrate what i want.
The idea is to apply to first filter on nbArticles
and then add a new filter isPinned == 1
to count the distinct articleIDs, I don't want to use facet because i am dealing with a huge number of data, so the facet can't handle a BSON with size > 16bytes.
The expected output :
[
{
nbArticles: 1000,
distinctPinnedArticles: 500
},
{
nbArticles: 600,
distinctPinnedArticles: 300
},
{
nbArticles: 1300,
distinctPinnedArticles: 800
},
{
nbArticles: 70,
distinctPinnedArticles: 40
}
]
Sample data:
[{
_id: ObjectId("6146b91a4e98146bfba070a9"),
article: 'article1',
articleID: 'article_1',
isPinned: 0,
date: ISODate("2022-02-15T18:24:37.000Z")
}, {
_id: ObjectId("6146b95c4e98146bfba070aa"),
article: 'article2',
articleID: 'article_2',
isPinned: 0,
date: ISODate("2021-12-01T10:45:22.000Z")
}
, {
_id: ObjectId("6146b9834e98146bfba070ab"),
article: 'article1',
articleID: 'article_3',
isPinned: 1,
date: ISODate("2022-01-05T08:12:59.000Z")
}
, {
_id: ObjectId("6146b9ae4e98146bfba070ac"),
article: 'article2',
articleID: 'article_4',
isPinned: 0,
date: ISODate("2021-11-10T14:30:45.000Z")
}
, {
_id: ObjectId("6146b9d54e98146bfba070ad"),
article: 'article1',
articleID: 'article_5',
isPinned: 1,
date: ISODate("2022-03-01T09:51:02.000Z")
}
, {
_id: ObjectId("6146ba014e98146bfba070ae"),
article: 'article3',
articleID: 'article_6',
isPinned: 1,
date: ISODate("2021-12-20T16:08:31.000Z")
}
, {
_id: ObjectId("6146ba2b4e98146bfba070af"),
article: 'article2',
articleID: 'article_7',
isPinned: 1,
date: ISODate("2022-01-25T12:57:09.000Z")
}
, {
_id: ObjectId("6146ba544e98146bfba070b0"),
article: 'article3',
articleID: 'article_8',
isPinned: 1,
date: ISODate("2022-02-10T17:39:18.000Z")
}
, {
_id: ObjectId("6146ba7c4e98146bfba070b1"),
article: 'article1',
articleID: 'article_9',
isPinned: 0,
date: ISODate("2021-11-30T08:20:57.000Z")
}]
Thank you in advance.
What about this one:
db.collection.aggregate([
{
$group: {
_id: { $dateTrunc: { date: "$date", unit: "day" } },
nbArticles: { $sum: 1 },
distinctPinnedArticles: {
$addToSet: {
$let: {
vars: {
pinnedArticleID: {
$cond: {
if: { $eq: ['$isPinned', 1] },
then: "$articleID",
else: null
}
}
},
in: "$$pinnedArticleID"
}
}
}
}
},
{
$set: {
distinctPinnedArticles: {
$filter: {
input: "$distinctPinnedArticles",
cond: "$$this"
}
}
}
}
])