Search code examples
javamongodbaggregateprojectionmongodb-java

How to project more than the first sub-document when using $elemMatch


I have a collection with documents of the following form:

{
    "_id" : { "$oid" : "67bg............"},
    "ID"  : "xxxxxxxx",
    "senses" : [
        {
            "word"   : "hello",
            "lang"   : "EN",
            "source" : "EN_DICTIONARY"
        },
        {
            "word"   : "coche",
            "lang"   : "ES",
            "source" : "ES_DICTIONARY"
        },
        {
            "word"   : "bye",
            "lang"   : "EN",
            "source" : "EN_DICTIONARY"
        }
    ]
}

I want to find all documents that match at least one sense with lang=X and source=Y and return the matched Documents with only those senses which match lang=X and source=Y.

I tried this:

DBObject sensesQuery = new BasicDBObject();
sensesQuery.put("lang", "EN");
sensesQuery.put("source", "EN_DICTIONARY");
DBObject matchQuery = new BasicDBObject("$elemMatch",sensesQuery);

DBObject fields = new BasicDBOject();
fields.put("senses",matchQuery);

DBObject projection = new BasicDBObject();
projection.put("ID",1)
projection.put("senses",matchQuery);
DBCursor cursor = collection.find(fields,projection)

while(cursor.hasNext()) {
    ...
}

My query works for matching documents, but not for the projection. Taking the above document as an example, if I run my query I get this result:

{
    "_id" : { "$oid" : "67bg............"},
    "ID"  : "xxxxxxxx",
    "senses" : [
        {
            "word"   : "hello",
            "lang"   : "EN",
            "source" : "EN_DICTIONARY"
        }
    ]
}

But I want this :

{
    "_id" : { "$oid" : "67bg............"},
    "ID"  : "xxxxxxxx",
    "senses" : [
        {
            "word"   : "hello",
            "lang"   : "EN",
            "source" : "EN_DICTIONARY"
        },
        {
            "word"   : "bye",
            "lang"   : "EN",
            "source" : "EN_DICTIONARY"
        }
    ]
}

I read about aggregation but I did not understand how to use it in the MongoDB Java driver.

Thanks


Solution

  • You are using the $elemMatch operator on the projection aswell as on the filter.

    From the docs

    The $elemMatch operator limits the contents of an field from the query results to contain only the first element matching the $elemMatch condition.

    So, the behaviour you are seeing is the expected behaviour for elemMatch-in-a-projection.

    If you want to project all sub documents in the senses array within documents which match the filter condition then you could use this:

    projection.put("senses", 1);
    

    But, if you want to project only those sub documents which match your filter condition then $elemMatch will not work for you since it only ever returns the first element matching the $elemMatch condition. Your alternative is to use the aggregation framework, for example:

    db.collection.aggregate([
      // matches documents with a senses sub document having the given lang and source values
      {$match: {'senses.lang': 'EN', 'senses.source': 'EN_DICTIONARY'}},
    
      // projects on the senses sub document and filters the output to only return sub 
      // documents having the given lang and source values
      {$project: {
          senses: {
            $filter: {
                input: "$senses",
                as: "sense",
                cond: { $eq: [ "$$sense.lang", 'EN' ], $eq: [ "$$sense.source", 'EN_DICTIONARY' ] }
              }
            }
          }
      }
    ])
    

    Here's that aggregation call using the MongoDB Java driver:

    Document filter = new Document("senses.lang", "EN").append("senses.source", "EN_DICTIONARY");
    
    DBObject filterExpression = new BasicDBObject();
    filterExpression.put("input", "$senses");
    filterExpression.put("as", "sense");
    filterExpression.put("cond", new BasicDBObject("$and", Arrays.<Object>asList(
            new BasicDBObject("$eq", Arrays.<Object>asList("$$sense.lang", "EN")),
            new BasicDBObject("$eq", Arrays.<Object>asList("$$sense.source", "EN_DICTIONARY")))
    ));
    
    BasicDBObject projectionFilter = new BasicDBObject("$filter", filterExpression);
    
    AggregateIterable<Document> documents = collection.aggregate(Arrays.asList(
            new Document("$match", filter),
            new Document("$project", new Document("senses", projectionFilter))));
    
    for (Document document : documents) {
        logger.info("{}", document.toJson());
    }
    

    The resulting output is:

    2017-10-01 17:15:39 [main] INFO  c.s.mongo.MongoClientTest - { "_id" : { "$oid" : "59d10cdfc26584cd8b7a0d3b" }, "senses" : [{ "word" : "hello", "lang" : "EN", "source" : "EN_DICTIONARY" }, { "word" : "bye", "lang" : "EN", "source" : "EN_DICTIONARY" }] }
    

    Update 1: following this comment:

    After a long period of testing, trying to understand why the query was slow, I noticed that the "$match" parameter does not work, the query should select only records that have at least one sense with source = Y AND lang = X and project them , but the query also returns me documents with senses = []

    This filter: new Document("senses.lang", "EN").append("senses.source", "EN_DICTIONARY") will not match documents which have no senses attribute nor will it match documents which have an empty senses attribute. To verify this I added the following documents to my own collection:

    {
        "_id" : ObjectId("59d72a24c26584cd8b7b70a5"),
        "ID" : "yyyyyyyy"
    }
    
    {
        "_id" : ObjectId("59d72a3ac26584cd8b7b70ae"),
        "ID" : "zzzzzzzzz",
        "senses" : []
    }
    

    And re ran the above code and I still get the desired result.

    I suspect your statment that the above code does not work is either a false negative or the documents you are querying are different to the sample I have been working with.

    To help you diagnose this issue for yourself you could ...

    • Play around with other operators e.g. the $match stage behaves the same with and without an $exists operator:

      new Document("senses", new BasicDBObject("$exists", true))
              .append("senses.lang", new BasicDBObject("$eq", "EN"))
              .append("senses.source", new BasicDBObject("$eq", "EN_DICTIONARY"))
      
    • Remove the $project stage to see exactly what the $match stage produces.