Search code examples
mongodbmongodb-java

MongoDB Optional Partial Unique Index in Nested Arrays


I am trying to create a solution for a unique partial index on token field in a nested array apps.tokens, such that the nested array tokens is optional or can be empty.

I create the index as:

collection.createIndex(
        Indexes.ascending("apps.tokens.token"),
        new IndexOptions()
                .unique(true)
                .partialFilterExpression(
                        Filters.type("apps.tokens.token", BsonType.STRING)
                )
);

The value of the field apps.tokens.token is never explicitly null and will always be some unique string. I am currently not worried about duplicates within the same document.

However, I can't get the partial index to behave the way I would expect. It is mostly working as intended, except for situations when there is an item in the apps array with an empty or missing tokens array.

Creating the following structure fails with error E11000 duplicate key error collection: db1.testCollection index: apps.tokens.token_1 dup key: { apps.tokens.token: null } :

[
    {
        "apps": [
            {
                "client_id": "capp1",
                "tokens": [
                    {
                        "token": "t1",
                        "expiration": "2020-09-10T23:31:17.119+01:00"
                    }
                ]
            },
            {
                "client_id": "capp2"
            }
        ],
        "uuid": "89337f58-a491-4e17-b8dd-726c9319dcaa"
    },
    {
        "apps": [
            {
                "client_id": "capp3",
                "tokens": [
                    {
                        "token": "t2",
                        "expiration": "2020-09-10T23:31:17.119+01:00"
                    }
                ]
            },
            {
                "client_id": "capp4"
            }
        ],
        "uuid": "4ccc4d81-990f-4650-b26e-1d26fd22d91a"
    }
]

However, this structure is perfectly valid according to the same index:

[
    {
        "apps": [
            {
                "client_id": "capp1"
            },
            {
                "client_id": "capp2"
            }
        ],
        "uuid": "89337f58-a491-4e17-b8dd-726c9319dcaa"
    },
    {
        "apps": [
            {
                "client_id": "capp3"
            },
            {
                "client_id": "capp4"
            }
        ],
        "uuid": "4ccc4d81-990f-4650-b26e-1d26fd22d91a"
    }
]

My guess is that the first test case fails, because, having the first item inserted, the index checks that it has a apps.token.token field that is a String and adds this whole document to the insert/update comparison.

On the other hand, the second test case does not fail, because none of the documents match the condition of apps.tokens.token being a String.

As it looks at the second item to be inserted, it somehow deduces that it has a apps.token.token field that is implicitly null (because there is no tokens array in one of the apps items), then it checks whether the existing item matches {"apps.tokens.token": null} and indeed it does, and ends the operation in a failure.

What am I doing wrong?

I have tried to create the partial index with exists filter too, but it does not help.

Filters.and(
        Filters.type("apps.tokens.token", BsonType.STRING),
        Filters.exists("apps.tokens.token"),
        Filters.exists("apps.tokens")
)

Is it possible to supplement the filter with some sort of function that will handle cases when tokens does not exist or is empty for each apps item in a document?


Solution

  • Looks like the solution may be to use a sparse index, although the official documentation states:

    Partial indexes offer a superset of the functionality of sparse indexes. If you are using MongoDB 3.2 or later, partial indexes should be preferred over sparse indexes.

    My tests are passing with:

    collection.createIndex(
            Indexes.ascending("apps.tokens.token"),
            new IndexOptions()
                    .unique(true)
                    .sparse(true)
    );
    

    I wonder if this has any other implications that are not currently obvious to me.

    For the fullness of the solution, note that the index addresses uniqueness across documents. However, it won't check for uniqueness within the same document, so it would be possible to add a token that already exists in one of the apps in the same document. To work around this issue, I add a filter to the update query, such that a document that already has a token I'm trying to add is not included in the documents that would be updated:

    Document doc = Document.parse("{\"token\":\"t1\"}");
    collection.updateOne(
            Filters.and(
                    Filters.eq("uuid", "89337f58-a491-4e17-b8dd-726c9319dcaa"),
                    Filters.not(Filters.eq("apps.tokens.token", "t1"))
            ),
    
            Updates.push("apps.$[app].tokens", doc),
            new UpdateOptions().arrayFilters(Arrays.asList(
                    Filters.eq("app.client_id", "capp1")
            ))
    );