Tags: c#, mongodb, mongodb-.net-driver

C# MongoDB: upsert matching on a field


I have a list of objects; some already exist in Mongo and some do not.

  • For the ones that exist, I want to update one field.
  • For the ones that do not exist, I want to insert the full Page object.

I want to match them by their url. Is there any way to do this?

var listWrites = new List<WriteModel<Page>>();
// `page` stands for one object from my list; the update below is the part I am unsure about.
var filterDefinition = Builders<Page>.Filter.Eq(p => p.Url, page.Url);
var updateDefinition = Builders<Page>.Update.Set(p => p.Popularity, page.Popularity); // or existing pop + page.Popularity??
listWrites.Add(new UpdateOneModel<Page>(filterDefinition, updateDefinition));
await userCollection.BulkWriteAsync(listWrites);


using MongoDB.Bson;
using MongoDB.Bson.Serialization.Attributes;

public class Page
{
    [BsonId] public ObjectId Id { get; set; }
    [BsonElement("url")] public string Url { get; set; }
    [BsonElement("level")] public int Level { get; set; }
    [BsonElement("languages")] public string Languages { get; set; }
    [BsonElement("proc")] public int Proc { get; set; }
    [BsonElement("domain")] public string Domain { get; set; }
    [BsonElement("len")] public int Len { get; set; }
    [BsonElement("html")] public string Html { get; set; }
    [BsonElement("body")] public string Body { get; set; }
    [BsonElement("title")] public string Title { get; set; }
    [BsonElement("meta")] public string Meta { get; set; }
    [BsonElement("scan_date")] public BsonDateTime ScanDate { get; set; }
    [BsonElement("pop")] public int Popularity { get; set; }
}

Solution

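  • As Joe said in the comments, you can make each update an upsert by setting the IsUpsert property on UpdateOneModel<T>. You then list every property that should be written only when a new document is inserted, using the $setOnInsert update operator. The url field does not need to be listed: on an upsert, MongoDB copies the equality conditions of the filter into the inserted document.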
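
    Before the bulk version, the same idea for a single document might look like the sketch below. It assumes the Page class from the question; the class name SingleUpsertSketch and the method UpsertOneAsync are hypothetical names used only for illustration, and only Body is listed under $setOnInsert to keep it short.

    ```csharp
    using System.Threading.Tasks;
    using MongoDB.Driver;

    public static class SingleUpsertSketch
    {
        // Hypothetical helper: upserts one page, matching on its url.
        // Page is the class from the question.
        public static Task<UpdateResult> UpsertOneAsync(IMongoCollection<Page> collection, Page page)
        {
            var filter = Builders<Page>.Filter.Eq(p => p.Url, page.Url);
            var update = Builders<Page>.Update
                .Set(p => p.Popularity, page.Popularity)  // applied on update and on insert
                .SetOnInsert(p => p.Body, page.Body);     // applied only when a new document is inserted

            // IsUpsert = true asks the server to insert when the filter matches nothing.
            return collection.UpdateOneAsync(filter, update, new UpdateOptions { IsUpsert = true });
        }
    }
    ```

    On the returned UpdateResult, UpsertedId is non-null when a new document was inserted, while MatchedCount is non-zero when an existing one was found.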

    So let's start by setting up a fresh database with some data to play with:

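    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;
    using MongoDB.Driver;

    // With no arguments the client connects to mongodb://localhost:27017.
    var client = new MongoClient();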
    var database = client.GetDatabase("test");
    await client.DropDatabaseAsync(database.DatabaseNamespace.DatabaseName);
    var collection = database.GetCollection<Page>("collection1");
    
    // Create our mix of pages
    var pages = new List<Page>
    {
        new Page {Url = "https://some-url/1", Body = "body1", Popularity = 0, ScanDate = DateTime.UtcNow},
        new Page {Url = "https://some-url/2", Body = "body1", Popularity = 0, ScanDate = DateTime.UtcNow},
        new Page {Url = "https://some-url/3", Body = "body1", Popularity = 0, ScanDate = DateTime.UtcNow}
    };
    
    // Insert the middle one.
    await collection.InsertOneAsync(pages[1]);
    
    Debugger.Break();
    

    Now if we drop into the shell and look at our data so far, we have one page in the collection, which is the one we want to update.

    > use test
    switched to db test
    > show collections
    collection1
    > db.collection1.find().pretty()
    {
            "_id" : ObjectId("5e80824b0664ae4020ee68b3"),
            "url" : "https://some-url/2",
            "level" : 0,
            "languages" : null,
            "proc" : 0,
            "domain" : null,
            "len" : 0,
            "html" : null,
            "body" : "body1",
            "title" : null,
            "meta" : null,
            "scan_date" : ISODate("2020-03-29T11:11:07.700Z"),
            "pop" : 0
    }
    

    Let's now set the popularity of every page in our in-memory list to 100 so that we can see a change.

    // Update all popularity to 100
    pages.ForEach(x => x.Popularity = 100);
    

    We can then use a bit of LINQ to create the update models that we'll send to the bulk write.

    // Create all the updates as a batch
    var updateOneModels = pages.Select(x =>
    {
        var filterDefinition = Builders<Page>.Filter.Eq(p => p.Url, x.Url);
        var updateDefinition = Builders<Page>.Update.Set(p => p.Popularity, x.Popularity)
            .SetOnInsert(p => p.Level, x.Level)
            .SetOnInsert(p => p.Languages, x.Languages)
            .SetOnInsert(p => p.Proc, x.Proc)
            .SetOnInsert(p => p.Domain, x.Domain)
            .SetOnInsert(p => p.Len, x.Len)
            .SetOnInsert(p => p.Html, x.Html)
            .SetOnInsert(p => p.Body, x.Body)
            .SetOnInsert(p => p.Title, x.Title)
            .SetOnInsert(p => p.Meta, x.Meta)
            .SetOnInsert(p => p.ScanDate, x.ScanDate);
    
        return new UpdateOneModel<Page>(filterDefinition, updateDefinition) { IsUpsert = true };
    }).ToList();
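
    The question's `p.pop + Object.pop` hints that the goal may have been to add to the stored popularity rather than overwrite it. If so, a variant using the $inc operator could be sketched as follows. This is an assumption about the intent, not part of the original answer; IncrementUpsertSketch and BuildModels are hypothetical names, and only two fields are listed under $setOnInsert for brevity.

    ```csharp
    using System.Collections.Generic;
    using System.Linq;
    using MongoDB.Driver;

    public static class IncrementUpsertSketch
    {
        // Hypothetical helper: builds one upsert per page, matching on url.
        // Page is the class from the question.
        public static List<WriteModel<Page>> BuildModels(IEnumerable<Page> pages)
        {
            return pages.Select(x =>
            {
                var filter = Builders<Page>.Filter.Eq(p => p.Url, x.Url);
                var update = Builders<Page>.Update
                    // $inc adds to the stored value; on insert the field starts at x.Popularity.
                    .Inc(p => p.Popularity, x.Popularity)
                    .SetOnInsert(p => p.Body, x.Body)
                    .SetOnInsert(p => p.ScanDate, x.ScanDate);

                return (WriteModel<Page>)new UpdateOneModel<Page>(filter, update) { IsUpsert = true };
            }).ToList();
        }
    }
    ```

    Popularity is deliberately not repeated under SetOnInsert here, because MongoDB rejects an update that targets the same field with two operators.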
    

    Now run the batch:

    // Run the batch
    await collection.BulkWriteAsync(updateOneModels);
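
    BulkWriteAsync also returns a result object that reports what happened to each model, which can save a trip to the shell. A minimal sketch, assuming the Page class from the question; BulkResultSketch and RunAndReportAsync are hypothetical names, and the unordered option is an optional choice rather than something the original answer uses.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Threading.Tasks;
    using MongoDB.Driver;

    public static class BulkResultSketch
    {
        // Hypothetical helper: runs the bulk write and prints a summary.
        public static async Task RunAndReportAsync(
            IMongoCollection<Page> collection,
            IEnumerable<WriteModel<Page>> models)
        {
            // The upserts are independent of each other, so the server may process them in any order.
            var options = new BulkWriteOptions { IsOrdered = false };
            BulkWriteResult<Page> result = await collection.BulkWriteAsync(models, options);

            Console.WriteLine($"Matched:  {result.MatchedCount}");
            Console.WriteLine($"Modified: {result.ModifiedCount}");
            Console.WriteLine($"Upserted: {result.Upserts.Count}");

            // Each upsert entry carries the position of its model in the batch and the new _id.
            foreach (var upsert in result.Upserts)
            {
                Console.WriteLine($"  model #{upsert.Index} inserted with _id {upsert.Id}");
            }
        }
    }
    ```

    For the three pages in this walkthrough, one would expect one matched document and two upserts.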
    

    Now if we look at the data from the shell, our middle page has been updated and the other two pages have been inserted.

    > db.collection1.find().pretty()
    {
            "_id" : ObjectId("5e80824b0664ae4020ee68b3"),
            "url" : "https://some-url/2",
            "level" : 0,
            "languages" : null,
            "proc" : 0,
            "domain" : null,
            "len" : 0,
            "html" : null,
            "body" : "body1",
            "title" : null,
            "meta" : null,
            "scan_date" : ISODate("2020-03-29T11:11:07.700Z"),
            "pop" : 100
    }
    {
            "_id" : ObjectId("5e80825cc38a0ff23e1eb326"),
            "url" : "https://some-url/1",
            "body" : "body1",
            "domain" : null,
            "html" : null,
            "languages" : null,
            "len" : 0,
            "level" : 0,
            "meta" : null,
            "pop" : 100,
            "proc" : 0,
            "scan_date" : ISODate("2020-03-29T11:11:07.699Z"),
            "title" : null
    }
    {
            "_id" : ObjectId("5e80825cc38a0ff23e1eb327"),
            "url" : "https://some-url/3",
            "body" : "body1",
            "domain" : null,
            "html" : null,
            "languages" : null,
            "len" : 0,
            "level" : 0,
            "meta" : null,
            "pop" : 100,
            "proc" : 0,
            "scan_date" : ISODate("2020-03-29T11:11:07.700Z"),
            "title" : null
    }
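
    Because every upsert here matches on url, it may be worth backing that field with a unique index: it speeds up the filter, and it keeps two concurrent upserts for the same url from each inserting a document. The sketch below is an optional addition under that assumption, not part of the original answer; UrlIndexSketch and EnsureUniqueUrlIndexAsync are hypothetical names.

    ```csharp
    using System.Threading.Tasks;
    using MongoDB.Driver;

    public static class UrlIndexSketch
    {
        // Hypothetical helper: creates a unique ascending index on the url field.
        // Page is the class from the question. Returns the name of the index.
        public static Task<string> EnsureUniqueUrlIndexAsync(IMongoCollection<Page> collection)
        {
            var keys = Builders<Page>.IndexKeys.Ascending(p => p.Url);
            var model = new CreateIndexModel<Page>(keys, new CreateIndexOptions { Unique = true });
            return collection.Indexes.CreateOneAsync(model);
        }
    }
    ```

    Creating the index fails if the collection already contains duplicate url values, so it is easiest to add before the first bulk write.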