c# mongodb list parallel-processing parallel.foreach

A Null value cannot be written to the root level of a BSON document


When I tried to insert a list of about 900,000 addresslines records into a MongoDB collection with a single InsertManyAsync call, I received the error message "A Null value cannot be written to the root level of a BSON document".

I have checked the list for null entries but could not find any. Searching for this error message did not turn up anything useful.

                Parallel.For(0, addresslines.Count, async index =>
                {
                    tempAddresses.Add(new TempAddress() {  AddressLine1 = addresslines["AddressLine1"], Village = addresslines["Village"] });

                });

                /* Foreach without parallel works.
                foreach (var item in pincodestrings)
                {
                    tempAddresses.Add(new TempAddress() { AddressLine1 = DateTime.Now.ToLongDateString() }); //, AddressLine2 = "sample2", Dist_City = "sample", Pincode = 1, State = "yy", Town_Taluk = "aa", Village = "vv" });

                }*/

                if (tempAddresses.Count > 0)
                {
                   await _context.QCSubmission.InsertManyAsync(tempAddresses.AsEnumerable(), null);
                }

I have tried with a small number of records, say 100, and that works fine. What is the problem with inserting records in bulk into MongoDB? Do I need to check or correct something in MongoDB?

UPDATE: As per the comments, I have replaced the Parallel loop with a plain foreach, which works, but for processing huge amounts of data, using Parallel is mandatory to speed things up.


Solution

  • The problem seems to lie in multithreaded access to a non-thread-safe collection from System.Collections.Generic. When values are added to it simultaneously from multiple threads, the collection may end up with missing or null entries, lose chunks of its contents, or show other undefined behavior.

    You can use one of the thread-safe collections from System.Collections.Concurrent instead, in this case probably ConcurrentBag<TempAddress>.
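
    As a minimal sketch of that suggestion, the loop from the question could be rewritten along the following lines. The TempAddress class, the shape of the source rows (a list of string dictionaries) and the ConcurrentBagSketch.Build helper are assumptions made for illustration, since the question does not show them. The lambda is also made synchronous, because Parallel.For does not await async lambdas.

    ```csharp
    using System;
    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Linq;
    using System.Threading.Tasks;

    // Hypothetical stand-in for the question's TempAddress document class.
    public class TempAddress
    {
        public string AddressLine1 { get; set; }
        public string Village { get; set; }
    }

    public static class ConcurrentBagSketch
    {
        // Builds one TempAddress per source row in parallel.
        // ConcurrentBag<T>.Add is safe to call from many threads at once.
        public static List<TempAddress> Build(IReadOnlyList<Dictionary<string, string>> rows)
        {
            var bag = new ConcurrentBag<TempAddress>();

            // Synchronous lambda: Parallel.For does not await async lambdas.
            Parallel.For(0, rows.Count, index =>
            {
                bag.Add(new TempAddress
                {
                    AddressLine1 = rows[index]["AddressLine1"],
                    Village = rows[index]["Village"]
                });
            });

            // Note: ConcurrentBag does not preserve insertion order.
            return bag.ToList();
        }

        public static void Main()
        {
            // Made-up sample data standing in for the real address rows.
            var rows = Enumerable.Range(0, 100000)
                .Select(i => new Dictionary<string, string>
                {
                    ["AddressLine1"] = "Line " + i,
                    ["Village"] = "Village " + i
                })
                .ToList();

            List<TempAddress> result = Build(rows);

            Console.WriteLine(result.Count);               // 100000
            Console.WriteLine(result.Any(a => a == null)); // False
        }
    }
    ```

    The resulting list can then be passed to InsertManyAsync as in the question. If the order of the documents matters, the bag is not a good fit, because it returns items in no particular order.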

    Edit:

    I have made a short test to compare the performance of locking and thread-safe collections in parallel against a normal for loop writing to a standard generic collection. You may want to run a similar test on your data and see how it compares. I ran this on .NET Fiddle, and the results suggested that using a normal list may be faster.

    However, the more work the loop body does besides writing to the collection, the more parallelism will pay off.

    using System;
    using System.Diagnostics;
    
    public static class Module1
    {
        public static void Main()
        {
            System.Collections.Concurrent.ConcurrentBag<int> bag = new System.Collections.Concurrent.ConcurrentBag<int>();
            // Test Bag Parallel
            Stopwatch t = Stopwatch.StartNew();
            System.Threading.Tasks.Parallel.For(0, 500000, index =>
            {
                bag.Add(index);
            });
            t.Stop();
            Console.WriteLine("Parallel Bag test completed in " + t.ElapsedTicks.ToString());
            // Test Bag Incremental
            bag = new System.Collections.Concurrent.ConcurrentBag<int>();
            t = Stopwatch.StartNew();
            for (int index = 0; index < 500000; index++) // same 500000 items as the parallel test
            {
                bag.Add(index);
            }
            t.Stop();
            Console.WriteLine("Incremental Bag test completed in " + t.ElapsedTicks.ToString());
            bag = null;
            // Test List Incremental
            System.Collections.Generic.List<int> lst = new System.Collections.Generic.List<int>();
            t = Stopwatch.StartNew(); // start timing only once the list exists
            for (int index = 0; index < 500000; index++)
            {
            {
                lst.Add(index);
            }
            t.Stop();
            Console.WriteLine("Incremental list test completed in " + t.ElapsedTicks.ToString());
        }
    }
    

    Output (elapsed Stopwatch ticks):

    Parallel Bag test completed in 229264
    Incremental Bag test completed in 1115224
    Incremental list test completed in 42385
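
The test above does not include a lock-protected List<T>, so here is a rough sketch of that variant, together with a second option that writes each result into its own slot of a pre-sized array. Neither variant comes from the original test; the class and method names (LockedListSketch, FillWithLock, FillByIndex) are made up for illustration, and no timings are claimed for them.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class LockedListSketch
{
    // Variant 1: a plain List<int> guarded by a lock.
    // Only the Add call is serialized; any work before it still runs in parallel.
    public static List<int> FillWithLock(int count)
    {
        var list = new List<int>(count);
        var gate = new object();

        Parallel.For(0, count, index =>
        {
            int value = index * 2; // placeholder for real per-item work
            lock (gate)
            {
                list.Add(value);
            }
        });

        return list; // item order is not guaranteed
    }

    // Variant 2: a pre-sized array where each iteration writes only its own slot.
    // No lock is needed because no two iterations touch the same element,
    // and the results stay in index order.
    public static int[] FillByIndex(int count)
    {
        var results = new int[count];

        Parallel.For(0, count, index =>
        {
            results[index] = index * 2; // placeholder for real per-item work
        });

        return results;
    }

    public static void Main()
    {
        const int count = 500000;

        List<int> locked = FillWithLock(count);
        int[] indexed = FillByIndex(count);

        Console.WriteLine(locked.Count);   // 500000
        Console.WriteLine(indexed.Length); // 500000
        Console.WriteLine(indexed[10]);    // 20
    }
}
```

Wrapping either method in a Stopwatch, as in the test above, would show how these variants compare on a given machine and workload.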