
ADLS ConcurrentAppend giving corrupt data for 1 MB files

When I use a Parallel.For loop to append 10 files of 1 MB each concurrently to Azure Data Lake Store, the resulting file contains only the content of the last 2 files, although the correct data is printed to the console.

When I use a simple for loop instead of Parallel.For, the data appended to the file is correct.

Any help?

Parallel.For(0, 10, i =>
{
    path[i] = @"C:\Users\t-chkum\Desktop\InputFiles\1MB\" + (i + 1) + ".txt";
    FileStream stream = File.OpenRead(path[i]);
    stream.Read(buffer, 0, buffer.Length);
    client.ConcurrentAppend(fileName, true, buffer, 0, buffer.Length);
});



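  • It was actually a critical-section problem: every iteration reads into the same shared buffer, so one thread can overwrite it while another is still appending from it. It can be solved using either a BlockingCollection or a lock: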

    BlockingCollection<int> b = new BlockingCollection<int>(1); // capacity 1: one iteration at a time
    Parallel.For(0, 10, i =>
    {
        b.Add(i); // blocks until the single slot is free
        try
        {
            path[i] = @"C:\Users\t-chkum\Desktop\InputFiles\1MB\" + (i + 1) + ".txt";
            using (FileStream stream = File.OpenRead(path[i]))
                stream.Read(buffer, 0, buffer.Length);
            client.ConcurrentAppend(fileName, true, buffer, 0, buffer.Length);
            Array.Clear(buffer, 0, buffer.Length);
        }
        finally { b.Take(); } // free the slot for the next iteration
    });

    The above code solves the problem for me :)
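
The lock variant mentioned in this answer could look roughly like the sketch below. It is a self-contained illustration rather than the original code: it does not call ADLS at all, a List<byte> stands in for the remote file, and the names LockedAppendDemo, AppendAll and IsIntact are hypothetical. Block i is filled with the byte value i + 1 so that any overwrite of the shared buffer would show up as a mixed or missing block.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class LockedAppendDemo
{
    // Appends fileCount blocks of blockSize bytes from parallel iterations
    // that all reuse ONE shared buffer, as in the question.
    // Assumption: fileCount <= 255, because the block value is stored in a byte.
    public static byte[] AppendAll(int fileCount, int blockSize)
    {
        var target = new List<byte>();    // hypothetical stand-in for the remote file
        var buffer = new byte[blockSize]; // shared across all iterations
        var gate = new object();

        Parallel.For(0, fileCount, i =>
        {
            // Filling the buffer and appending it form one critical section,
            // so no other iteration can touch the buffer in between.
            lock (gate)
            {
                for (int k = 0; k < buffer.Length; k++)
                {
                    buffer[k] = (byte)(i + 1);
                }
                target.AddRange(buffer);
            }
        });

        // Parallel.For returns only after every iteration has finished.
        return target.ToArray();
    }

    // Returns true when every block holds a single repeated value and each
    // value from 1 to fileCount appears in exactly one block (in any order).
    public static bool IsIntact(byte[] data, int fileCount, int blockSize)
    {
        if (data.Length != fileCount * blockSize) return false;

        var seen = new HashSet<byte>();
        for (int start = 0; start < data.Length; start += blockSize)
        {
            byte first = data[start];
            for (int k = 1; k < blockSize; k++)
            {
                if (data[start + k] != first) return false;
            }
            if (first < 1 || first > fileCount || !seen.Add(first)) return false;
        }
        return seen.Count == fileCount;
    }
}
```

Calling LockedAppendDemo.AppendAll(10, 1024) and passing the result to IsIntact should report that all ten blocks arrived whole, in whatever order the iterations ran. Because the lock serializes the read and the append, the loop no longer gains much from running in parallel; giving each iteration its own buffer is another option that avoids sharing altogether.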