entity-framework-core azure-cosmosdb azure-cosmosdb-sqlapi

ToListAsync/ToList taking too long with Azure Cosmos Entity Framework Core


I am using Microsoft.EntityFrameworkCore.Cosmos to fetch results from Azure Cosmos DB. Here is the code.

public async Task<IEnumerable<BookingEntity>> GetAll(string partnerId)
{
    var result = await _context.Bookings.Where(x => x.PartnerId == partnerId).ToListAsync();

    return result;
}

PROBLEM: When await _context.Bookings.Where(x => x.PartnerId == partnerId).ToListAsync() is executed, it takes around 5 seconds to fetch the results. It is the same even if I use ToList(). Is there anything I can improve here to fetch the results more quickly?


Solution

  • First of all, keep in mind that the Cosmos DB Emulator doesn't support all the features and optimizations of actual cloud-hosted instances. Depending on how you've configured your emulator, you may well be running into RU limits, which throttle your throughput when you request a large number of records.

    Keeping that in mind, there are a few things that you could/should work on to improve your performance:

    1. Do you really need all of the entities that you're requesting, or could you, for instance, limit the number returned by using paging? Current versions of EF Core allow you to page your results using the typical Skip/Take pattern (skip and take below stand for your own offset and page size):
    var result = await _context.Bookings
        .Where(x => x.PartnerId == partnerId)
        .OrderBy(a => a.DateCreated)
        .Skip(skip)
        .Take(take)
        .ToListAsync();
    

    Meanwhile, EF Core 9 (still in preview) incorporates a far more efficient paging mechanism based on continuation tokens:

    var firstPage = await context.Sessions
        .OrderBy(s => s.Id)
        .ToPageAsync(pageSize: 10, continuationToken: null);
    
    string continuationToken = firstPage.ContinuationToken;
    foreach (var session in firstPage.Values)
    {
        // Display/send the sessions to the user
    }
    
    2. Do you really need to return the full BookingEntity object, or could you return a stripped-down version with only the fields that you actually need? This would most likely greatly reduce the amount of data sent over the network and, more importantly, the cost of deserializing the objects.

    3. If you're only reading the data and don't have to modify the records, you should also consider adding .AsNoTracking() to your query to avoid the overhead introduced by EF's change tracking.
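
The projection and no-tracking suggestions above could look roughly like the sketch below. It is only an illustration under assumptions: BookingSummary, BookingContext, BookingRepository and the Id property are hypothetical names that do not appear in the question, and the real BookingEntity will have more fields. The actual gain depends on document size and on how the container is partitioned.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

// Assumed entity shape: only PartnerId and DateCreated appear in the thread;
// Id is a hypothetical key property.
public class BookingEntity
{
    public string Id { get; set; } = "";
    public string PartnerId { get; set; } = "";
    public DateTime DateCreated { get; set; }
    // ... other, potentially large, properties
}

// Hypothetical slim read model holding only the fields the caller needs.
public class BookingSummary
{
    public string Id { get; set; } = "";
    public DateTime DateCreated { get; set; }
}

// Hypothetical context; the Cosmos provider is configured elsewhere (UseCosmos).
public class BookingContext : DbContext
{
    public BookingContext(DbContextOptions<BookingContext> options) : base(options) { }

    public DbSet<BookingEntity> Bookings => Set<BookingEntity>();
}

public class BookingRepository
{
    private readonly BookingContext _context;

    public BookingRepository(BookingContext context) => _context = context;

    // Projection: the query selects two properties instead of whole documents,
    // so less data has to be transferred and deserialized.
    public async Task<List<BookingSummary>> GetSummaries(string partnerId)
    {
        return await _context.Bookings
            .Where(x => x.PartnerId == partnerId)
            .Select(x => new BookingSummary { Id = x.Id, DateCreated = x.DateCreated })
            .ToListAsync();
    }

    // Read-only query for full entities: AsNoTracking skips change tracking.
    public async Task<List<BookingEntity>> GetAllReadOnly(string partnerId)
    {
        return await _context.Bookings
            .AsNoTracking()
            .Where(x => x.PartnerId == partnerId)
            .ToListAsync();
    }
}
```

Projected results that are not entity types are never tracked, so AsNoTracking only matters for the second method. If every read in the application is read-only, the default tracking behaviour can instead be changed once on the context options rather than per query.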