Fetch all records from an API with an offset step of 100 in C#


I am using C# and ASP.NET Core 6 MVC.

I need to fetch all results from an API using an offset, whether there are just 64 records or 6300. The idea is to adjust the offset and make concurrent or parallel calls to get all records at once, and I would like to do this in the best way possible.

The API I am calling returns at most 100 records per call, although the overall total (totalResult) can be around 65, 120, 1500, 2520, 6534, etc. There is an integer offset that I can pass to the API to get the next 100 results each time. By default it is zero, which returns the first 100 records at most.

For example, for a totalResult of 65, offset 0 is sufficient, as it returns all 65 records. For a totalResult of 150, offset 0 returns 100 records, and the next call needs offset 100 to get the rest. Likewise, for 6530 records the offset has to step through 100, 200, 300... to get all results.
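The offset arithmetic described above could be sketched roughly like this. OffsetMath and GetOffsets are hypothetical names used only for illustration, and a fixed page size of 100 is assumed:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OffsetMath
{
    // Hypothetical helper: returns 0, pageSize, 2 * pageSize, ... for every
    // page needed to cover totalResults records.
    public static List<int> GetOffsets(int totalResults, int pageSize = 100)
    {
        if (totalResults <= 0)
        {
            return new List<int>();
        }

        // Ceiling division: 65 -> 1 page, 150 -> 2 pages, 6530 -> 66 pages.
        int pageCount = (totalResults + pageSize - 1) / pageSize;

        return Enumerable.Range(0, pageCount)
                         .Select(page => page * pageSize)
                         .ToList();
    }
}
```

With this sketch, GetOffsets(65) gives just 0, GetOffsets(150) gives 0 and 100, and GetOffsets(6530) gives 66 offsets ending at 6500.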

Now, I need to run these calls in parallel to avoid the delay.

This is my function:

var offset = 0;

// My async call method
var addressResult = await _postcode.GetAddresses(strPostcode, offset);

if (addressResult?.Results != null && addressResult.Results.Any())
{
        // concurrency code to run here with offset
        int total = addressResult.Header.TotalResults; // total results, e.g. 6500
        var thePostcoderesult = addressResult.Results;

        // the count here can be any number, depending on the total results
        int maxresult = thePostcoderesult.Count();
}

So in the end, when all concurrent calls to the API have finished, thePostcoderesult should contain all the results.

var thePostcoderesult = addressResult.Results;

Now, I am aware we can achieve this through

await Parallel.ForEachAsync(offsets, options, async (offset, ct) =>

with the help of the accepted answer to How to make multiple API calls faster?

I tried implementing that logic, but it gives me only up to 1000 results, as if the offsets and the parallel loop are not aligned. The tasks run only 10 times and return 1000 results, although the postcode I am searching for has 1630 results.

Here is my updated code but, as I mentioned, it does not run through all the offsets or wait for them to finish.

var offset = 0;
var addressResult = await _postcode.GetAddresses(strPostcode, offset);

if (addressResult?.Results != null && addressResult.Results.Any())
{
    int total = addressResult.Header.TotalResults;

    // Setting offset here - but something is not right
    IEnumerable<int> offsets = Enumerable
            .Range(0, total)
            .Select(n => checked(n * 100))
            .TakeWhile(offset => offset < Volatile.Read(ref total));

    // wanted to use 10 parallel threads which is a safe bet I believe
    var options = new ParallelOptions() { MaxDegreeOfParallelism = 10 };
 
    var thePostcoderesult = new List<AddressResult>();
    await Parallel.ForEachAsync(offsets, options, async (offset, ct) =>
        {
            var addressResult = await _postcode.GetAddresses(strPostcode, offset);

            if (offset == 0)
            {   //I am not using it
                //Volatile.Write(ref total, Jresult.Results.Count());
            }
            thePostcoderesult.AddRange(addressResult.Results);
        });

    return thePostcoderesult;
}

Apologies in advance for the detailed post. If you can help me do this in a more correct or neater way, you are welcome to.

Many thanks


Solution

  • You've got a lot going on there; I don't think it needs to be quite that complicated. Since the initial GetAddresses call tells you how many records you're going to have, you can do something like this:

    var initialResponse = await _postcode.GetAddresses(strPostcode, 0);
    
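    if (initialResponse?.Results == null || !initialResponse.Results.Any())
    {
      // nothing found: return an empty array so every path returns the same type
      return Array.Empty<AddressResult>();
    }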
    
    var totalPostCodeResults = new AddressResult[initialResponse.Header.TotalResults];
    
    // fill up to the first 100 since you have it and bail if that's all there is
    FillItems(initialResponse.Results, totalPostCodeResults, 0);
    
    if(totalPostCodeResults.Length <= 100)
      return totalPostCodeResults;
    
    // Fill the offsets (aka start indexes) starting at 100
    var offsets = new List<int>();
    var offset = 100;
    while(offset < totalPostCodeResults.Length)
    {
      offsets.Add(offset);
      offset+=100;
    }
    
    // No extra step is needed for the last partial page: the loop above already adds its offset
    
    // Kick off a task for each offset range
    var tasks = new Task[offsets.Count];
    for(int i = 0; i < tasks.Length; i++)
    {
      // copy i to a loop-scoped variable so each lambda captures its own value
      var index = i;
      tasks[index] = Task.Run(async () => {
        var response = await _postcode.GetAddresses(strPostcode, offsets[index]);
        FillItems(response.Results, totalPostCodeResults, offsets[index]);
      });
    }
    
    // Wait for all of them to finish without blocking the calling thread
    await Task.WhenAll(tasks);
    
    return totalPostCodeResults;
    
    void FillItems(IEnumerable<AddressResult> results, AddressResult[] totalArray, int startIndex)
    {
      // each call writes to its own slice of the array, so concurrent calls do not overlap
      var index = startIndex;
      foreach (var item in results)
        totalArray[index++] = item;
    }
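
    If you would rather cap the number of concurrent requests, as the question does with MaxDegreeOfParallelism = 10, the same idea can probably be sketched with Parallel.ForEachAsync. This is only a sketch under assumptions: PagedFetch, FetchAllAsync and the fetchPage delegate are hypothetical names, with fetchPage standing in for _postcode.GetAddresses(strPostcode, offset), and a fixed page size of 100 is assumed. For brevity it fetches every page, including the first one. Since List<T> is not safe for concurrent writers, each page is copied into its own slice of a pre-sized array instead of being added to a shared list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class PagedFetch
{
    // Assumption: the API returns at most 100 records per call.
    public const int PageSize = 100;

    // fetchPage is a hypothetical stand-in for the real API call; it returns
    // the records of the page that starts at the given offset.
    public static async Task<T[]> FetchAllAsync<T>(
        int totalResults,
        Func<int, CancellationToken, Task<IReadOnlyList<T>>> fetchPage,
        int maxParallel = 10,
        CancellationToken cancellationToken = default)
    {
        if (totalResults <= 0)
        {
            return Array.Empty<T>();
        }

        var all = new T[totalResults];

        // Offsets 0, 100, 200, ... covering every page, including the last partial one.
        int pageCount = (totalResults + PageSize - 1) / PageSize;
        var offsets = Enumerable.Range(0, pageCount).Select(page => page * PageSize);

        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = maxParallel,
            CancellationToken = cancellationToken
        };

        await Parallel.ForEachAsync(offsets, options, async (offset, ct) =>
        {
            var page = await fetchPage(offset, ct);

            // Each page writes only to its own slice, so no lock is needed.
            // The bounds check guards against the API returning more than expected.
            for (int i = 0; i < page.Count && offset + i < all.Length; i++)
            {
                all[offset + i] = page[i];
            }
        });

        return all;
    }
}
```

    Because every page lands at its own offset, the results come back in the API's order no matter which request finishes first.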