c# asynchronous streamreader asp.net-core-2.1

Process incoming FileStream asynchronously


I'm reading a file from user upload and it was working synchronously. I needed to change it in order to immediately send a "received" alert to the user, then read the file asynchronously while the user would periodically poll back to see if the read was finished.

Here is what my code looks like right now:

public FileUpload SaveFile(Stream stream)
{
        FileUpload uploadObj = //instantiate the return obj

        var task = Task.Run(async () => await ProcessFileAsync(stream));
        
        return uploadObj;
}

public async Task ProcessFileAsync(Stream stream)
{
        StreamReader file = new StreamReader(stream);
        CsvReader csv = new CsvReader(file, CultureInfo.InvariantCulture);
        
        while (await csv.ReadAsync())
        {
           //read the file
        }
}

The issue I'm having is that by the time I call the csv.ReadAsync() method, the Stream object has been disposed. How do I access the Stream when I want the SaveFile() method to return a value to the user, but the act of returning disposes the Stream object?


Solution

  • The point here is that you're working within the constraints of ASP.NET, which abstracts away a lot of the underlying HTTP stuff.

    When you say you want to process a user-uploaded file asynchronously, you want to step out of the normal order of doing things with HTTP and ASP.NET. You see, when a client sends a request with a body (the file), the server receives the request headers and kicks off ASP.NET to tell your application code that there's a new request incoming.

    It hasn't even (fully) read the request body at this point. This is why you get a Stream to deal with the request, and not a string or a filename: the data doesn't have to have arrived at the server yet! Only the request headers have, informing the web server about the request.

    If you return a response at that point, for all HTTP and ASP.NET care, you're done with the request, and you cannot continue reading its body.

    What you want to do is read the request body (the file) and process it after sending a response to the client. You can do that, but you'll still have to read the request body first, because if you return something from your action method before reading the request, the framework will think you're done with it and dispose the request stream. That's what's causing your exception.
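    A minimal sketch of that failure mode, outside ASP.NET. The using block stands in for the framework owning the request stream, and a MemoryStream stands in for the request body; both are assumptions made for illustration, and the framework's real cleanup is more involved than a using block. DisposedStreamDemo and BackgroundReadFailsAsync are hypothetical names.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class DisposedStreamDemo
{
    // Returns true if the background read failed because the stream
    // had already been disposed by its owner.
    public static async Task<bool> BackgroundReadFailsAsync()
    {
        var callerReturned = new TaskCompletionSource<bool>();
        Task<int> background;

        // The using block plays the role of the framework, which owns the stream.
        using (var stream = new MemoryStream(new byte[] { 1, 2, 3 }))
        {
            background = Task.Run(async () =>
            {
                await callerReturned.Task; // wait until the "action method" has returned
                return stream.ReadByte();  // too late: the owner has disposed the stream
            });
        } // stream.Dispose() runs here, like the end of the request

        callerReturned.SetResult(true);

        try
        {
            await background;
            return false;
        }
        catch (ObjectDisposedException)
        {
            return true;
        }
    }
}
```

    The background task only starts reading after the owner has let go of the stream, which is the same ordering as returning from SaveFile() before ProcessFileAsync() has read anything.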

    If you used a string, or model binding, or anything else that involves the framework reading the request body, then yes, your code would only execute once the body had been read.

    The short-term solution that would appear to get you going is to copy the request stream into a stream that you own, rather than one the framework owns:

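    var myStream = new MemoryStream();
    // CopyToAsync is the awaitable method; CopyTo returns void and cannot be awaited
    await stream.CopyToAsync(myStream);
    // Rewind so the reader starts at the beginning of the copied data
    myStream.Position = 0;
    Task.Run(async () => await ProcessFileAsync(myStream));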
    

    Now you'll have read the entire request body and saved it in memory, so ASP.NET can safely dispose the request stream and send a response to the client.

    But don't do this. Starting fire-and-forget tasks from a controller is a bad idea. Keeping uploaded files in memory is a bad idea.

    What you actually should do, if you still want to do this out-of-band:

    • Save the incoming file as an actual, temporary file on your server
    • Send a response to the client with an identifier (the temporarily generated filename, for example a GUID)
    • Expose an endpoint that clients can use to request the status using said GUID
    • Have a background process continuously scan the directory for newly uploaded files and process them

    For the latter you could use hosted services or third-party tools like Hangfire.
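    The four steps above could be sketched roughly as follows, using only the base class library. This is an illustration under assumptions, not a finished implementation: UploadStore, UploadStatus, the .tmp/.upload/.done file extensions, and the processFile callback are all hypothetical, and status is derived purely from which file exists on disk.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public enum UploadStatus { Unknown, Pending, Done }

public class UploadStore
{
    private readonly string _directory;

    public UploadStore(string directory)
    {
        _directory = directory;
        Directory.CreateDirectory(directory);
    }

    // Steps 1 and 2: copy the request body into a file we own, return an identifier.
    public async Task<Guid> SaveAsync(Stream requestBody)
    {
        var id = Guid.NewGuid();
        var temp = PathFor(id, ".tmp");
        using (var file = File.Create(temp))
        {
            await requestBody.CopyToAsync(file);
        }
        // Rename only once fully written, so the scanner never sees a partial file.
        File.Move(temp, PathFor(id, ".upload"));
        return id;
    }

    // Step 3: what a status endpoint could report for a given identifier.
    public UploadStatus GetStatus(Guid id)
    {
        if (File.Exists(PathFor(id, ".done"))) return UploadStatus.Done;
        if (File.Exists(PathFor(id, ".upload"))) return UploadStatus.Pending;
        return UploadStatus.Unknown;
    }

    // Step 4: one scan of the directory; a background worker would call this repeatedly.
    public async Task ProcessPendingAsync(Func<Stream, Task> processFile)
    {
        foreach (var path in Directory.GetFiles(_directory, "*.upload"))
        {
            using (var file = File.OpenRead(path))
            {
                await processFile(file);
            }
            // Mark as finished by renaming; a real system might delete or archive instead.
            File.Move(path, Path.ChangeExtension(path, ".done"));
        }
    }

    private string PathFor(Guid id, string extension) =>
        Path.Combine(_directory, id.ToString("N") + extension);
}
```

    In a controller, the action would await SaveAsync(Request.Body) and return the GUID, while a hosted service (or a Hangfire job) would call ProcessPendingAsync in a loop with the CSV-reading logic passed in as processFile. Error handling, retries, and cleanup of old files are left out here.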