Tags: c#, garbage-collection, enumerate

How to write an IAsyncEnumerable function so that cleanup code always executes


My IAsyncEnumerable<T> function must run its cleanup code regardless of whether the enumerator it returns is disposed of correctly. In the example below, I've used returning a byte array to the array pool as an example of mandatory cleanup code. Each test stops using the enumerator before the sequence has finished. All but the last of the tests correctly dispose of the enumerator via await foreach, but the last test deliberately does not dispose of it. Instead, it lets the enumerator pass out of scope and then triggers garbage collection to see whether the runtime itself can run the remaining cleanup code.

If the cleanup code runs correctly for each test scenario, the expected output would be:

Starting 'Cancellation token'.
Finalizing 'Cancellation token'.
Starting 'Exception'.
Finalizing 'Exception'.
Starting 'Break'.
Finalizing 'Break'.
Starting 'Forget to dispose'.
Finalizing 'Forget to dispose'.

However, I have never been able to create a test in which the final expected output line appears.

Here is the test code:

using System;
using System.Buffers;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// -- Cancellation token test --
var cts = new CancellationTokenSource();
await foreach (var (index, bytes) in GetDataPackets("Cancellation token", cts.Token)) {
    if (index == 2)
        cts.Cancel();
}

// -- Thrown exception test --
try {
    await foreach (var (index, bytes) in GetDataPackets("Exception")) {
        if (index == 2)
            throw new Exception("Boom");
    }
} catch { }

// -- With Break test --
await foreach (var (index, bytes) in GetDataPackets("Break")) {
    if (index == 2)
        break;
}

// -- Forget to dispose test --
// Create variables and forget them in another "no inlining" method
// to make sure they have gone out of scope and are available for garbage collection.
await ForgetToDispose();
GC.Collect();
GC.WaitForPendingFinalizers();

[MethodImpl(MethodImplOptions.NoInlining)]
async Task ForgetToDispose() {
    var enumerable = GetDataPackets("Forget to dispose");
    var enumerator = enumerable.GetAsyncEnumerator();
    await enumerator.MoveNextAsync();
    while (enumerator.Current.Index != 2)
        await enumerator.MoveNextAsync();
}

static async IAsyncEnumerable<(int Index, Memory<byte> Bytes)> GetDataPackets(string jobName, [EnumeratorCancellation] CancellationToken cancellationToken = default) {
    Console.WriteLine($"Starting '{jobName}'.");
    var rand = new Random();
    var buffer = ArrayPool<byte>.Shared.Rent(512);
    try {
        for (var i = 0; i < 10; i++) {
            try {
                await Task.Delay(10, cancellationToken);
            } catch (OperationCanceledException) {
                yield break;
            }
            rand.NextBytes(buffer);
            yield return (i, new Memory<byte>(buffer));
        }
    } finally {
        Console.WriteLine($"Finalizing '{jobName}'.");
        ArrayPool<byte>.Shared.Return(buffer);
    }
}

I then tried a few ideas that might help. None of them did:

Idea 1: Add a using statement and a disposable struct that performs the cleanup work: Fail.

static async IAsyncEnumerable<(int Index, Memory<byte> Bytes)> GetDataPackets(string jobName, [EnumeratorCancellation] CancellationToken cancellationToken = default) {
    Console.WriteLine($"Starting '{jobName}'.");
    var rand = new Random();
    var buffer = ArrayPool<byte>.Shared.Rent(512);
    using var disposer = new Disposer(() => {
        Console.WriteLine($"Finalizing '{jobName}'.");
        ArrayPool<byte>.Shared.Return(buffer);
    });
    for (var i = 0; i < 10; i++) {
        try {
            await Task.Delay(10, cancellationToken);
        } catch (OperationCanceledException) {
            yield break;
        }
        rand.NextBytes(buffer);
        yield return (i, new Memory<byte>(buffer));
    }
}

readonly struct Disposer : IDisposable {
    readonly Action _disposeAction;
    public Disposer(Action disposeAction)
        => _disposeAction = disposeAction;
    public void Dispose() {
        _disposeAction();
    }
}

Idea 2: Convert the Disposer struct to a class with a finalizer method in the hope that its finalizer might get triggered: Also fail.

class Disposer : IDisposable {
    readonly Action _disposeAction;
    public Disposer(Action disposeAction)
        => _disposeAction = disposeAction;
    public void Dispose() {
        _disposeAction();
        GC.SuppressFinalize(this);
    }
    ~Disposer() {
        _disposeAction();
    }
}

Aside from writing my own enumerator class from scratch, how can I make this compiler-generated enumerator always run its cleanup code, even on the finalizer thread when it has not been correctly disposed?


Solution

  • My IAsyncEnumerable function must run its cleanup code regardless of whether the enumerator it returns is disposed of correctly.

    Full stop. You can't have any type run managed cleanup code regardless of whether it is disposed correctly. This is simply not possible in .NET.

    You can write a finalizer, but finalizers cannot safely run arbitrary code. Usually they are restricted to accessing value-type members and making some P/Invoke-style calls. They cannot reliably, say, return a buffer to an array pool. More generally, with few exceptions, they should not call into other managed objects at all, because those objects may already have been finalized. They're really only intended to clean up unmanaged resources.
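To make the distinction concrete, here is a minimal sketch of the kind of cleanup a finalizer is suited to: releasing a block of unmanaged memory held in a plain IntPtr field. The NativeBuffer class and its IsAllocated property are hypothetical names invented for this illustration; they are not part of the question's code or of any library.

```csharp
using System;
using System.Runtime.InteropServices;

// Normal path: 'using' disposes deterministically, so the finalizer never has to run.
using (var native = new NativeBuffer(512))
{
    Console.WriteLine(native.IsAllocated);
}

// Hypothetical wrapper (assumption for illustration): owns unmanaged memory only.
sealed class NativeBuffer : IDisposable
{
    // A value-type field; safe to read from the finalizer.
    private IntPtr _pointer;

    public NativeBuffer(int size) => _pointer = Marshal.AllocHGlobal(size);

    public bool IsAllocated => _pointer != IntPtr.Zero;

    public void Dispose()
    {
        Release();
        // Cleanup already happened, so the finalizer is no longer needed.
        GC.SuppressFinalize(this);
    }

    // Fallback only: touches nothing except the IntPtr field and the native free call.
    ~NativeBuffer() => Release();

    private void Release()
    {
        if (_pointer == IntPtr.Zero) return;
        Marshal.FreeHGlobal(_pointer);
        _pointer = IntPtr.Zero;
    }
}
```

Even here the finalizer is only a safety net. The runtime does not promise when, or on some shutdown paths whether, it runs, so Dispose remains the path callers are expected to take.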

    So this doesn't have anything to do with asynchronous enumerators. You can't guarantee cleanup code will be run if the object isn't disposed, and this is the case for any kind of object.

    The best solution is to run the cleanup code when the enumerator is disposed (e.g., in a finally block in an async enumerator function). Any code that fails to dispose of the enumerator is responsible for the resulting resource leak. This is the way all other .NET code works, and this is the way async enumerators work, too.
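As a sketch of what "dispose it" can look like when the enumerator is driven by hand, as in the question's ForgetToDispose test, the consumer can wrap the enumerator in await using. The version of GetDataPackets below is simplified from the question, and the log list and sink parameter are assumptions added only so the order of events can be inspected.

```csharp
using System;
using System.Buffers;
using System.Collections.Generic;
using System.Threading.Tasks;

var log = new List<string>();

// 'await using' calls DisposeAsync when the block exits, even on an early break.
// Disposing a suspended async iterator runs its pending finally blocks.
await using (var enumerator = GetDataPackets("Manual", log).GetAsyncEnumerator())
{
    while (await enumerator.MoveNextAsync())
    {
        if (enumerator.Current.Index == 2)
            break; // stop before the sequence is finished
    }
}

foreach (var line in log)
    Console.WriteLine(line);
// Prints:
// Starting 'Manual'.
// Finalizing 'Manual'.

// Simplified from the question; 'sink' is a hypothetical parameter for recording events.
static async IAsyncEnumerable<(int Index, Memory<byte> Bytes)> GetDataPackets(string jobName, List<string> sink)
{
    sink.Add($"Starting '{jobName}'.");
    var buffer = ArrayPool<byte>.Shared.Rent(512);
    try
    {
        for (var i = 0; i < 10; i++)
        {
            await Task.Delay(1);
            yield return (i, new Memory<byte>(buffer));
        }
    }
    finally
    {
        // Runs on normal completion and when the consumer disposes the enumerator early.
        sink.Add($"Finalizing '{jobName}'.");
        ArrayPool<byte>.Shared.Return(buffer);
    }
}
```

This is the same thing await foreach does behind the scenes, which is why the first three tests in the question print their "Finalizing" lines and the fourth does not.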