Search code examples
c#visitor-pattern

C# - Can't use double-dispatch Visitor for array of derived generic types


I wanted to know why I can't get running any Visitor of generic objects in C#.

For some reason the program always picks the least concrete generic overload of my generic.

P.S.: I'm not after suggestions of alternative approaches (like not using a visitor).

I suspect the problem is I'm using generics instead of non-generic visitable types for my visitor, but don't get why (as overloads of generics are supposed to work, specially given that the particular instance of concrete type TimedValue, where T is int or something else for example, is the one that dispatches the call to the visitor).

I have a performance-sensitive problem where I have to handle several possible types of objects that derive from a common hierarchy (e.g. ITimedValue) in many places according to the incoming object type (T in TimedValue<T>), but I don't want to use switches or if-elses all over the code (design choice) and I also don't want to use dynamic (for performance concerns).

I start by tagging objects under that hierarchy by adding a timestamp, and then lots of processing follows, and these objects are passed around to many different places, including reactive streams etc.

I thought a Visitor pattern with double-dispatch should handle this. However, generics are playing a trick on me and whatever I do I can't get them to call my specific overloads.

Stripped down code to exemplify the issue

Here is the problematic code (very stripped down to a minimum reproducible example).

There is an interface for the base type ITimedValue, and generic derived types TimedValue<T>.

Then, an interface ITimedValueVisitor for visitors of these types TimedValue<T>, as well as a very stripped down visitor implementation DemoValueVisitor (that right now doesn't even accumulate the state) just to show the problem is all in calling the proper visitor Visit and VisitSpecific overloads.

I've made VisitSpecific use a different name from Visit to make it more obvious we should be calling the concrete type.

I've also tried adding templated and non-templated overloads to that method to handle the specific derived types like TimedValue<bool>, etc, and both have the same behaviour -- only VisitSpecific<T>(TimedValue<T> typedVariant) gets called.

The only thing that worked so far is to get rid of double dispatch and apply casts through a switch in Visit, so it calls the specific VisitSpecific right there after the casts, but this shouldn't be needed at all (and at which point this is no longer really using double-dispatch).

    public interface ITimedValue  // Base type (we'll have arrays and streams of these)
    {
        public void Accept(ITimedValueVisitor visitor);
    };

    /**
     * I really wanted to use the generic (instead of having a specific type, per each `T` I want to support).
     * I'd rather avoid non-generic types -- e.g. TimedValueBool instead of TimedValue<bool> -- though I suspect this is the source of all trouble.
     */
    public class TimedValue<T> : ITimedValue  // Concrete types passed on streams
    {
        // Some other special data common to all subtypes...

        public long Time { get; private set; }
        public T Value { get; private set; }

        public TimedValue(T value)
        {
            Value = value;
        }

        public void Accept(ITimedValueVisitor visitor)
        {
            visitor.VisitSpecific<T>(this);  // Also removed this <T>, it's the same.
        }

        public override string ToString()
        {
            return $"TimedValue<{typeof(T)}>({Time}, {Value})";
        }
    }


    public interface ITimedValueVisitor  // A visitor interface
    {
        void Visit(ITimedValue typedVariant);

        void VisitSpecific<T>(TimedValue<T> typedVariant);
        void VisitSpecific<T>(TimedValue<bool> typedVariant);
        void VisitSpecific<T>(TimedValue<string> typedVariant);

        // and many others... (these above are just to make the example simpler)

    }

    public class DemoTimedVisitor : ITimedValueVisitor  // Some dummy visitor to show the problem
    {
        private ITestOutputHelper logger;

        public DemoTimedVisitor(ITestOutputHelper logger)
        {
            this.logger = logger;
        }

        // Do I need this at all? It's so I don't get errors like "can't cast `ITimedValue` to `TimedValue<T>`".
        public void Visit(ITimedValue timedValue)
        {
            timedValue.Accept(this);
        }

        /* I'm forced to have this by the type system. However, the system only calls this
         * independently of the specific type, and if we have more specific overloads. 
         */
        public void VisitSpecific<T>(TimedValue<T> typedVariant)
        {
            logger.WriteLine("Called Visitor GENERIC overload!!!!" + typedVariant.ToString());
        }

        // I also tried without success having VisitSpecific(TimedValue<bool> typedVariant) instead. Never gets called :( 
        public void VisitSpecific<T>(TimedValue<bool> typedVariant)
        {
            logger.WriteLine("Called Visitor BOOL overload! " + typedVariant.ToString());
        }

        // Ditto.
        public void VisitSpecific<T>(TimedValue<string> typedVariant)
        {
            logger.WriteLine("Called Visitor STRING overload!" + typedVariant.ToString());
        }
    }

Some test code (with XUnit) exemplifying the issue:

using Xunit.Abstractions;

public class TestDemo
{


    readonly ITestOutputHelper logger;

    public TestDemo(ITestOutputHelper logger)
    {
        this.logger = logger;
    }

    [Fact]
    public void DemoOfTargetAPIUsage()  // Just to show roughly the point.
    {
        DemoTimedVisitor visitor = new DemoTimedVisitor(logger);  // Ideally it would be collecting data in it (but in this question it does nothing for simplicity).


        // Ideally we'd be using:
        List<ITimedValue> timedValues = new List() { new TimedValue<...>(...), ... }

        // Followed by:
        foreach (var timedValue in timedValues)
        {
            visitor.Visit(timedValue);
        }


        visitor.GetState()... // What we'd want
    }

    [Fact]
    public void EvenBasicStuffFails()  // most basic example. 
    {
        DemoTimedVisitor visitor = new DemoTimedVisitor(logger);  // Ideally it would be collecting data in it (but in this question it does nothing for simplicity).

        ITimedValue vbool = new TimedValue<bool>(true);
        ITimedValue vstring = new TimedValue<string>("foo");
        ITimedValue vfloat = new TimedValue<float>(9999.999F);

        vbool.Accept(visitor);    // uses unspecific overload :(
        vstring.Accept(visitor);  // uses unspecific overload :(
        vfloat.Accept(visitor);   // uses unspecific overload as it should (no specific visitor overloads defined) :+1:

        ...
    }

}
        

I'm not very used to C#'s generics and obviously I must be doing something really stupid, I just can't figure out why Accept always calls VisitSpecific<T>(TimedValue<T>). Is it the only type available at compile time? But inside Accept, the implementation should vary by type T -- or at least that's what I wanted :(

Do I need to get rid of generics, like TimedValue<Custom> -> TimedValueCustom or am I missing something on this double-dispatch? (I'd reeeally rather not)

Thanks!


Solution

  • visitor.VisitSpecific<T>(this);
    

    On this line, the compiler only knows that this is a TimedValue<T>, so void VisitSpecific<T>(TimedValue<T> typedVariant) is the best match that will be called in all cases. The C# compiler will hard code this method handle in the generated IL. Though the runtime could JIT a specialisation of the method to de-virtual the call site if the method is "hot" enough.

    You will need to write your own code somewhere, where the type of the argument is known to the compiler, so that the correct method can be resolved at compile time. Perhaps a switch expression; case TimedValue<bool> boolValue:.

    Or you could skip the generics and define each TimedValueBool : ITimedValue.

    Or you could use dynamic to choose the method at runtime, paying the performance cost of doing so.