Tags: module, operator-overloading, d

How to separate a structure's data from its operators?


I want to build an algebraic system, so I need a carrier, which is basically some data type, and a set of operators over that type. Algebras naturally differ in signature: the same type might have different sets of operators that share the same notation.

Say I have a vector type. Normally I would use the Euclidean metric and norm for it, so I import vector, euclidean, where vector contains the data declaration for the vector type and all the overloaded operators for that same vector live in euclidean. Then, when I want to work in a Riemannian space, I simply import vector, riemannian and get a completely different algebra with the same interface.

I know this can be achieved in the object-oriented paradigm via inheritance, but is it possible to do it with plain modules? All I need is to declare the data in one module and the operators in another, all for the same structure.


Solution

  • Two possibilities come to mind. One is UFCS (Uniform Function Call Syntax): define named functions in other modules that take the type as their first parameter, and they become callable with dot syntax. This won't work for operator overloads, which must be members. (Forgive me if I mess up the math here.)

    // file: myvector.d
    module myvector;
    struct vector {
        float x;
        float y;
    }
    
    // file: myvectormath.d
    module myvectormath;
    import myvector;
    vector add(vector lhs, vector rhs) {
        // inside, it is just a regular function
        vector result;
        result.x = lhs.x + rhs.x;
        result.y = lhs.y + rhs.y;
        return result;
    }
    

    usage:

    import myvector;
    import myvectormath;
    
    // but it can be called with dot notation
    vector a = vector(0,0).add(vector(5, 5));
    

    Another possible way is to put the data in a struct plus a mixin template, then do the math in another struct that mixes the data in and adds the needed functions:

    // data definition
    module myvector;
    
    // the data will be an external named type, so we can pass it on more easily - will help interop
    struct VectorData {
       float x;
       float y;
    }
    
    // and this provides the stuff to get our other types started
    mixin template vector_payload() {
        // constructors for easy initialization
        this(float x, float y) {
            _data.x = x;
            _data.y = y;
        }
        this(VectorData d) {
            _data = d;
        }
    
        // storing our data
        VectorData _data;
    
        // alias this is a feature that provides a bit of controlled implicit casting
        alias _data this;
    }
    
    // math module #1
    module myvectormath;
    import myvector;
    
    struct vector {
        // mixin all the stuff from above, so we get those ctors, the data, etc.
        mixin vector_payload!();
    
        // and add our methods, including full operator overloading
        vector opBinary(string op:"+")(vector rhs) {
            vector result;
            result.x = this.x + rhs.x;
            result.y = this.y + rhs.y;
            return result;
        }
    }
    
    // math module #2
    module myvectormath2;
    import myvector;
    
    struct vector {
        // again, mix it in
        mixin vector_payload!();
    
        // and add our methods
        vector opBinary(string op:"+")(vector rhs) {
            vector result;
            // this one has horribly broken math lol
            result.x = this.x - rhs.x;
            result.y = this.y - rhs.y;
            return result;
        }
    }
    
    // usage
    import myvectormath;
    // OR
    //import myvectormath2;
    void main() {
        vector a = vector(0, 0) + vector(5, 5);
        import std.stdio;
        writeln(a);
    }
    

    In the usage module, if you just swap the imports, the rest of the code remains unmodified. But what happens if you want to use both modules at once and intermix them? That's where the external VectorData struct, the constructor taking it, and the alias this magic come in. First, we'll import both and see what happens:

    test32.d(23): Error: myvectormath.vector at test324.d(4) conflicts with myvectormath2.vector at test322.d(4)

    So, first, we want to disambiguate the name. There are several ways to do this; you can learn more in the import section of the D docs: http://dlang.org/module.html#Import
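    As a rough illustration of two of those ways, here is a sketch of a renamed import and a selective import with a rename. It uses Phobos modules (std.math and std.algorithm.comparison) instead of the vector modules so that it compiles as a single file; the names math, largest, hypotenuse and pick are made up for this example. With the modules from this answer, the equivalent would be something along the lines of import m1 = myvectormath; followed by m1.vector.

```d
// Renamed import: the module's symbols are reachable only through the prefix `math`.
import math = std.math;
// Selective import with a rename: only `max` is imported, under the local name `largest`.
import std.algorithm.comparison : largest = max;

// Uses the prefixed name, so it cannot clash with any other `sqrt` in scope.
double hypotenuse(double a, double b) {
    return math.sqrt(a * a + b * b);
}

// Uses the locally renamed symbol.
int pick(int a, int b) {
    return largest(a, b);
}

unittest {
    assert(hypotenuse(3.0, 4.0) == 5.0);
    assert(pick(2, 7) == 7);
}
```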

    For now, I'm going to just use the fully qualified name.

    // usage
    import myvectormath;
    import myvectormath2;
    void main() {
        // specify the kind we want to use here...
        myvectormath.vector a = myvectormath.vector(0, 0) + myvectormath.vector(5, 5);
        import std.stdio;
        writeln(a); // the result has x = 5, y = 5, so it used the addition version correctly
    }
    

    How can we easily convert between them? Let's make a function that uses version #2:

    void somethingWithMath2(myvectormath2.vector vec) {
        // whatever
    }
    

    It will complain if you pass the variable "a" to it, because a is a myvectormath.vector and the function expects a myvectormath2.vector.

    test32.d(27): Error: function test32.somethingWithMath2 (vector a) is not callable using argument types (vector)

    But, we can pretty easily convert them thanks to the external data struct, the ctor, and alias this in the mixin template:

        somethingWithMath2(myvectormath2.vector(a));
    

    Compiles! The way that works under the hood is that myvectormath2.vector has two constructors: (float, float) and (VectorData). Neither of them matches the type of a, so next the compiler tries a's alias this, which is VectorData. So a implicitly converts and then matches the VectorData ctor.
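    The conversion path described above can be tried out in a single file. This is only a sketch: the names Payload, VecA, VecB, sumB and sumData are hypothetical stand-ins for the two math modules' types (D allows only one module per file, so both types live side by side here).

```d
// Shared data type, as in the answer.
struct VectorData {
    float x;
    float y;
}

// Hypothetical payload mixed into every math type.
mixin template Payload() {
    VectorData _data;
    this(float x, float y) { _data = VectorData(x, y); }
    this(VectorData d) { _data = d; }
    alias _data this;
}

// Two distinct math types sharing the same data.
struct VecA { mixin Payload!(); }
struct VecB { mixin Payload!(); }

float sumB(VecB v) { return v.x + v.y; }
float sumData(VectorData d) { return d.x + d.y; }

// There is no implicit conversion between the two math types themselves.
static assert(!__traits(compiles, sumB(VecA(1, 2))));

unittest {
    auto a = VecA(1, 2);
    // a -> VectorData through alias this, which then matches this(VectorData).
    auto b = VecB(a);
    assert(sumB(b) == 3);
    // A function taking the plain data accepts either math type directly.
    assert(sumData(a) == 3);
    assert(sumData(b) == 3);
}
```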

    You could also just pass the data around:

    import myvector;
    import myvectormath2;
    void somethingWithMath2(VectorData a_in) {
        // to do math on it, we construct the kind of vectormath we're interested in:
        auto a = myvectormath2.vector(a_in);
        // and use it
    }
    

    And then call it this way:

    // will implicitly convert any of the sub vectormath types to the base data so this just works
    somethingWithMath2(a);
    

    Passing around the data is probably the nicest option, since then the caller doesn't need to know what kind of math you'll be doing with it.

    The constructor it uses here is trivial, by the way, and shouldn't incur significant runtime cost. It may cost nothing at all if the compiler inlines it (for example with dmd's -inline switch): this is basically just a reinterpret_cast, since the data representation is identical.
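    The claim about the representation can be checked at compile time. A minimal sketch, where Wrapped is a hypothetical stand-in for one of the math types (a single VectorData member plus alias this, without the constructors):

```d
struct VectorData {
    float x;
    float y;
}

// Stand-in for one of the math types: only a VectorData member and alias this.
struct Wrapped {
    VectorData _data;
    alias _data this;
}

// The wrapper adds no fields of its own, so size and alignment match the plain data.
static assert(Wrapped.sizeof == VectorData.sizeof);
static assert(Wrapped.alignof == VectorData.alignof);
```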

    Note that it will not let you add myvectormath2.vector + myvectormath.vector; that is a type mismatch. But if you do want to allow that, all you have to do is change the overloaded operator to accept VectorData instead of one of the math types! Then the operand implicitly converts and you have the same data to work on. Think of VectorData as being a base class in OOP terms.
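    Here is a sketch of that last variation, with the operator taking VectorData so the two kinds of vector can be mixed. The names Payload, AddVec and SubVec are hypothetical; AddVec mirrors math module #1 and SubVec mirrors the deliberately broken module #2, both collapsed into one file so the example is self-contained.

```d
struct VectorData {
    float x;
    float y;
}

// Hypothetical payload, equivalent to the mixin template in the answer.
mixin template Payload() {
    VectorData _data;
    this(float x, float y) { _data = VectorData(x, y); }
    this(VectorData d) { _data = d; }
    alias _data this;
}

// "+" adds, and accepts anything that converts to VectorData.
struct AddVec {
    mixin Payload!();
    AddVec opBinary(string op : "+")(VectorData rhs) {
        return AddVec(_data.x + rhs.x, _data.y + rhs.y);
    }
}

// "+" subtracts here (the intentionally broken math), same parameter type.
struct SubVec {
    mixin Payload!();
    SubVec opBinary(string op : "+")(VectorData rhs) {
        return SubVec(_data.x - rhs.x, _data.y - rhs.y);
    }
}

unittest {
    // The left operand decides which algebra is used; the right one is
    // converted to VectorData through alias this.
    auto r = AddVec(1, 2) + SubVec(3, 4);
    assert(r.x == 4 && r.y == 6);

    auto s = SubVec(5, 5) + AddVec(1, 2);
    assert(s.x == 4 && s.y == 3);
}
```

    Note that the result type follows the left-hand operand, so AddVec + SubVec yields an AddVec.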

    I think that covers the bases, let me know if you have any further questions.