
What is the purpose of OutputRange and put() in D?


I need some clarification on OutputRange and its purpose. It represents streamed element output, similar to sending to stdout, and requires support for a put() method, which:

determines the capabilities of the range and the element at compile time and uses the most appropriate method to output the element.

Output the element, but to where, and for what purpose?

import std.stdio;
import std.range;

void main() {
    int[] arr = [1, 2, 3, 4, 5];
    auto s = arr;
    writeln(s); // [1, 2, 3, 4, 5]
    s.put(100); // nothing is printed to stdout, should it?
    writeln(s); // [2, 3, 4, 5] 
} 

In the code above we call put() on a slice, hence we lose 1, but where did "100" go? The MultiFile example feels a little bit contrived. A more practical use case for OutputRange would be better.

A minor thing: why is put called put? In other languages put is used for insertion or addition operations on some collection. I find it confusing.

UPDATE: It looks like we need to keep a copy of the original slice in order to prevent the element dropping.

int[] arr = [1, 2, 3, 4, 5];
auto s = arr;
s.put(100);
writeln(arr); // [100, 2, 3, 4, 5]

I find the above very confusing, perhaps I am missing the concept behind OutputRange :(


Solution

  • First, read over the docs of put: http://dpldocs.info/experimental-docs/std.range.primitives.put.html

    Notably:

    Tip: put should not be used "UFCS-style", e.g. r.put(e). Doing this may call R.put directly, by-passing any transformation feature provided by Range.put. put(r, e) is preferred.

    So don't call s.put(x); instead call put(s, x).
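
    To make the tip concrete, here is a small sketch of the difference. IntSink is a hypothetical type invented for this illustration; its member put accepts a single int only. The free function put can still take a whole array and forward the elements one by one, while the member call cannot:

    ```d
    import std.range.primitives : put;

    // Hypothetical sink, for illustration only: its member put takes one int.
    struct IntSink {
        int[] received;
        void put(int x) { received ~= x; }
    }

    void main() {
        IntSink sink;

        // The free function sees that int[] does not fit IntSink.put(int),
        // so it walks the array and forwards each element separately.
        put(sink, [1, 2, 3]);
        assert(sink.received == [1, 2, 3]);

        // The member call goes straight to IntSink.put(int), which has no
        // overload for int[], so this form does not compile.
        static assert(!__traits(compiles, sink.put([4, 5, 6])));
    }
    ```

    With a plain int[] there is no member put at all, so s.put(x) happens to resolve to the same free function, but writing put(s, x) avoids having to think about which case you are in.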

    It also talks about what happened in your update:

    put treats dynamic arrays as array slices, and will call popFront on the slice after an element has been copied.

    Be sure to save the position of the array before calling put.

    You'll also notice the docs use the word "copying" a lot. So, where does put put the item, and why?

    Where depends on what the output range is. put is a generic interface that can be implemented to do all kinds of things depending on the target object. Some might stream it to stdout, some might put it in a data buffer, some might do something entirely different.

    In the case of an array slice like you were using, the library interprets that as a fixed size buffer and its put copies data into it.

    The implementation looks something like

    copy element to buffer
    advance buffer
    update buffer's remaining space
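
    As a rough sketch in actual D, the steps above could look like the following. putIntoSlice is a hypothetical stand-in written for this answer, not the Phobos source; for a slice, advancing and updating the remaining space are the same operation:

    ```d
    // Simplified stand-in for what put does with a slice (an assumption
    // for illustration, not the library implementation).
    void putIntoSlice(T)(ref T[] buffer, T element) {
        assert(buffer.length > 0, "no space left in the buffer");
        buffer[0] = element;      // copy element to buffer
        buffer = buffer[1 .. $];  // advance; the remaining space shrinks with it
    }

    void main() {
        int[] storage = [1, 2, 3, 4, 5];
        int[] window = storage;   // second reference to the same memory

        putIntoSlice(window, 100);

        assert(storage == [100, 2, 3, 4, 5]); // the element landed in the shared memory
        assert(window == [2, 3, 4, 5]);       // the window now starts after it
    }
    ```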
    

    That's why you need to keep a separate reference to the slice at the beginning, otherwise it copies and advances so it looks like the thing just disappeared. Why would it do this though?

    int[32] originalBuffer;
    int[] buffer = originalBuffer[];
    put(buffer, 5);
    put(buffer, 6); // since the last one advanced the remaining space, this next call just works
    

    How much of the buffer is actually used at the end? This is another use of the advancing thing: you can subtract to figure it out:

    int[] usedBuffer = originalBuffer[0 .. $ - buffer.length];
    

    We just take everything from the original except what's left over as remaining space in the output range.
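
    Putting the two snippets together, a self-contained version might look like this. keepEvens is a hypothetical helper made up for the example; it writes into caller-provided storage and hands back only the part it filled:

    ```d
    import std.range.primitives : put;

    // Hypothetical helper: writes the even values from input into storage
    // and returns the part of storage that was actually filled.
    // Values that no longer fit are silently dropped in this sketch.
    int[] keepEvens(int[] storage, const int[] input) {
        int[] remaining = storage;          // this slice is the output range
        foreach (x; input) {
            if (x % 2 == 0 && remaining.length > 0)
                put(remaining, x);          // copies x, then shrinks remaining
        }
        // Everything from the original except what is left over.
        return storage[0 .. $ - remaining.length];
    }

    void main() {
        int[32] originalBuffer;
        int[] used = keepEvens(originalBuffer[], [1, 2, 3, 4, 6]);
        assert(used == [2, 4, 6]);
    }
    ```

    No allocation happens here: the caller owns the memory, and the output range only tracks how much of it is still free.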

    Other ranges may keep an internal count. The documentation example at http://dpldocs.info/experimental-docs/std.range.primitives.put.html#examples shows an example with a dynamic internal buffer.

    static struct A {
        string data;
        void put(C)(C c) if (isSomeChar!C) {
            data ~= c;
        }
    }
    

    Its put method copies characters into an internal string, so it will grow as needed and then data.length tells you how big it is. (the stdlib's appender works like this btw)
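
    For comparison, here is a short sketch that uses the struct from the docs next to std.array.appender. The main function and its assertions are added for illustration:

    ```d
    import std.array : appender;
    import std.range.primitives : put;
    import std.traits : isSomeChar;

    // Same shape as the documentation example: a growing internal buffer.
    struct A {
        string data;
        void put(C)(C c) if (isSomeChar!C) {
            data ~= c;
        }
    }

    void main() {
        A a;
        put(a, 'H');
        put(a, "ello");               // free put feeds the string in one character at a time
        assert(a.data == "Hello");
        assert(a.data.length == 5);   // the range itself knows how much it holds

        auto app = appender!string();
        put(app, "Hello");
        assert(app.data == "Hello");
    }
    ```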

    The output range interface is very minimal - all it really requires is one put function and then it doesn't specify what you do with it. Imagine if it were writing to stdout, then the length doesn't matter and there's no need to return the data to the user at all. That's also why it doesn't use the ~= append operator - it is not necessarily appending anything.


    So to recap: where does it go and why? Depends on the object! OutputRange/put are deliberately generic interfaces that are meant to just collect data and do... something with it. It is meant to be a final destination and thus does not support chaining like other ranges.
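
    As a more practical illustration of "depends on the object", here is a sketch of one generic function writing to three different destinations. writeSquares and Summer are hypothetical names invented for this example:

    ```d
    import std.array : appender;
    import std.range.primitives : isOutputRange, put;

    // Hypothetical helper: writes 1*1 .. n*n into any sink that accepts int.
    void writeSquares(Sink)(ref Sink sink, int n)
    if (isOutputRange!(Sink, int))
    {
        foreach (i; 1 .. n + 1)
            put(sink, i * i);
    }

    // Hypothetical sink that keeps a running total and stores nothing.
    struct Summer {
        long total;
        void put(int x) { total += x; }
    }

    void main() {
        // Growing buffer: the appender stores every element.
        auto app = appender!(int[])();
        writeSquares(app, 4);
        assert(app.data == [1, 4, 9, 16]);

        // Fixed buffer: the slice is filled in place and shrinks to nothing.
        int[4] storage;
        int[] window = storage[];
        writeSquares(window, 4);
        assert(storage[] == [1, 4, 9, 16]);
        assert(window.length == 0);

        // No buffer at all: the data is consumed on arrival.
        Summer s;
        writeSquares(s, 4);
        assert(s.total == 30);
    }
    ```

    The producer never needs to know whether the data is stored, counted, or streamed somewhere else, which is the point of keeping the interface this small.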

    With the built-in slice, it copies data to it and advances the position to keep it ready to accept more data. This requires a bit more work on your end to keep track of it, but gives a lot of flexibility and efficiency for generic use. You will probably be better served by other functions specialized for your specific need, though. If you want to append, try http://dpldocs.info/appender for instance.