What is a bucket brigade?


I would really love to implement a php_user_filter::filter(). To do that, I need to know what a bucket brigade is. It seems to be a resource that I can operate on with the stream_bucket_* functions, but the documentation is not really helpful. The best I could find are the examples in stream_filter_register().

I'm especially curious what stream_bucket_new() and stream_bucket_make_writeable() can do.


Update: It seems that PHP is exposing an internal data structure of Apache.


Solution

  • Ah, welcome to the least documented parts of the PHP manual! [I opened a bug report about it; maybe this answer will be helpful for documenting it: https://bugs.php.net/bug.php?id=69966]

    The bucket brigade

    To start with your initial question, "bucket brigade" is just the name of the resource type userfilter.bucket brigade.

    You are passed two different brigades as the first and second parameters of php_user_filter::filter(). The first brigade holds the input buckets you read from; the second brigade is initially empty, and you write to it.

    Regarding your update about the data structure… It's basically just a doubly linked list of strings. But it may well be that the name was borrowed from Apache ;-)

    stream_bucket_prepend() / stream_bucket_append()

    stream_bucket_prepend(resource $brigade, stdClass $bucket): void
    stream_bucket_append(resource $brigade, stdClass $bucket): void
    

    The expected $brigade is the output brigade, i.e. the second parameter of php_user_filter::filter().

    The $bucket is a stdClass object as returned by stream_bucket_make_writeable() or stream_bucket_new().

    These two functions just prepend or append the passed bucket to the brigade.
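
    To see the difference between the two, here is a minimal sketch. The class name demo_wrap_filter and the filter name demo.wrap are made up for illustration. It passes the incoming data through, then prepends a "<" bucket and appends a ">" bucket, both created with stream_bucket_new() (explained below):

    ```php
    <?php
    // Hypothetical demo filter: wraps every non-empty chunk in "<" and ">".
    class demo_wrap_filter extends php_user_filter
    {
        public function filter($in, $out, &$consumed, $closing): int
        {
            $seen = false;
            while ($bucket = stream_bucket_make_writeable($in)) {
                $consumed += $bucket->datalen;
                stream_bucket_append($out, $bucket);
                $seen = true;
            }
            if (!$seen) {
                // Flush calls arrive with an empty input brigade; emit nothing.
                return PSFS_FEED_ME;
            }
            // prepend puts the bucket at the front of $out, append at the back.
            stream_bucket_prepend($out, stream_bucket_new($this->stream, "<"));
            stream_bucket_append($out, stream_bucket_new($this->stream, ">"));
            return PSFS_PASS_ON;
        }
    }

    stream_filter_register("demo.wrap", "demo_wrap_filter");

    $fp = fopen("php://memory", "w+");
    stream_filter_append($fp, "demo.wrap", STREAM_FILTER_WRITE);
    fwrite($fp, "hello");
    rewind($fp);
    $wrapped = stream_get_contents($fp);
    fclose($fp);

    echo $wrapped, "\n"; // <hello>
    ```

    The filter is attached with STREAM_FILTER_WRITE only, so reading the data back does not run it a second time.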

    stream_bucket_new()

    To demystify this function, first look at its function signature:

    stream_bucket_new(resource $stream, string $buffer): stdClass
    

    The first argument is the $stream you're writing this bucket to. The second is the $buffer this new bucket will contain.

    [I'd like to note here that the $stream parameter actually is not very significant; it's just used to check whether we need to allocate memory persistently so that it survives through requests. I just suppose that you can make PHP nicely segfault by passing a persistent stream in here, when operating on a non-persistent filter...]

    This creates a userfilter.bucket resource, which is assigned to the bucket property of a stdClass object. That object also has two other properties, data and datalen, which contain the buffer and the buffer size of this bucket.

    The function returns that stdClass object, which you can pass to stream_bucket_prepend() and stream_bucket_append().
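
    The three properties can be inspected directly. A small sketch, assuming that any open stream is acceptable as the first argument (php://memory is used here only for convenience):

    ```php
    <?php
    // Any open stream will do for this demonstration.
    $stream = fopen("php://memory", "r+");
    $bucket = stream_bucket_new($stream, "abc");

    $data    = $bucket->data;                      // the buffer: "abc"
    $datalen = $bucket->datalen;                   // its size: 3
    $type    = get_resource_type($bucket->bucket); // "userfilter.bucket"

    var_dump($data, $datalen, $type);
    fclose($stream);
    ```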

    stream_bucket_make_writeable()

    stream_bucket_make_writeable(resource $brigade): stdClass|null
    

    It shifts the first bucket off the $brigade and returns it. If the $brigade is empty, it returns null.
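
    The null return value is what ends the usual draining loop. The sketch below makes that check explicit; demo_drain_filter, its static property and the filter name demo.drain are hypothetical names chosen for this example:

    ```php
    <?php
    // Hypothetical demo filter: upper-cases the data and records what
    // stream_bucket_make_writeable() returns once the brigade is drained.
    class demo_drain_filter extends php_user_filter
    {
        public static $nullAfterDrain = false;

        public function filter($in, $out, &$consumed, $closing): int
        {
            // Each call shifts one bucket off $in until none is left.
            while (($bucket = stream_bucket_make_writeable($in)) !== null) {
                $consumed += $bucket->datalen;
                $bucket->data = strtoupper($bucket->data);
                stream_bucket_append($out, $bucket);
            }
            // The brigade is empty now, so one more call yields null.
            self::$nullAfterDrain = (stream_bucket_make_writeable($in) === null);
            return PSFS_PASS_ON;
        }
    }

    stream_filter_register("demo.drain", "demo_drain_filter");

    $fp = fopen("php://memory", "w+");
    stream_filter_append($fp, "demo.drain", STREAM_FILTER_WRITE);
    fwrite($fp, "abc");
    rewind($fp);
    $upper = stream_get_contents($fp);
    fclose($fp);

    var_dump($upper);                             // "ABC"
    var_dump(demo_drain_filter::$nullAfterDrain); // true
    ```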

    Further notes

    When php_user_filter::filter() is called, the $stream property of the filter object is set to the stream we're currently working on. That's also the stream you need to pass to stream_bucket_new() when calling it. (The $stream property is unset again after the call, so you can't reuse it in e.g. php_user_filter::onClose().)

    Also note that although the bucket has a datalen property, you do not need to update it when you change the data property before passing the bucket to stream_bucket_prepend() or stream_bucket_append(); the length is taken from data itself.

    The implementation expects you to take all buckets off the $in brigade before returning; if any are left over, it emits a warning.

    There is another case of the documentation lying to us: in php_user_filter::onCreate(), the $stream property is not set. It is only set during the filter() call.
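
    A quick way to convince yourself of the datalen point is a filter that changes the length of the data and never touches datalen. This is only a sketch; demo_double_filter and demo.double are names invented for it:

    ```php
    <?php
    // Hypothetical demo filter: doubles every chunk without updating datalen.
    class demo_double_filter extends php_user_filter
    {
        public function filter($in, $out, &$consumed, $closing): int
        {
            while ($bucket = stream_bucket_make_writeable($in)) {
                // Count the bytes taken from the input, before modifying them.
                $consumed += $bucket->datalen;
                $bucket->data = str_repeat($bucket->data, 2);
                // $bucket->datalen is deliberately left at its old value.
                stream_bucket_append($out, $bucket);
            }
            return PSFS_PASS_ON;
        }
    }

    stream_filter_register("demo.double", "demo_double_filter");

    $fp = fopen("php://memory", "w+");
    stream_filter_append($fp, "demo.double", STREAM_FILTER_WRITE);
    fwrite($fp, "ab");
    rewind($fp);
    $doubled = stream_get_contents($fp);
    fclose($fp);

    var_dump($doubled); // "abab"
    ```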

    Generally, don't use filters with non-blocking streams. I tried that once and it went horribly wrong … And it's not likely that's ever going to be fixed...

    Sum up (examples)

    Let's start with the simplest case: writing back what we got in.

    class simple_filter extends php_user_filter {
        public function filter($in, $out, &$consumed, $closing): int {
            // Take each bucket off the input brigade and hand it on unchanged.
            while ($bucket = stream_bucket_make_writeable($in)) {
                $consumed += $bucket->datalen;
                stream_bucket_append($out, $bucket);
            }
            return PSFS_PASS_ON;
        }
    }

    stream_filter_register("simple", "simple_filter");
    

    All that happens here is getting buckets from the $in bucket brigade and putting them into the $out bucket brigade.

    Okay, now let's try to manipulate our input.

    class reverse_filter extends php_user_filter {
        public function filter($in, $out, &$consumed, $closing): int {
            while ($bucket = stream_bucket_make_writeable($in)) {
                $consumed += $bucket->datalen;
                // Reverse the contents of this bucket, then put it at the front.
                $bucket->data = strrev($bucket->data);
                stream_bucket_prepend($out, $bucket);
            }
            return PSFS_PASS_ON;
        }
    }

    stream_filter_register("reverse", "reverse_filter");
    

    Now we have registered the reverse filter, which reverses your string (each write is reversed on its own here; write order is still preserved). So we need to manipulate the bucket data and prepend the bucket to the output brigade.

    Now, what's the use case for stream_bucket_new()? Usually you can just append to $bucket->data; you can even concatenate all the data into the first bucket. But when the stream is flushed or closed, the input brigade may be empty, and if you still want to send a final bucket, you have to create one yourself.
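
    To illustrate "each write is reversed on its own", here is a self-contained sketch that attaches such a filter as a write filter and writes twice. The names demo_reverse_filter and demo.reverse are hypothetical; they only avoid clashing with the class above:

    ```php
    <?php
    // Hypothetical copy of the reversing filter under a different name.
    class demo_reverse_filter extends php_user_filter
    {
        public function filter($in, $out, &$consumed, $closing): int
        {
            while ($bucket = stream_bucket_make_writeable($in)) {
                $consumed += $bucket->datalen;
                $bucket->data = strrev($bucket->data);
                stream_bucket_prepend($out, $bucket);
            }
            return PSFS_PASS_ON;
        }
    }

    stream_filter_register("demo.reverse", "demo_reverse_filter");

    $fp = fopen("php://memory", "w+");
    stream_filter_append($fp, "demo.reverse", STREAM_FILTER_WRITE);
    fwrite($fp, "abc"); // first filter() call, one bucket
    fwrite($fp, "def"); // second filter() call, one bucket
    rewind($fp);
    $reversed = stream_get_contents($fp);
    fclose($fp);

    var_dump($reversed); // "cbafed"
    ```

    Each fwrite() is filtered separately, so the result is "cbafed" rather than "fedcba".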

    class append_filter extends php_user_filter {
        public $stream;

        public function filter($in, $out, &$consumed, $closing): int {
            while ($bucket = stream_bucket_make_writeable($in)) {
                $consumed += $bucket->datalen;
                stream_bucket_append($out, $bucket);
            }
            // always append a terminating \n
            if ($closing) {
                // $in may be empty at this point, so create a fresh bucket.
                $bucket = stream_bucket_new($this->stream, "\n");
                stream_bucket_append($out, $bucket);
            }
            return PSFS_PASS_ON;
        }
    }

    stream_filter_register("append", "append_filter");
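
    A usage sketch for this pattern, under the assumption that the stream is closed explicitly with fclose() so that filter() is called once more with $closing set. It writes to a temporary file because the result can only be inspected after the stream has been closed; demo_append_filter and demo.append are made-up names:

    ```php
    <?php
    // Hypothetical copy of the appending filter under a different name.
    class demo_append_filter extends php_user_filter
    {
        public function filter($in, $out, &$consumed, $closing): int
        {
            while ($bucket = stream_bucket_make_writeable($in)) {
                $consumed += $bucket->datalen;
                stream_bucket_append($out, $bucket);
            }
            if ($closing) {
                // Nothing is left in $in here, so a new bucket is needed.
                stream_bucket_append($out, stream_bucket_new($this->stream, "\n"));
            }
            return PSFS_PASS_ON;
        }
    }

    stream_filter_register("demo.append", "demo_append_filter");

    $path = tempnam(sys_get_temp_dir(), "bb");
    $fp = fopen($path, "w");
    stream_filter_append($fp, "demo.append", STREAM_FILTER_WRITE);
    fwrite($fp, "line");
    fclose($fp); // triggers the final filter() call with $closing set

    $written = file_get_contents($path);
    unlink($path);

    var_dump($written); // "line\n"
    ```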
    

    With that (and the existing documentation about php_user_filter class), one should be able to do all sorts of magic userland stream filtering by combining all these powerful possibilities into even stronger code.