RecursiveFilterIterator re-instantiated within RecursiveIteratorIterator?


The standard way to recursively scan directories via SPL iterators is:

$files = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator($path),
    RecursiveIteratorIterator::CHILD_FIRST
);

foreach ($files as $file) {
    print $file->getPathname() . PHP_EOL;
}

I want a composable set of filters for my recursive file search, so that I can apply more than one filter to the directory structure scanned by a RecursiveDirectoryIterator.

My setup code:

$filters = new FilterRuleset(
    new RecursiveDirectoryIterator($path)
);
$filters->addFilter(new FilterLapsedDirs);
$filters->addFilter(new IncludeExtension('wav'));
$files = new RecursiveIteratorIterator(
    $filters, RecursiveIteratorIterator::CHILD_FIRST
);

I thought I could apply N filters by using a rule set:

class FilterRuleset extends RecursiveFilterIterator {
    private $filters = array();

    public function addFilter($filter) {
        $this->filters[] = $filter;
    }

    public function accept() {
        $file = $this->current();

        foreach ($this->filters as $filter) {
            if (!$filter->accept($file)) {
                return false;
            }
        }

        return true;
    }
}

The filtering I set up is not working as intended. When I check the filters in FilterRuleset they are populated on the first call, then blank on subsequent calls. It's as if internally RecursiveIteratorIterator is re-instantiating my FilterRuleset.

    public function accept() {
        print_r($this->filters);
        $file = $this->current();

        foreach ($this->filters as $filter) {
            if (!$filter->accept($file)) {
                return false;
            }
        }

        return true;
    }

Output:

Array
(
    [0] => FilterLapsedDirs Object
        (
        )

    [1] => IncludeExtension Object
        (
            [ext:private] => wav
        )
)
Array
(
)
Array
(
)
Array
(
)
Array
(
)
Array
(
)
Array
(
)

I'm using PHP 5.1.6 but have tested it on 5.4.14 and there's no difference. Any ideas?


Solution

  • When I check the filters in FilterRuleset they are populated on the first call, then blank on subsequent calls. It's as if internally RecursiveIteratorIterator is re-instantiating my FilterRuleset.

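    Yes, this is exactly the case. Each time the iteration descends into a subdirectory, RecursiveIteratorIterator asks your filter for its children via getChildren(). The inherited RecursiveFilterIterator::getChildren() answers by creating a new instance of your class around the inner iterator's children, and that new instance starts with an empty $filters array.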
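
A minimal sketch to make the re-instantiation visible, using a RecursiveArrayIterator instead of a directory so that it runs anywhere. CountingFilter and its $instances counter are hypothetical names for this demo only. The #[\ReturnTypeWillChange] line is a plain comment before PHP 8 and only silences the return-type deprecation notice on PHP 8.1+.

```php
<?php
// Hypothetical demo class: counts how often its constructor runs.
class CountingFilter extends RecursiveFilterIterator
{
    public static $instances = 0;

    public function __construct(RecursiveIterator $iterator)
    {
        parent::__construct($iterator);
        self::$instances++;
    }

    #[\ReturnTypeWillChange]
    public function accept()
    {
        return true; // accept everything, we only care about construction
    }
}

// Two nested arrays stand in for two levels of subdirectories.
$tree = new RecursiveArrayIterator(array(
    'a' => array('b' => array('c' => 1)),
    'd' => 2,
));

// We construct the filter exactly once ourselves ...
$root = new CountingFilter($tree);
foreach (new RecursiveIteratorIterator($root) as $value) {
    // just walk the tree
}

// ... yet the inherited getChildren() built one more instance per nested level.
echo CountingFilter::$instances, PHP_EOL; // prints 3
```

Any state that is not passed through the constructor, such as the filters added later via addFilter(), is therefore missing on the instances created for the children.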

    So you've got two options here:

    1. Apply the filters to the flattened iteration, that is, after the tree traversal. This looks feasible in your case as long as you only need to filter each individual file and do not need to decide which children are descended into.
    2. The standard way: make sure that getChildren() returns a FilterRuleset object (a recursive filter iterator) that has the filters set.

    I'll start with the second one because it is quickly done and is the normal way to do this.

    You override the parent's getChildren() method by adding it to your class. There you take the result of the parent method (which is the new FilterRuleset for the children) and set its private member. In case you wonder why this works even though the member is private: PHP enforces visibility per class, not per object, so one instance can access the private properties of another instance of the same class. You then just return it and you're done:

    class FilterRuleset extends RecursiveFilterIterator
    {
        private $filters = array();
    
        ...
    
        public function getChildren() {
            $children = parent::getChildren();
            $children->filters = $this->filters;
            return $children;
        }
    }
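
For reference, a self-contained sketch of this second variant with the elided parts filled in from the question's code. The nested array and the SkipOddNumbers filter are hypothetical stand-ins for the directory tree and the real filters, chosen so the example runs without touching the filesystem. Note that in the recursive variant accept() is also called for containers (directories), and rejecting one prunes its whole subtree, so a filter meant for leaves presumably has to let containers through, as the stand-in does here.

```php
<?php
class FilterRuleset extends RecursiveFilterIterator
{
    private $filters = array();

    public function addFilter($filter)
    {
        $this->filters[] = $filter;
    }

    // Comment before PHP 8; silences the return-type deprecation on 8.1+.
    #[\ReturnTypeWillChange]
    public function accept()
    {
        $item = $this->current();
        foreach ($this->filters as $filter) {
            if (!$filter->accept($item)) {
                return false;
            }
        }
        return true;
    }

    #[\ReturnTypeWillChange]
    public function getChildren()
    {
        // The parent creates a fresh FilterRuleset for the children;
        // hand our filters over before returning it.
        $children = parent::getChildren();
        $children->filters = $this->filters;
        return $children;
    }
}

// Hypothetical filter: keeps even numbers, lets containers through.
class SkipOddNumbers
{
    public function accept($value)
    {
        return is_array($value) || $value % 2 === 0;
    }
}

$ruleset = new FilterRuleset(new RecursiveArrayIterator(array(
    1, 2, array(3, 4, array(5, 6)),
)));
$ruleset->addFilter(new SkipOddNumbers());

// Flatten the leaves; false discards the keys, which repeat per level.
$kept = iterator_to_array(new RecursiveIteratorIterator($ruleset), false);
print_r($kept); // 2, 4, 6: the filter is applied on every level
```

Without the getChildren() override, the nested levels would be iterated by rulesets with no filters, so 3 and 5 would come through as well.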
    

    The other (first) variant is that you basically "degrade" it to a "flat" filter, that is, a standard FilterIterator. You first do the recursive iteration with a RecursiveIteratorIterator and then wrap that in your filter iterator. As the tree has already been traversed by the earlier iterator, none of the recursive machinery is needed any longer.

    So first of all, turn it into a FilterIterator:

    class FilterRuleset extends FilterIterator
    {
       ...
    }
    

    The only change is the class you extend. Then you instantiate things in a slightly different order:

    $path  = __DIR__;
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($path, RecursiveDirectoryIterator::SKIP_DOTS),
        RecursiveIteratorIterator::CHILD_FIRST
    );
    
    $filtered = new FilterRuleset($files);
    $filtered->addFilter(Accept::byCallback(function () {
        return true;
    }));
    
    foreach ($filtered as $file) {
        echo $file->getPathname(), PHP_EOL;
    }
    

    I hope these examples are clear. If you play around with them and run into a problem (or even if not), feedback is always welcome.

    Ah, and before I forget: here is the mock I used to create the filters in my example above:

    class Accept
    {
        private $callback;
    
        public function __construct($callback) {
            $this->callback = $callback;
        }
    
        public function accept($subject) {
            return call_user_func($this->callback, $subject);
        }
    
        public static function byCallback($callback) {
            return new self($callback);
        }
    }
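
Putting the flat variant together with the Accept mock, here is a runnable sketch with an extension filter instead of the accept-everything callback. The temporary directory layout, the file names and the two callbacks are assumptions made up for the demo; the question's own IncludeExtension and FilterLapsedDirs classes are not shown, so they are not reproduced here.

```php
<?php
class FilterRuleset extends FilterIterator
{
    private $filters = array();

    public function addFilter($filter)
    {
        $this->filters[] = $filter;
    }

    // Comment before PHP 8; silences the return-type deprecation on 8.1+.
    #[\ReturnTypeWillChange]
    public function accept()
    {
        $file = $this->current();
        foreach ($this->filters as $filter) {
            if (!$filter->accept($file)) {
                return false;
            }
        }
        return true;
    }
}

class Accept
{
    private $callback;

    public function __construct($callback)
    {
        $this->callback = $callback;
    }

    public function accept($subject)
    {
        return call_user_func($this->callback, $subject);
    }

    public static function byCallback($callback)
    {
        return new self($callback);
    }
}

// Build a small throwaway tree: a.wav, b.txt and sub/c.wav.
$root = sys_get_temp_dir() . '/ruleset_demo_' . uniqid();
mkdir($root . '/sub', 0777, true);
touch($root . '/a.wav');
touch($root . '/b.txt');
touch($root . '/sub/c.wav');

// Traverse first, filter afterwards.
$files = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator($root, RecursiveDirectoryIterator::SKIP_DOTS),
    RecursiveIteratorIterator::CHILD_FIRST
);

$filtered = new FilterRuleset($files);
// CHILD_FIRST also yields directories, so drop them explicitly.
$filtered->addFilter(Accept::byCallback(function ($file) {
    return $file->isFile();
}));
$filtered->addFilter(Accept::byCallback(function ($file) {
    return strtolower(pathinfo($file->getFilename(), PATHINFO_EXTENSION)) === 'wav';
}));

$names = array();
foreach ($filtered as $file) {
    $names[] = $file->getFilename();
}
sort($names); // directory order is not guaranteed
print_r($names); // a.wav and c.wav

// Remove the throwaway tree again, children before their directories.
$cleanup = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator($root, RecursiveDirectoryIterator::SKIP_DOTS),
    RecursiveIteratorIterator::CHILD_FIRST
);
foreach ($cleanup as $entry) {
    $entry->isDir() ? rmdir($entry->getPathname()) : unlink($entry->getPathname());
}
rmdir($root);
```

Because the filters run on the already flattened list, both filters are applied to every entry and no getChildren() override is needed. The trade-off is that no subtree can be skipped during traversal.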