Using PHP's RecursiveFilterIterator

I was recently working on a project where I needed to recursively get all of the files with a particular extension inside a directory. Actually, I needed to find all files with a .php extension but not .html.php. Sounds like a perfect use for RecursiveDirectoryIterator, right? I could do something like the following.

<?php
$files = new RecursiveIteratorIterator(new RecursiveDirectoryIterator($dir));
foreach ($files as $path => $finfo) {
    if (substr($path, -4) != '.php' || substr($path, -9) == '.html.php') {
        continue;
    }
    // do stuff
}

But I figured I would give RecursiveFilterIterator a try. For those who don't know, all of the classes mentioned are part of PHP's SPL extension. RecursiveFilterIterator is actually an abstract class, so you have to extend it and implement the accept method. That method returns a boolean: false skips the current item and true passes it on to the iteration loop.

<?php
class PHPFileIterator extends RecursiveFilterIterator
{
    public function accept()
    {
        $file = parent::current();
        $name = $file->getFilename();
        return (substr($name, -4) == '.php' && substr($name, -9) != '.html.php');
    }
}
$files = new PHPFileIterator(new RecursiveDirectoryIterator($dir));

I thought that because RecursiveFilterIterator implements OuterIterator I could just pass it to a foreach statement. However, running this produced no results. Upon further inspection, the loop was hitting the first sub-directory and stopping there. Reading the user comments on the documentation page for RecursiveFilterIterator shows that you still need to wrap RecursiveFilterIterator in a RecursiveIteratorIterator. Sigh, OK.

<?php
$files = new RecursiveIteratorIterator(new PHPFileIterator(new RecursiveDirectoryIterator($dir)));

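But this still did not work. It turns out it was not iterating down into the sub-directories. A directory name does not end in .php, so accept was returning false for every directory and the iterator never got the chance to descend into it. In the accept method I also had to return true when a directory was encountered.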

<?php
class PHPFileIterator extends RecursiveFilterIterator
{
    public function accept()
    {
        $file = parent::current();
        if ($file->isDir()) {
            return true;
        }
        $name = $file->getFilename();
        return (substr($name, -4) == '.php' && substr($name, -9) != '.html.php');
    }
}
$files = new RecursiveIteratorIterator(new PHPFileIterator(new RecursiveDirectoryIterator($dir)));

OK, finally we are getting somewhere. There was one last hitch: I also had to tell the RecursiveDirectoryIterator to skip the dot entries (. and ..). This is what I ended up with.

<?php
class PHPFileIterator extends RecursiveFilterIterator
{
    public static function factory($dir)
    {
        return new RecursiveIteratorIterator(
            new PHPFileIterator(
                new RecursiveDirectoryIterator(
                    $dir,
                    FilesystemIterator::CURRENT_AS_FILEINFO | FilesystemIterator::SKIP_DOTS
                )
            )
        );
    }
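    // The bool return type avoids a deprecation notice on PHP 8.1 and later.
    public function accept(): bool
    {
        $file = parent::current();
        // Always accept directories, otherwise the iterator cannot recurse into them.
        if ($file->isDir()) {
            return true;
        }
        // Keep *.php files, but not *.html.php templates.
        $name = $file->getFilename();
        return (substr($name, -4) == '.php' && substr($name, -9) != '.html.php');
    }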
}

$files = PHPFileIterator::factory($dir);
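
To see what the factory actually yields, here is a small sketch that runs it against a throwaway directory tree. The class name DemoPhpFilter, the temp directory and the file names are all made up for this demo; the class is just a compact copy of the filter above so the snippet can run on its own.

```php
<?php
// Compact copy of the filter above under a hypothetical name (DemoPhpFilter).
class DemoPhpFilter extends RecursiveFilterIterator
{
    public static function factory($dir)
    {
        $flags = FilesystemIterator::CURRENT_AS_FILEINFO | FilesystemIterator::SKIP_DOTS;
        return new RecursiveIteratorIterator(
            new self(new RecursiveDirectoryIterator($dir, $flags))
        );
    }

    public function accept(): bool
    {
        $file = $this->current();
        if ($file->isDir()) {
            return true; // keep directories so the iterator can descend into them
        }
        $name = $file->getFilename();
        return substr($name, -4) === '.php' && substr($name, -9) !== '.html.php';
    }
}

// Build a small throwaway tree (paths and file names are assumptions for the demo).
$dir = sys_get_temp_dir() . '/php_filter_demo_' . uniqid();
mkdir($dir . '/views', 0777, true);
touch($dir . '/index.php');
touch($dir . '/readme.txt');
touch($dir . '/views/helper.php');
touch($dir . '/views/page.html.php');

$found = [];
foreach (DemoPhpFilter::factory($dir) as $path => $finfo) {
    // Keys are full path names by default; keep only the part relative to $dir.
    $found[] = str_replace('\\', '/', substr($path, strlen($dir) + 1));
}
sort($found);
print_r($found); // $found is ['index.php', 'views/helper.php']
```

Only the two plain .php files come back. The directories themselves are not in the result because RecursiveIteratorIterator defaults to its LEAVES_ONLY mode.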

It's a shame that I need to use three objects to do this. And with the FilesystemIterator constants thrown in it's a lot of typing, thus the factory method. There are some other takeaways from this exercise.

  • The RecursiveFilterIterator will let you skip an entire branch of a tree structure by returning false from accept. Imagine creating a URL matching router using this. It could be quite powerful.
  • One of the downsides I ran into while running some tests: if you want to look for directories with a certain name or pattern, you cannot do it with RecursiveFilterIterator alone. You have to return true for a directory in accept, otherwise the iterator won't recurse down into the sub-directories.
  • I also found it a bit weird that FilesystemIterator has SKIP_DOTS enabled by default but RecursiveDirectoryIterator does not. I guess these kinds of inconsistencies are to be expected though; it is PHP.
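
The first takeaway, skipping a whole branch, can be sketched roughly like this. The SkipVendorFilter class, the "vendor" directory name and the demo tree are all hypothetical; the only point is that returning false for a directory keeps the iterator out of everything below it.

```php
<?php
// Hypothetical filter: prunes every directory named "vendor", keeps everything else.
class SkipVendorFilter extends RecursiveFilterIterator
{
    public function accept(): bool
    {
        $file = $this->current();
        // Returning false for a directory means its children are never visited.
        return !($file->isDir() && $file->getFilename() === 'vendor');
    }
}

// Throwaway tree for the demo (names are assumptions).
$dir = sys_get_temp_dir() . '/prune_demo_' . uniqid();
mkdir($dir . '/src', 0777, true);
mkdir($dir . '/vendor/lib', 0777, true);
touch($dir . '/src/app.php');
touch($dir . '/vendor/lib/dep.php');

$it = new RecursiveIteratorIterator(
    new SkipVendorFilter(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    )
);

$seen = [];
foreach ($it as $path => $finfo) {
    // Store paths relative to $dir with forward slashes.
    $seen[] = str_replace('\\', '/', substr($path, strlen($dir) + 1));
}
print_r($seen); // $seen is ['src/app.php']; nothing under vendor/ was visited
```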
