I was recently working on a project that required iterating over a directory of files. Whenever I do this I reach for my old friend RecursiveDirectoryIterator. In this case I only needed the full path to each file. I was relying on the fact that RecursiveDirectoryIterator returns the full path as the key on each iteration step by default, and ignoring the value, which is an SplFileInfo object. Looking through the documentation, I saw there was a way to make RecursiveDirectoryIterator return just the full path as the value. However, I ran into a bug that led me to dig into the internals of the SPL and figure out how things work.
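For reference, here is a minimal sketch of the key-as-full-path approach described above. The temporary directory and file names are made up for illustration, and wrapping the iterator in RecursiveIteratorIterator is just one common way to walk the whole tree, not necessarily what the original project did.

```php
<?php
// Build a small throwaway tree (hypothetical names) so the sketch is self-contained.
$root = sys_get_temp_dir() . '/' . uniqid('spl_demo_');
mkdir($root . '/sub', 0777, true);
touch($root . '/a.txt');
touch($root . '/sub/b.txt');

// SKIP_DOTS leaves out the "." and ".." entries; key and value modes stay at their defaults.
$dir = new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS);

$paths = [];
// RecursiveIteratorIterator walks into subdirectories and, by default, yields only leaf entries.
foreach (new RecursiveIteratorIterator($dir) as $fullPath => $info) {
    // The key is the full path; $info is an SplFileInfo that is ignored here.
    $paths[] = $fullPath;
}
sort($paths);
print_r($paths);

// Remove the temporary tree again.
foreach ($paths as $path) {
    unlink($path);
}
rmdir($root . '/sub');
rmdir($root);
```

The loop only ever touches the key, which is why the SplFileInfo value can be ignored entirely.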
Besides RecursiveDirectoryIterator, there are two other main SPL classes for iterating over files in a directory: DirectoryIterator and FilesystemIterator. When SPL was first introduced in PHP 5.0, FilesystemIterator did not exist; it was added in PHP 5.3. I never understood the difference between it and DirectoryIterator until now, and the documentation does not give many clues. This matters for RecursiveDirectoryIterator because that class used to extend DirectoryIterator. When FilesystemIterator was introduced, RecursiveDirectoryIterator was changed to extend it instead. The reasons for this will become clear later. So what are the differences between DirectoryIterator and FilesystemIterator? Here is what I found out.
DirectoryIterator
When you iterate using DirectoryIterator, each value returned is that same DirectoryIterator object. Its internal state is changed on each step so that when you call isDir(), getPathname(), or similar methods, the correct information for the current entry is returned. If you ask for a key when iterating, you get an integer index.
<?php
$files = new DirectoryIterator(/*...*/);
// $index is an integer position; $iterator is the same object as $files on every step.
foreach ($files as $index => $iterator) {
    /*...*/
}
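To make the "same object" behavior concrete, here is a small sketch that checks it directly. The temporary directory and file names are assumptions for illustration only.

```php
<?php
// Throwaway directory with two files (hypothetical names).
$dir = sys_get_temp_dir() . '/' . uniqid('spl_demo_');
mkdir($dir);
touch($dir . '/a.txt');
touch($dir . '/b.txt');

$files = new DirectoryIterator($dir);

$keys = [];
$sameObject = true;
foreach ($files as $index => $item) {
    $keys[] = $index;
    // Each value is the iterator itself, repositioned on the current entry.
    $sameObject = $sameObject && ($item === $files);
}

var_dump($sameObject); // whether every value was the iterator object itself
var_dump($keys);       // the keys are integer positions, not paths

// Clean up.
unlink($dir . '/a.txt');
unlink($dir . '/b.txt');
rmdir($dir);
```

One practical consequence: storing `$item` in an array for later use does not keep a snapshot of each entry, because every stored element is the same iterator object.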
FilesystemIterator
FilesystemIterator (and thus RecursiveDirectoryIterator), on the other hand, returns a new, different SplFileInfo object for each iteration step. The key is the full pathname of the file.
<?php
$files = new FilesystemIterator(/*...*/);
// $fullPath is the full pathname; $info is a fresh SplFileInfo object on every step.
foreach ($files as $fullPath => $info) {
    /*...*/
}
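The contrast with DirectoryIterator can be checked with a sketch like the following. Again, the directory and file names are made up for the example.

```php
<?php
// Throwaway directory with two files (hypothetical names).
$dir = sys_get_temp_dir() . '/' . uniqid('spl_demo_');
mkdir($dir);
touch($dir . '/a.txt');
touch($dir . '/b.txt');

// No flags argument, so the default key and value modes apply.
$files = new FilesystemIterator($dir);

$seen = [];
foreach ($files as $fullPath => $info) {
    // Keep every value so the objects can be compared afterwards.
    $seen[$fullPath] = $info;
}
ksort($seen);
[$first, $second] = array_values($seen);

var_dump($first instanceof SplFileInfo); // values are SplFileInfo objects
var_dump($first !== $second);            // a different object per step
var_dump($first !== $files);             // and not the iterator itself
print_r(array_keys($seen));              // keys are full pathnames

// Clean up.
unlink($dir . '/a.txt');
unlink($dir . '/b.txt');
rmdir($dir);
```

Because each value is its own object, collecting them in an array here does give one independent entry per file.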
This is the default behavior. You can change what is returned for the key or value using the flags argument to the constructor. Your choices are:
CURRENT_AS_PATHNAME
CURRENT_AS_FILEINFO
CURRENT_AS_SELF (note that this makes FilesystemIterator and RecursiveDirectoryIterator behave like DirectoryIterator)
KEY_AS_PATHNAME
KEY_AS_FILENAME
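As a rough sketch of how these flags are combined, the example below tries KEY_AS_FILENAME and CURRENT_AS_SELF on a plain FilesystemIterator. The directory and file names are assumptions, and SKIP_DOTS is added explicitly so that "." and ".." do not show up in the results.

```php
<?php
// Throwaway directory with two files (hypothetical names).
$dir = sys_get_temp_dir() . '/' . uniqid('spl_demo_');
mkdir($dir);
touch($dir . '/a.txt');
touch($dir . '/b.txt');

// Key is the bare filename; value is still an SplFileInfo.
$byName = new FilesystemIterator(
    $dir,
    FilesystemIterator::KEY_AS_FILENAME
        | FilesystemIterator::CURRENT_AS_FILEINFO
        | FilesystemIterator::SKIP_DOTS
);
$names = [];
foreach ($byName as $name => $info) {
    $names[] = $name;
}
sort($names);
print_r($names);

// Value is the iterator itself, as with DirectoryIterator.
$asSelf = new FilesystemIterator(
    $dir,
    FilesystemIterator::CURRENT_AS_SELF | FilesystemIterator::SKIP_DOTS
);
$allSelf = true;
foreach ($asSelf as $path => $item) {
    $allSelf = $allSelf && ($item === $asSelf);
}
var_dump($allSelf);

// Clean up.
unlink($dir . '/a.txt');
unlink($dir . '/b.txt');
rmdir($dir);
```

The flags are bit flags, so a key mode and a value mode are joined with `|` in a single constructor argument.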
The bug I ran into has to do with the CURRENT_AS_PATHNAME option. Using it will cause PHP to throw a fatal exception. I wrote a fix and submitted it as a pull request on GitHub, but as of the date of this blog post it has yet to be merged.
I'm not sure about all of the history, or why DirectoryIterator returned itself while RecursiveDirectoryIterator returned new SplFileInfo objects when SPL was created, but it is clear the FilesystemIterator class was introduced to make the API a little bit cleaner.