Structuring WordPress’ upload directory to handle lots of files

This is a quick post about organizing WordPress’ uploads directory, wp-content/uploads, for a large number of files using a file name-derived directory structure.

It’s 2018 and having too many files in one directory can still hurt performance for a number of tools and standard operations like directory listings. It doesn’t help that a filesystem is theoretically capable of supporting 18.4 quintillion (2⁶⁴) files per directory when you can’t do anything practical with that many files.

The need for responsive images (image files that adapt to the visitor’s screen size) amplifies the problem, because every uploaded image file is stored along with several differently scaled copies.

WordPress Core offers exactly one option that helps reduce the number of files stored in a single upload directory. It’s called “Organize my uploads into month- and year-based folders”, and it places files into sub-directories based on the year and month the file was uploaded. You get one directory per year, up to twelve month directories inside each year, and everything uploaded in a given month goes into that month’s directory.

This approach is fine for most uses but has some disadvantages:

  • It doesn’t scale well if you upload hundreds of images per month (× number of responsive image sizes)
  • It places a date in the uploaded file’s URL which may look out of place when you’re reusing old images for new content

There are a million different approaches to this particular problem. I chose to derive a short hash from the file name of each upload, and work with its hexadecimal representation (0-9, a-f). I used the first two characters of the hash to get a first level of 256 (16²) possible directories, and the next two characters to get another 256 possible sub-directories inside each of those. This gave me 65 536 (16⁴) possible directories for uploaded files, each of which would only contain a small number of files. A third level in this hierarchy would create 16 777 216 (16⁶) possible directories, but this was quite a bit more than I needed.
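To make the arithmetic concrete, here’s a minimal PHP sketch of the two-level scheme. The function name hashed_prefix is made up for illustration, and md5() is only an assumed choice of hash. Any hash with well-mixed hexadecimal output would work the same way.

```php
<?php
/**
 * Hypothetical helper: derive a two-level directory prefix from a file name.
 * md5() is an assumption; the scheme only needs evenly spread hex digits.
 */
function hashed_prefix( $filename ) {
    $hash   = md5( $filename );        // 32 hexadecimal characters
    $level1 = substr( $hash, 0, 2 );   // one of 16^2 = 256 directories
    $level2 = substr( $hash, 2, 2 );   // one of 256 sub-directories
    return $level1 . '/' . $level2;
}

// Prints two pairs of hex digits separated by a slash.
echo hashed_prefix( 'holiday-photo.jpg' ), "\n";

// Number of possible directories with two and three levels.
echo 16 ** 4, ' ', 16 ** 6, "\n";
```

Because the result depends only on the file name, calling the function twice with the same name always gives the same prefix.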

Inside each of these directories I created a directory named after the file and put the file and all its responsive image sizes in it. It’s faster to delete a directory than each of the individual files, and it keeps all the files representing the same image bundled up in one place.

With a reasonable choice of hashing algorithm, this should distribute the files evenly over all the possible directories. Another benefit is that since the hash only depends on the file name, the same image will always end up in the same directory — even if the date ticks over into a new year while processing or uploading.
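A quick way to sanity-check the distribution claim is to hash a batch of made-up file names and count how many land in each directory. This sketch assumes md5() and synthetic names like image-0.jpg. Real file names may behave differently, so treat it as a rough check rather than proof.

```php
<?php
// Count how many synthetic file names fall into each of the 65 536 buckets.
$total  = 200000;
$counts = array();

for ( $i = 0; $i < $total; $i++ ) {
    // The first four hex characters identify the two directory levels.
    $bucket            = substr( md5( "image-$i.jpg" ), 0, 4 );
    $counts[ $bucket ] = ( $counts[ $bucket ] ?? 0 ) + 1;
}

printf( "buckets used: %d of 65536\n", count( $counts ) );
printf( "fullest bucket: %d files (mean is %.1f)\n", max( $counts ), $total / 65536 );
```

If the fullest bucket holds only a handful more files than the mean, the hash is spreading the names well enough for this purpose.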

This is similar to the approach you’ll see used by disk storage-backed HTTP caching servers.

I’ve not included the code for my specific implementation in this article as I don’t believe it would be directly useful for anyone else. You can implement your own custom directory structure by hooking into the wp_handle_upload_prefilter, wp_handle_upload, and upload_dir filters provided by WordPress Core. Add your upload_dir filter from within your wp_handle_upload_prefilter callback and remove it again in your wp_handle_upload callback, so it only applies while an upload is being handled. Remember that you must also create any new sub-directories you come up with inside your upload_dir filter function.
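For reference, the hook sequence could be wired up roughly like this. It’s a minimal sketch and not the implementation described above: the myprefix_ function names and the myprefix_subdir global are hypothetical, md5() is an assumed hash, and error handling is left out.

```php
<?php
/**
 * Hypothetical helper: map a file name to "/ab/cd/<name without extension>".
 * md5() is an assumption; the post does not say which hash was used.
 */
function myprefix_hashed_subdir( $filename ) {
    $hash = md5( $filename );
    $stem = pathinfo( $filename, PATHINFO_FILENAME );
    return '/' . substr( $hash, 0, 2 ) . '/' . substr( $hash, 2, 2 ) . '/' . $stem;
}

// Runs before the upload is moved: remember the target and enable the filter.
function myprefix_upload_prefilter( $file ) {
    $name                       = sanitize_file_name( $file['name'] );
    $GLOBALS['myprefix_subdir'] = myprefix_hashed_subdir( $name );
    add_filter( 'upload_dir', 'myprefix_upload_dir' );
    return $file;
}

// Rewrites the upload location and makes sure the directory exists.
function myprefix_upload_dir( $dirs ) {
    $subdir         = $GLOBALS['myprefix_subdir'];
    $dirs['subdir'] = $subdir;
    $dirs['path']   = $dirs['basedir'] . $subdir;
    $dirs['url']    = $dirs['baseurl'] . $subdir;
    wp_mkdir_p( $dirs['path'] );
    return $dirs;
}

// Runs after the upload has been handled: disable the filter again.
function myprefix_upload_done( $upload ) {
    remove_filter( 'upload_dir', 'myprefix_upload_dir' );
    return $upload;
}

// Only register the hooks when running inside WordPress.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'wp_handle_upload_prefilter', 'myprefix_upload_prefilter' );
    add_filter( 'wp_handle_upload', 'myprefix_upload_done' );
}
```

Two caveats with this sketch: WordPress may still rename the file if one with the same name already exists in the target directory, and if an upload fails before wp_handle_upload runs, the upload_dir filter stays registered for the rest of the request.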