Cut I/O-bound Rakefile task evaluation time by 12.4%

Do you regularly run Rake on projects with thousands of FileTasks? In that case, chances are your Rake execution time is I/O bound. I’ve created two patches that can cut ⅛ off your Rakefile task evaluation/compilation time.

Rake is the task and build automation tool for the Ruby programming language. It’s distributed as a part of the Ruby Standard Library default set of tools and modules. It’s a make-like tool that incrementally rebuilds only the parts of your project that have changed. It tracks changes by querying the file system for the last modified timestamp of every source and object file in your project. Each query requires its own system call (syscall) to the operating system kernel.
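
To make the timestamp comparison concrete, here is a minimal sketch of a make-like freshness check. The `needs_rebuild?` helper and the file names are hypothetical and only illustrate the idea; this is not Rake's actual code.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper (not Rake's API): true when `target` must be rebuilt.
def needs_rebuild?(target, sources)
  return true unless File.exist?(target)   # one syscall for the target
  target_time = File.mtime(target)         # a second syscall for the same file
  # Up to one more syscall per source file.
  sources.any? { |source| File.mtime(source) > target_time }
end

Dir.mktmpdir do |dir|
  source = File.join(dir, "main.c")
  object = File.join(dir, "main.o")
  FileUtils.touch(source)

  puts needs_rebuild?(object, [source])  # true: the object file doesn't exist yet

  FileUtils.touch(object)
  now = Time.now
  File.utime(now, now - 60, source)      # make the source a minute older
  File.utime(now, now, object)
  puts needs_rebuild?(object, [source])  # false: the object is newer than its source
end
```

Every call to `File.exist?` and `File.mtime` in the sketch goes to the kernel, so the number of syscalls grows with the number of files in the project.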

The syscalls are inexpensive on modern fast storage devices. However, small numbers add up quickly. Even seemingly instant syscalls will negatively affect performance when they’re performed thousands of times.
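
If you want a rough feel for how these lookups add up on your own system, a small measurement like the following may help. It's a sketch only: `time_mtime_lookups` is a hypothetical helper, the files are empty throwaway files, and the result depends heavily on your storage, file system, and operating system.

```ruby
require "benchmark"
require "fileutils"
require "tmpdir"

# Hypothetical helper: create `count` empty files in a temporary directory
# and return the seconds spent looking up each file's modification time once.
def time_mtime_lookups(count)
  Dir.mktmpdir do |dir|
    paths = Array.new(count) { |i| File.join(dir, "file#{i}.o") }
    FileUtils.touch(paths)
    Benchmark.realtime { paths.each { |path| File.mtime(path) } }
  end
end

elapsed = time_mtime_lookups(5_000)
puts format("5,000 mtime lookups took %.1f ms", elapsed * 1000)
```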

Ruby always yields the Global Virtual Machine Lock (GVL), and with it Rake’s per-task thread execution, for every method that calls rb_stat() (including File.exist? and File.mtime). Yielding an I/O-bound thread makes sense: it lets the processor do something more useful than wait for a syscall. However, excessive context switching wastes more time than it saves, because data is needlessly moved in and out of the different CPU cache levels and RAM.

Rake can’t do away with the syscalls to query your files’ modification timestamps. They’re a critical part of the functionality of the software. Of course, I’m assuming that checking the modification timestamp is less expensive than recompiling your sources on every run even when they haven’t changed.

I profiled a Rake project when I noticed that Rake was querying the file system more often per file than I expected. The extra queries added up to tens of thousands of unnecessary syscalls in my project. I proceeded to waste hours assuming the problem was somewhere in the task generator I used in my project.

It took me only a minute to identify and fix the problem when I finally turned my attention to Rake itself. Rake queries the file system to see if each object file in the project exists. FileTasks whose files don’t yet exist get built. However, for files that do exist, Rake checks for their existence a second time before it queries the file system yet again to get the file’s modification timestamp.

There’s no point in checking if the file exists again. This check is another syscall required for each file in your project. Instead, Rake should assume the file exists when checking its modification time, and catch any errors that might arise if it doesn’t. This approach also makes the query an atomic (self-contained) operation (the file might disappear between checking for its existence and its modification time).
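
The difference between the two approaches can be sketched as follows. Both helpers are hypothetical illustrations rather than Rake's actual code or my patch, and returning `nil` for a missing file is an assumption made for the example.

```ruby
require "fileutils"
require "tmpdir"

# Check-then-use: two syscalls, with a window between them
# in which the file can disappear.
def timestamp_checked(path)
  File.exist?(path) ? File.mtime(path) : nil
end

# Ask directly and handle the failure: one syscall, no window.
def timestamp_rescued(path)
  File.mtime(path)
rescue Errno::ENOENT
  nil  # the file doesn't exist
end

Dir.mktmpdir do |dir|
  present = File.join(dir, "present.o")
  missing = File.join(dir, "missing.o")
  FileUtils.touch(present)

  # Both helpers agree on the result; only the number of syscalls differs.
  puts timestamp_checked(present) == timestamp_rescued(present)  # true
  puts timestamp_rescued(missing).inspect                        # nil
end
```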

A good night’s sleep later, I realized that I could get rid of the other file existence check too. A single syscall to check the file’s modification time is sufficient to provide all the required information.
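
As a rough illustration of how one lookup can answer both questions (does the file exist, and is it out of date?), consider this sketch. The `target_needed?` helper and its parameters are hypothetical and don't reflect Rake's real method names or the exact shape of my patches.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: decide whether a target needs rebuilding
# with a single modification time lookup.
def target_needed?(target, newest_source_time)
  File.mtime(target) < newest_source_time
rescue Errno::ENOENT
  true  # a missing target always has to be built
end

Dir.mktmpdir do |dir|
  object = File.join(dir, "main.o")
  now = Time.now

  puts target_needed?(object, now)       # true: the file is missing

  FileUtils.touch(object)
  File.utime(now, now, object)
  puts target_needed?(object, now - 60)  # false: the target is newer than its sources
  puts target_needed?(object, now + 60)  # true: a source changed after the last build
end
```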

I’ve proposed these two changes to the Rake project. They’re two simple refactoring patches that get rid of the unnecessary syscalls. However, it might be a while before the patches are accepted upstream and become available in Rake. If you’re impatient and want to speed up Rake today, you can download and apply my patches (f8afda2b22.patch and abf5e26464.patch).

I regularly run Rake on projects with tens of thousands of files. A lot of time is spent waiting for Rake to finish evaluating which files need to be recompiled between runs. These patches cut 12.4% off my build times when there are no changes to the project. I measured this on a fast M.2 SSD in a performant computer configuration. You may see even larger gains with a slower storage backend or computer. Some environments where storage-related syscalls are known to be slow (I’m looking at you, Windows Subsystem for Linux (WSL2)!) will benefit quite a lot.

Ctrl.blog
