The Ruby programming language released version 3.1 in December 2021. Among the changes was a major update to Psych version 4.0, Ruby’s built-in YAML Ain’t Markup Language (YAML, a recursive acronym) interpreter. A major version change indicates incompatible changes, and version 4 sure does deliver on that promise.
It broke everything from modules in the Ruby standard library and major frameworks like Ruby on Rails, to Nanoc: the static-site generator powering this blog. Let’s dive in and explore the changes!
For over a decade, Psych’s load() and load_file() methods have (in practice) been aliases for the unsafe_load() and unsafe_load_file() methods. In version 4, these methods have changed to their safe_ prefixed equivalents by default.
A safe load is better than an unsafe one, right? Yeah, it’s probably a good call to change the default. However, it has also broken a large number of Ruby packages.
Developers use the load() and load_file() methods because they intuit the names and they work. (Intuitive method names are one of the Ruby language’s core strengths.) If developers had been aware of the safe_load() and safe_load_file() methods, they’d probably have used them instead.
So, what’s the difference between the safe and unsafe methods? The former are stricter about what they accept and won’t deserialize as many data types as the latter do by default. The stricter methods are a better choice when loading untrusted and potentially malicious YAML files. (Interpreters are always at risk of misinterpreting unexpected data.)
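For YAML you fully trust (your own configuration files, for instance), one possible way to keep the old permissive behavior is to call the unsafe method explicitly. The sketch below assumes a helper name, load_trusted_yaml, that is made up for illustration. It falls back to load() on older Psych releases that don't have unsafe_load() yet.

```ruby
require "yaml"
require "date"

# Hypothetical helper (not part of Psych): load YAML from a trusted source
# with the permissive, pre-4.0 behavior on both old and new Psych versions.
def load_trusted_yaml(source)
  if YAML.respond_to?(:unsafe_load)
    # Newer Psych: the permissive loader has to be requested by name.
    YAML.unsafe_load(source)
  else
    # Older Psych: load() is already the permissive loader.
    YAML.load(source)
  end
end

doc = load_trusted_yaml("released: 2021-12-25\n")
puts doc["released"].class
```

Only use this for input you control. Anything that comes from users or the network should go through the safe methods.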
Your use of YAML may be unaffected, or your program might fail after upgrading to Psych 4. Here are some highlights of the differences between safe_load() and unsafe_load() in Psych 4:
safe_load() disallows YAML aliases (a potentially recursive data structure) by default (override with the aliases: true argument).
safe_load() only deserializes a handful of default classes (below; extend the list with the permitted_classes argument).
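As a minimal sketch of the alias restriction, the snippet below feeds a small document with an anchor and an alias to safe_load(), first with the defaults and then with aliases enabled. The document contents and variable names are invented for the example.

```ruby
require "yaml"

# An anchor (&defaults) that is reused through an alias (*defaults).
yaml_with_alias = <<~YAML
  defaults: &defaults
    adapter: postgres
  development: *defaults
YAML

alias_error = nil
begin
  # Aliases are rejected unless explicitly enabled.
  YAML.safe_load(yaml_with_alias)
rescue Psych::BadAlias => e
  alias_error = e
end

# Opting in makes the same document load.
config = YAML.safe_load(yaml_with_alias, aliases: true)

puts alias_error.class
puts config["development"].inspect
```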
Here’s the default list of classes that can be deserialized when running in safe mode: TrueClass, FalseClass, NilClass, Integer, Float, String, Array, and Hash.
That should have your basics covered, right? Well, you might not have noticed, but the class list is missing the Date and Time classes. Many programs also expect Psych to deserialize
Almost everything I’ve ever used YAML for in Ruby includes a date or time object. If you do try to load a YAML file containing a date, you’ll now get the following error message:
Tried to load unspecified class: Date (Psych::DisallowedClass)
This is the same error you’d get with earlier versions if you explicitly used the safe_load() methods without allowing the classes. You can extend the list of allowed classes by adding a permitted_classes: [Date] argument. It’s not a big deal to add the argument once, but let’s just say I’ve had to add it in a lot of places after the update.
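To make the failure and the fix concrete, here is a small sketch that loads a document containing a date, first without and then with the permitted_classes argument. The sample document and variable names are invented for illustration.

```ruby
require "yaml"
require "date"

post = <<~YAML
  title: Hello
  published: 2022-01-15
YAML

date_error = nil
begin
  # Date is not on the default allow-list, so this raises.
  YAML.safe_load(post)
rescue Psych::DisallowedClass => e
  date_error = e
end

# Explicitly permitting Date lets the same document load.
meta = YAML.safe_load(post, permitted_classes: [Date])

puts date_error.message
puts meta["published"].class
```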
To further complicate matters, Psych 4 dropped support for legacy positional arguments and now requires the use of named arguments. If you used to rely on safe_load(yaml, [Date]), you now need to migrate to safe_load(yaml, permitted_classes: [Date]).
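If you find yourself repeating the same named arguments in many places, one option is to wrap them once. The following is only a sketch: the constant PERMITTED_YAML_CLASSES and the method load_front_matter are hypothetical names, and which classes you allow depends on your own data.

```ruby
require "yaml"
require "date"

# Hypothetical project-wide allow-list; adjust to what your files contain.
PERMITTED_YAML_CLASSES = [Date, Time, Symbol].freeze

# Hypothetical wrapper so the named arguments live in a single place.
def load_front_matter(yaml)
  YAML.safe_load(yaml, permitted_classes: PERMITTED_YAML_CLASSES, aliases: true)
end

data = load_front_matter("updated: 2022-01-15 10:30:00 Z\nkind: :post\n")

puts data["updated"].class
puts data["kind"].inspect
```

The named-argument form is also accepted by the later Psych 3 releases, so the same call can serve code that has to run on either side of the upgrade.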
The change away from positional arguments had been noted in the Psych module documentation for some years already. However, the software itself never issued deprecation warnings.
I believe the changes in and of themselves are good, and they were made with nothing but good intentions. However, the introduction of these incompatible changes was poorly handled.
Ideally, Psych should have printed warnings to standard error output (STDERR) to notify developers of the upcoming changes. Developers should have been given a heads-up and time to migrate. Instead, developers probably first noticed the changes when their programs stopped working.