Bitrot resistance of next-generation image formats

I’ve compared two next-generation image formats, AVIF and JPEG XL (JXL), to see which one better handles a random single corrupted bit. A meaningless exercise? Possibly. But half a picture of your beloved grandma is better than no picture at all.

What happens when a single bit gets corrupted in an image file you cherish? The results can range from absolutely nothing, through an imperceptible visual change, to a complete loss of the image. The hero image above sits somewhere in the middle of that scale: the top half of the image is perfect, and the lower half is reduced to meaningless digital noise.

Whether due to mechanical failure or transmission interference like cosmic radiation, bitrot happens. A rotted bit, or flipped bit, is one bit of RAM or persistent storage that has unintentionally flipped its state between zero and one. You can only do so much to protect your system from random failure. Keeping multiple backups and verifying your data are the only proven strategies to protect against it.

Traditional JPEG images (referred to as JPEG throughout the article, as opposed to JXL), especially with progressive encoding, handle bitrot remarkably well. You might see a single pixel shift its color almost imperceptibly, or one of the encoding layers may shift slightly. The effects are so well understood that you can even find free software that can automatically recover corrupted JPEG photos.

However, the next-generation image formats pack data much more densely than the legacy image formats. There isn’t just less redundancy; every single bit also means more to the complete image. This means the effects of bitrot produce a much greater loss of visual fidelity and decode to more abstract results. The newer encoding techniques include predictive models that can get thrown off completely by a single bit out of place. The digital hellfire in the lower half of the above hero image is a perfect example of this.

I devised a test to determine whether either of the new image formats handles this situation better than the other. I’d randomly flip 100 bits, one at a time, in 250 test images at three different resolutions (544×306, 1280×720, and 1632×918 px) in three formats (AVIF, JXL, and JPEG as the control). That’s 100 × 250 × 3 × 3 = 225 000 images in total. The encoding quality was randomized for each image to give the most varied input scenarios.
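To make the "one flipped bit at a time" step concrete, here is a minimal sketch of how such corrupted copies could be generated. It is not the script used for the test: the function names, the seeded random generator, and the choice to sample bit positions uniformly (with possible repeats) are all assumptions for illustration.

```python
import random


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    if not 0 <= bit_index < len(data) * 8:
        raise IndexError("bit_index is outside the buffer")
    corrupted = bytearray(data)
    byte_pos, bit_pos = divmod(bit_index, 8)
    corrupted[byte_pos] ^= 1 << bit_pos  # XOR toggles only the selected bit
    return bytes(corrupted)


def single_bit_variants(data: bytes, count: int, seed: int = 0):
    """Yield (bit_index, corrupted_copy) pairs, each with one flipped bit.

    Every variant starts from the pristine data, so flips never accumulate.
    Assumption: positions are drawn uniformly and may repeat.
    """
    rng = random.Random(seed)  # seeded so a run can be reproduced
    for _ in range(count):
        bit_index = rng.randrange(len(data) * 8)
        yield bit_index, flip_bit(data, bit_index)
```

Each corrupted copy would then be written to disk and handed to the decoder under test, while the untouched encoding serves as the reference.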

The test images were encoded and decoded using libavif+libaom versions 0.9.1 and 3.1.2 (for AVIF), libjxl version 0.6.1 (for JXL), and libjpeg-turbo version 2.1.1 (for the JPEG reference). The first two libraries are the reference implementations of their respective formats. libjpeg-turbo is the most commonly used JPEG library. Additionally, AVIF decoding was tested using the dav1d version 0.9.1 decoder library. This isn’t to give the format an unfair advantage; there simply isn’t a second implementation of JXL yet.

The resulting bit-flipped images were then compared to the pre-bit-flipped encodings. It’s difficult to accurately and objectively quantify information loss in images. Luckily, we have a metric called structural dissimilarity (DSSIM) that yields a comparable score for a pair of images. The images were scored using Kornel’s dssim tool version 3.1.0.

The DSSIM score isn’t a percentage difference, but an estimation of how much difference human perception would observe from looking at two images. To give you an idea of what the DSSIM scores mean, here are my interpretations of some DSSIM scores from 0.0 to 1.0:

0.00000000, Mathematically lossless
0.00010000, Visually lossless
0.00085000, High quality lossy encoding
0.00100000, Some noticeable corruption artifacts appear
0.01000000, Lots of lossy corruption artifacts
0.10000000, Severe lossy corruption artifacts
0.35000000, The image is recognizable if you’ve seen the original and squint
0.60000000, Unrecognizable/new abstract art
Lastly, “fail” means the image could not be decoded at all.
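
If you want to bucket scores the same way, the list above can be turned into a small lookup. This is only a sketch: the list gives reference points rather than strict ranges, so treating each reference score as the upper edge of its category is my assumption here, as are the names `describe_dssim`, `REFERENCE_SCORES`, and `LABELS`. A score of `None` stands in for an image that failed to decode.

```python
import bisect
from typing import Optional

# Reference scores and labels copied from the list above.
REFERENCE_SCORES = [0.0, 0.0001, 0.00085, 0.001, 0.01, 0.1, 0.35, 0.6]
LABELS = [
    "Mathematically lossless",
    "Visually lossless",
    "High quality lossy encoding",
    "Some noticeable corruption artifacts appear",
    "Lots of lossy corruption artifacts",
    "Severe lossy corruption artifacts",
    "Recognizable if you've seen the original and squint",
    "Unrecognizable/new abstract art",
]


def describe_dssim(score: Optional[float]) -> str:
    """Map a DSSIM score to the nearest category at or above it."""
    if score is None:
        return "fail"  # the image could not be decoded at all
    if score < 0:
        raise ValueError("DSSIM scores cannot be negative")
    # Assumption: each reference score is the upper edge of its category.
    idx = bisect.bisect_left(REFERENCE_SCORES, score)
    return LABELS[min(idx, len(LABELS) - 1)]  # anything above 0.6 stays in the last bucket
```

With this reading, only a score of exactly 0 counts as mathematically lossless, and everything up to 0.0001 counts as visually lossless.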

An image where the corruption appears late in the image file may score like a high-quality image encoding even though the bottom two rows of pixels are completely corrupted. To be fair, humans looking at a large image might not notice it. Images that are corrupted earlier in the file would receive an abstract art-esque score.

It’s time for the results in the form of a log10–log10 graph. Each line on both axes represents a 10× increase, two lines represent a 100× increase, and so on. The differences in the graph are bigger than they might first appear.

Both AVIF (libaom) and JXL performed terribly. A full 98,92 % of AVIF (libaom) and 98,67 % of JXL images completely failed to decode. Things improved drastically with AVIF (dav1d), which only failed to decode 40,91 % of the images. JPEG is the winner with a decoding failure rate of only 34,42 %.

JPEG has an evenly distributed DSSIM score varying from low to high corruption. Almost 30 % of the JPEG images fall within the visually lossless category, with AVIF (dav1d) coming in at a distant second place with only 1,12 % of the images being visually lossless.

AVIF (dav1d) sees a nearly straight diagonal decline in its quality. The hero image at the top of the article helps explain this curve. How much of the image is lost depends almost entirely on where in the image file the corruption occurred.

I believe both libaom and libjxl can be changed to output results similar to dav1d. These decoders seem to abort the decoding job rather than output a partially corrupted image. As I said initially, though, half a photo of your grandma is better than no photo at all. Human faces, in particular, tend to be in the upper half of image compositions. So, with a decoder that outputs partial results, there’s a good chance you won’t completely lose your photos.

With this in mind, I can’t conclusively say there’s a clear winner of this image format comparison. Both the AVIF and JXL projects are working on progressive image encoding and decoding. Progressive decoding might help improve the bitrot corruption resistance in both formats. (Weirdly, one project has only implemented encoding, and the other has only implemented decoding. So, neither has a test-ready implementation yet!)

I didn’t include the HEIC image format in this evaluation. As a proprietary image format, it has no future on the web. The format offers no clear advantage over its two open-source rivals.

So, should you worry about bitrot? Yes and no. There’s no doubt it happens, but there’s no large-scale empirical data on how frequently it happens on modern hardware. There’s no consensus in the industry, but enterprises large and small go to great expense to prevent it from affecting their business and customers. A single flipped bit can literally turn a smile emoji into a frown.

There are a few things you can do to protect your data. Have more than one backup! Take incremental backups! Verify your backups! Buy error-correcting (ECC) RAM for your computers and servers! Migrate to Linux or FreeBSD so you can use a checksumming file system!
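
Verifying a backup can be as simple as recording a checksum per file and re-checking it later. The following is a minimal sketch of that idea, not a replacement for a real backup tool. The function names are made up for this example, and it assumes the manifest itself is stored somewhere safe and that Python 3.9 or later is available.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB blocks so large photos don't need to fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map every file below root (by relative path) to its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return files that are missing or whose content no longer matches."""
    bad = []
    for relative, expected in manifest.items():
        path = root / relative
        if not path.is_file() or sha256_of(path) != expected:
            bad.append(relative)
    return bad
```

A mismatch only tells you that a file changed, not which copy is the good one, which is why having more than one backup matters.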

Legacy bitmap image formats like BMP have high resistance against bitrot. Without any compression, you get huge files but shouldn’t lose the information in more than one pixel per flipped bit.
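
The one-pixel claim is easy to demonstrate on raw, uncompressed pixel data. The sketch below uses a bare 24-bit RGB buffer rather than a real BMP file, which is a simplification on my part: an actual BMP also has a header, and a flipped bit in a header field such as the width could affect far more than one pixel. The helper name `differing_pixels` is hypothetical.

```python
def differing_pixels(a: bytes, b: bytes, bytes_per_pixel: int = 3) -> int:
    """Count pixels whose bytes differ between two equally sized raw buffers."""
    if len(a) != len(b):
        raise ValueError("buffers must have the same length")
    return sum(
        a[i:i + bytes_per_pixel] != b[i:i + bytes_per_pixel]
        for i in range(0, len(a), bytes_per_pixel)
    )


width, height = 4, 4
pixels = bytes([200, 120, 40] * (width * height))  # uniform 24-bit RGB, no header

damaged = bytearray(pixels)
damaged[17] ^= 0b0001_0000  # flip one bit in one color channel of one pixel

print(differing_pixels(pixels, bytes(damaged)))  # prints 1
```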

The PNG format has compression, but as a lossless format, it still produces large image files. PNGs come with internal checksumming. This feature makes it very easy to verify and detect bitrot without the need for a checksumming file system or other external checksums.
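
As a sketch of how that check works: a PNG file is an 8-byte signature followed by chunks, and each chunk stores a CRC-32 computed over its type and data. The function below walks the chunks and reports any whose stored CRC no longer matches. The function name is made up for this example, and it only detects damage; it doesn't attempt to decode or repair the image.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def corrupt_png_chunks(data: bytes) -> list[str]:
    """Return the types of chunks whose stored CRC-32 does not match."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file (or the signature itself is damaged)")
    bad = []
    offset = 8
    while offset + 12 <= len(data):
        # Chunk layout: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[offset:offset + 8])
        body_end = offset + 8 + length
        if body_end + 4 > len(data):
            bad.append("truncated")  # a damaged length field can also land here
            break
        (stored_crc,) = struct.unpack(">I", data[body_end:body_end + 4])
        # The CRC covers the chunk type and chunk data, but not the length.
        if zlib.crc32(data[offset + 4:body_end]) != stored_crc:
            bad.append(chunk_type.decode("latin-1"))
        offset = body_end + 4
    return bad
```

An empty list means every chunk passed its check. Since the length field isn't covered by the CRC, a flipped bit there shows up indirectly, as a truncated file or as mismatches in whatever bytes get misread as the following chunks.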