Don’t rely on mod_negotiation to serve pre-compressed resources

The Apache HTTP Server (httpd) can handle server-driven negotiation for requests for static files. Using mod_negotiation, it can make an informed selection from several file variants based on special file extension patterns (such as .gz). However, this module is unsuited to handling content negotiation for pre-compressed resources.

Quick primer on HTTP content negotiation: HTTP clients can send request headers to indicate their language, data format, and data encoding (compression) preferences. The server can use these headers as hints and return varying results that meet the client’s stated preferences. This process is known as content negotiation. It is sometimes incorrectly referred to as conditional requests (a term that actually refers to cache revalidation).

mod_negotiation implements the experimental (and abandoned) RFC 2295 Transparent Content Negotiation in HTTP specification on top of the more widely deployed RFC 2068: Hypertext Transfer Protocol 1.1 specification. This introduces some unique problems, as RFC 2068 only said that server-driven content negotiation should select the “best” variant supported by the client, whereas RFC 2295 tried to be more specific than that but ultimately failed at providing a specification. mod_negotiation has left this unresolved for almost two decades.

So what is the best variant in terms of compressed resources? That is quite easy to answer: the one with the smallest possible file size that doesn’t add an undue decompression burden on the client. Today’s mobile networks are far slower than the processing speeds of even minimal phones and clients. Many people also pay for the amount of data they consume or must stay beneath a monthly data cap. We can therefore assume that smaller file sizes are preferred.

And given a number of pre-compressed variants of a resource (say Brotli, Zstandard, and gzip), how does mod_negotiation select the best encoding variant? It doesn’t. It only checks that the client announces support for the encoding and otherwise doesn’t use the encoding when calculating the best variant. Given three otherwise identical variants, you’d end up with the first variant to be evaluated by mod_negotiation (whichever that may be) every time, regardless of the file size.

You can create what httpd calls a type-map for each resource and manually assign scores to each variant depending on its file size. These type-maps require extra work to pre-calculate and produce, and you have to update them along with the resources over time.

The httpd documentation suggests using mod_rewrite instead, a suggestion echoed by countless blogs and forums. This approach checks for the string “gzip” in the Accept-Encoding header, then checks whether a pre-compressed variant of the requested URL exists on the file system, and finally returns either the uncompressed or the pre-compressed variant accordingly. This way you can also make some assumptions, like assuming Brotli will always produce smaller files than gzip (which may not be the case depending on your files).

You should not use this approach! All the HTTP request headers used for content negotiation can optionally include a quality score to rank variants from most preferred to least preferred. For example, you could say you prefer Zstandard-encoded responses, will accept Brotli, but absolutely not gzip, using the following header:

Accept-Encoding: gzip;q=0, zstd, br;q=0.5

The order of the accepted encodings is meaningless but the quality (q) score is not. Simply checking for the substring gzip would return a gzip-compressed response even when the client specifically says it doesn’t support it. You may think of this as a silly example, but the example header below can sometimes be observed from non-gzip-capable bots and libraries:

Accept-Encoding: gzip;q=0, identity
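
To make the difference concrete, here is a minimal Python sketch that compares the naive substring check with a parser that honours q-values. The function names parse_accept_encoding and accepts are made up for this illustration and are not part of httpd or any particular library. The sketch also assumes a malformed q-value should be treated as “not acceptable”.

```python
def parse_accept_encoding(header: str) -> dict[str, float]:
    """Map each listed content-coding (lowercased) to its q-value (default 1.0)."""
    prefs: dict[str, float] = {}
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        coding, *params = item.split(";")
        q = 1.0
        for param in params:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0  # assumption: a malformed weight means "not acceptable"
        prefs[coding.strip().lower()] = q
    return prefs


def accepts(header: str, coding: str) -> bool:
    """True if the client accepts `coding`, honouring q=0 and the * wildcard."""
    prefs = parse_accept_encoding(header)
    coding = coding.lower()
    if coding in prefs:
        return prefs[coding] > 0
    if "*" in prefs:
        return prefs["*"] > 0
    # An unlisted identity coding stays acceptable unless explicitly excluded.
    return coding == "identity"


header = "gzip;q=0, identity"
naive = "gzip" in header            # the substring check is fooled by q=0
correct = accepts(header, "gzip")   # the parsed check is not
```

Here naive ends up True while correct ends up False, which is exactly the case where the substring approach sends a gzip response to a client that has refused it.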

mod_negotiation has a few other restrictions and limitations, such as bug #60619, which will annoy you to no end if you’re trying to use it for serving pre-compressed resources. Server-driven content negotiation in itself is a very useful tool, but unfortunately mod_negotiation is not the implementation you’ll want to use for anything but negotiating languages and media types.

Unfortunately, few other HTTP servers have implemented content negotiation in any meaningful way. The specification has been around for two decades, is easy to implement, and can be really useful. If you want to implement it yourself, see RFC 7231 section 5.3 and do mind the Vary header.

It’s easy to implement on the application layer and you’ll find a decent number of libraries supporting it natively.
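
As a rough sketch of what an application-layer implementation could look like, the Python below picks a variant from a set of pre-compressed files. The names quality, choose_variant, and response_headers, the sizes dictionary, and the 0.001 fallback weight for an unlisted identity coding are all assumptions made for illustration. Ranking by q-value first and file size second is one possible policy, not something the RFC mandates, and a missing Accept-Encoding header is treated here like an empty one.

```python
from typing import Optional

IDENTITY_FALLBACK_Q = 0.001  # assumption: unlisted identity is acceptable, but least preferred


def quality(header: str, coding: str) -> float:
    """q-value the client assigns to `coding`; 0.0 means not acceptable."""
    prefs: dict[str, float] = {}
    for item in header.split(","):
        name, _, params = item.strip().partition(";")
        if not name:
            continue
        q = 1.0
        for param in params.split(";"):
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: malformed weight means "not acceptable"
        prefs[name.strip().lower()] = q
    coding = coding.lower()
    if coding in prefs:
        return prefs[coding]
    if "*" in prefs:
        return prefs["*"]
    return IDENTITY_FALLBACK_Q if coding == "identity" else 0.0


def choose_variant(header: str, sizes: dict[str, int]) -> Optional[str]:
    """Pick the acceptable coding with the highest q, breaking ties by smallest size.

    `sizes` maps a coding to the byte size of that variant on disk.
    Returns None when no variant is acceptable (a 406 response would fit).
    """
    ranked = [
        (-quality(header, coding), size, coding)
        for coding, size in sizes.items()
        if quality(header, coding) > 0
    ]
    return min(ranked)[2] if ranked else None


def response_headers(coding: str) -> dict[str, str]:
    """Headers to attach; Vary tells caches the response depends on Accept-Encoding."""
    headers = {"Vary": "Accept-Encoding"}
    if coding != "identity":
        headers["Content-Encoding"] = coding
    return headers


# Hypothetical variant sizes in bytes for one resource.
sizes = {"identity": 48000, "gzip": 13000, "br": 11000, "zstd": 12000}
browser_choice = choose_variant("gzip, br, zstd", sizes)
picky_choice = choose_variant("gzip;q=0, zstd, br;q=0.5", sizes)
bot_choice = choose_variant("gzip;q=0, identity", sizes)
```

With these made-up sizes, a client that weights everything equally gets the smallest file (br), the client that prefers Zstandard gets zstd even though br is smaller, and the non-gzip bot gets the uncompressed file.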