Make changes to static content with response body substitutions

Ctrl blog’s syndication news feed is called upon to handle a lot of different tasks and service integrations. It’s used to share new articles on Twitter, create the weekly newsletter, and handle integrations with syndication services like Apple News and Flipboard. Individual readers also subscribe to it in a myriad of different feed readers.

For some of these integrations, I require minor customizations to the feed to fit the specific technical requirements of each platform. A top issue was the many conflicting requirements for what image sizes to use and how to embed them in the feed.

I’ve also added tracking parameters to the feed to differentiate between readers coming to the site from some primary external sources like the email newsletter, Twitter account, feed readers, and so on. This is the same task my old WordPress Feed to Google Analytics integration plugin handled. (I don’t track individual subscribers or clients, only what channel is used to access the site.)

It was no problem handling this task with a dynamic site generator like WordPress. It could give slightly different responses to different clients based on their User-Agent or a URL query parameter. However, this posed a problem when I migrated off WordPress and began using a static website generator.

The question became: how could I introduce some limited dynamic responses when all I had were statically generated files?

The answer: Dynamic responses from static content with request-specific substitutions performed by the web server on the edge. In other words, Apache HTTPD’s mod_substitute module.

The Apache HTTPD configuration example below shows how the string “#src=feed” is rewritten inside the response body depending on the URL query string: it’s removed when the query contains src=none, and replaced with “#src=email” or “#src=apple-news” when the query contains src=email or src=apple-news. These changes are only applied to locations that end with .atom and are served with the media type application/atom+xml.

# Only consider requests for paths ending in .atom
<LocationMatch "\.atom$">
  # Substitute flags: n = treat the pattern as a fixed string,
  # q = quick mode (don't flatten buckets after each substitution)
  <If "%{QUERY_STRING} =~ /src=none/">
    AddOutputFilterByType SUBSTITUTE application/atom+xml
    Substitute "s/#src=feed//nq"
  </If>
  <ElseIf "%{QUERY_STRING} =~ /src=email/">
    AddOutputFilterByType SUBSTITUTE application/atom+xml
    Substitute "s/#src=feed/#src=email/nq"
  </ElseIf>
  <ElseIf "%{QUERY_STRING} =~ /src=apple-news/">
    AddOutputFilterByType SUBSTITUTE application/atom+xml
    Substitute "s/#src=feed/#src=apple-news/nq"
  </ElseIf>
</LocationMatch>

You may also need to enable the mod_substitute module in your configuration.
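
As a minimal sketch, loading the module usually takes a LoadModule line in the main server configuration. The module paths below are assumptions and vary between distributions and builds. The second line is only needed if mod_filter, which provides AddOutputFilterByType in HTTPD 2.4, isn’t already loaded.

# Assumed module paths; adjust them to match your installation.
LoadModule substitute_module modules/mod_substitute.so
LoadModule filter_module modules/mod_filter.so

On Debian-based systems, running the a2enmod substitute command should achieve the same thing.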

The above configuration example is repetitive, but this is necessary because the substitute module can’t access environment variables. This limitation will be removed in the upcoming Apache HTTPD 2.5.1 release.

You may need to change HTTPD’s filter chain if you use custom filters, or if your web server is acting as a reverse proxy server. The most likely problem you’ll run into as a proxy server is that the backend’s response is already compressed. In that case, you’ll need to decompress the response body with mod_deflate’s INFLATE filter before applying substitute, and then recompress it with the DEFLATE filter.
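
A rough sketch of such a filter chain for a reverse proxy might look like the following. The /feed/ path and the backend.example address are hypothetical placeholders, and the example assumes mod_proxy, mod_deflate, and mod_substitute are all loaded.

# Hypothetical path and backend address, for illustration only.
<Location "/feed/">
  ProxyPass "http://backend.example/feed/"
  # Decompress the backend response, substitute, then compress again.
  SetOutputFilter INFLATE;SUBSTITUTE;DEFLATE
  Substitute "s/#src=feed/#src=email/nq"
</Location>

An alternative that may be simpler in some setups is to stop the backend from compressing at all by removing the Accept-Encoding header from the proxied request.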

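You can achieve the same thing in Nginx with the sub_filter directive (exact string matches, from ngx_http_sub_module) or the subs_filter directive (regular expressions, from the third-party ngx_http_substitutions_filter_module). Neither of these is built and included with Nginx by default, however.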