Every web developer should already be familiar with the ‘301 Moved Permanently’ redirect and maybe even the newer ‘308 Permanent Redirect’ status responses in HTTP. Did you know that you shouldn’t just follow the redirect to the new location, but should also update your own stored reference to point there?
In my article Best practices for caching of syndication feeds, I mention that feed clients shouldn’t keep pulling the same old URL over and over again when encountering a permanent redirect. According to the HTTP specification, they’re supposed to direct future references to permanently redirected resources to the new location directly without first going through the redirect.
This not only speeds up future feed updates, but also helps ensure that the feed keeps working if an old domain, website, or just the redirect goes away. It applies not only to syndication feeds, but to all documents and resources on the web. A permanent redirect is a signal of a permanent change, and your content management system (CMS) or program should respond to that signal accordingly.
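As a rough illustration of what responding to that signal could look like in a client, here is a minimal Python sketch. The names `resolve_permanent`, `http_probe`, and `max_hops` are my own inventions for this example and not part of any specification or library. The idea is to follow only 301 and 308 responses, treat everything else (including temporary 302 and 307 redirects) as a reason to stop, and return the address that should be stored from now on.

```python
import http.client
from urllib.parse import urljoin, urlsplit

PERMANENT = {301, 308}  # the only statuses that justify rewriting a stored URL


def http_probe(url):
    """Send one HEAD request without following redirects.

    Returns (status, Location header or None). Illustrative helper only.
    """
    parts = urlsplit(url)
    secure = parts.scheme == "https"
    conn_cls = http.client.HTTPSConnection if secure else http.client.HTTPConnection
    conn = conn_cls(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def resolve_permanent(url, probe=http_probe, max_hops=5):
    """Return the URL to store after following permanent redirects only.

    Falls back to the original URL on a redirect loop or too many hops.
    """
    original, seen = url, {url}
    for _ in range(max_hops):
        status, location = probe(url)
        if status not in PERMANENT or not location:
            return url  # final answer, or a temporary redirect we must not store
        url = urljoin(url, location)  # Location may be a relative reference
        if url in seen:
            return original  # loop detected: keep what we had
        seen.add(url)
    return original  # chain too long to trust
```

A feed reader or CMS could call `resolve_permanent(stored_url)` during a scheduled check and save the result when it differs from the stored address. The `probe` parameter exists so the network call can be swapped out for testing.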
Developers often forget that permanent is supposed to be permanent, and judging from the feedback I’ve received on my best practices article, many web developers are also unaware of this part of the HTTP specification. So here is the relevant section from RFC 7538 (the exact same language is applied to 301 Moved Permanently in RFC 7231):
“The 308 (Permanent Redirect) status code indicates that the target resource has been assigned a new permanent URI and any future references to this resource ought to use one of the enclosed URIs. Clients with link editing capabilities ought to automatically re-link references to the effective request URI [...] to one or more of the new references sent by the server, where possible.”
Developers of content management platforms, syndication engines, etc. are clearly either unaware of the permanent nature of redirects or don’t care about it. I’m still processing thousands of syndication feed requests for addresses that haven’t been in use, and that have been signaling a permanent redirect, for over four years.
More redirects have been created in the last year than ever before on the web because of the mass transition from HTTP to HTTPS. Yet, an enormous number of websites will continue pointing to the less secure HTTP variant and require clients to make an extra round trip to the old address before they reach the new and updated location.
The Broken Link Checker plugin for WordPress can help WordPress websites stay on top of redirects by regularly querying and detecting broken and redirected links. It won’t process redirected links automatically, but it will give you a list you can check once a month to make sure all your internal and external links are as direct and fast as possible. There are solutions for other publishing platforms out there, and it’s not really that hard to create an automated link-updating system.
Permanent redirect attack
The HTTP specification doesn’t address the possibility that someone could temporarily lose control over their web server and thus allow a third party to hijack web traffic by permanently redirecting it to another server. Ouch.
This can be solved by monitoring the redirect for some days before acting on it. A client could track a redirect for a week or a month, and then only update the reference if the redirect is still in place after some time has passed. This also guards against temporary configuration mistakes, aggressive captive portals, and other problem spots.
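The waiting period described above can be sketched in a few lines of Python. Everything here is hypothetical and only meant to show the bookkeeping: the `LinkRecord` fields, the `observe` function, and the 30-day `GRACE` value are assumptions of mine, not recommendations from the specification. The stored address is rewritten only when the same permanent redirect target has been seen continuously for the whole grace period. Any check that shows no redirect, or a different target, restarts the clock.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

GRACE = timedelta(days=30)  # assumed waiting period; pick what suits you


@dataclass
class LinkRecord:
    url: str                                  # the address currently stored
    pending_target: Optional[str] = None      # redirect target under observation
    pending_since: Optional[datetime] = None  # when we first saw that target


def observe(record, target, now, grace=GRACE):
    """Register the result of one link check.

    `target` is the permanent redirect destination seen on this check, or
    None if the link did not redirect permanently. Returns True when the
    stored URL was updated.
    """
    if target is None or target == record.url:
        # Redirect is gone (or never existed): forget any pending change.
        record.pending_target = record.pending_since = None
        return False
    if target != record.pending_target:
        # New or changed destination: start the waiting period over.
        record.pending_target, record.pending_since = target, now
        return False
    if now - record.pending_since >= grace:
        # Redirect has been stable long enough to act on it.
        record.url = target
        record.pending_target = record.pending_since = None
        return True
    return False
```

This only proves that a redirect was stable across the checks you happened to run, so frequent checks give a stronger signal than a single check at either end of the period.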
Nothing can beat human verification of a redirect to make sure that the new destination is the same as the old one. Though, you can actually automate this part to a great extent by storing information about the destination such as title, date of publication, and excerpt in a link reference database. This goes against the destination-unaware design of the web, but that doesn’t make the user any happier about broken links. There are a lot of tools available to automate extraction of link destination metadata. By comparing some locally stored information about link targets, you can detect dead websites that have turned into link-farms or have changed so significantly that you’d either want to link to an older archived version of the same page in the Internet Archive, or just remove the link altogether.
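To make the comparison idea a little more concrete, here is a small Python sketch using the standard library’s `difflib`. The dictionary keys (`title`, `excerpt`), the function names, and the 0.6 threshold are all assumptions chosen for illustration, and any real setup would need tuning against actual pages. It scores how closely freshly fetched metadata matches what was stored when the link was created.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def looks_like_same_page(stored, fetched, threshold=0.6):
    """Compare stored link metadata against what the destination serves now.

    Both arguments are dicts that may contain 'title' and 'excerpt'.
    Returns False when there is nothing stored to compare against, so
    such links fall through to human review.
    """
    scores = [
        similarity(stored[key], fetched.get(key) or "")
        for key in ("title", "excerpt")
        if stored.get(key)
    ]
    if not scores:
        return False
    return sum(scores) / len(scores) >= threshold
```

A link that fails this check is a candidate for manual review, for replacement with an archived copy, or for removal. A page that passes it has only been shown to carry similar text, which is weaker than a human confirming it is the same resource.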
So why bother with any of this? It helps improve and maintain the quality and reputation of your website by making sure external links are fast, direct, and continue to work. Link rot is a huge problem on the web; a 2013 study found that “49 percent of the hyperlinks in Supreme Court decisions no longer work,” according to the New York Times. Links go away all the time, but you can help maintain the links on your own website by updating redirects and following the resource to its new location instead of waiting for old links to just stop working.
I spend maybe three minutes per month on updating redirects and rooting out dead links. The combination of my own Post Archival in the Internet Archive plugin and the Broken Link Checker plugin for WordPress means that most links will be available in the Internet Archive. My plugin automatically submits external links (and the links to my posts) to the Internet Archive, and the Broken Link Checker detects broken links and suggests updating redirects to their new destination or replacing dead links with archived versions. It’s a simple setup that helps ensure that the references and sources I use keep working in the future.
You probably can’t fix the web’s link rot problem on your own, but you can contribute to keeping links working in your own tiny corner of the web with just a small amount of effort. Detect and act on the permanent redirect signal, and you’ll have happier visitors who can still get to the external resources you’ve linked to in the past.