How ‘Do Not Track’ affects your visit to Ctrl blog

I’ve written up a new privacy policy for this site as required by local laws. There is nothing too original or interesting in it, but you may be curious about how I’ve implemented support for the controversial “Do Not Track” (DNT) header.

Visits to this website are tracked using Google Analytics, mostly because I like keeping an eye on where my visitors come from (their referring website or app, and country) and I want to know how many are reading each article I write. Today, 70 % of the world’s top websites and 9 % of the entire web use Google Analytics.[1]

Update (): The Privacy Policy has changed since this was published. The DNT header still has a similar effect. Please review the new version for current information applicable to this website.

People, myself included, are not always comfortable with the all-watching eye of Google looking over their shoulders all the time. Given the type of content I publish, I don’t believe anyone would really mind if Google knows about it. 76 % of total visitors (94 % of searchers) actually come to the site via Google Search, so people can’t be too worried about Google knowing about their visit to this website. I’ve previously used the self-hosted Piwik platform, but it affected performance badly and frankly didn’t do a good job of either tracking or letting me analyze the recorded data in aggregate.

The Do Not Track signal is a standard HTTP header available in all web browsers. Enabling it means visitors don’t want to be tracked all over the web. Or something. It’s not really defined what it means or what visitors expect the setting to do. At any rate, most websites completely disregard it and neither advertisers nor trackers seem to know what to do about it. Since Google Analytics doesn’t do anything based on the header on its own, I thought I’d at least modify how Google Analytics behaves. A normal Google Analytics cookie is stored on your device for two years by default. You’ll find Google Analytics cookies in your browser set for sites you haven’t visited in 20 months!

When the DNT header is detected, I change the lifetime of the Google Analytics cookie from a persistent two-year cookie to a session cookie. Session cookies are normally deleted when you restart your browser and as such can’t be used to track individual devices over time. The modified JavaScript used to achieve this is shown below:

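// Read the browser's DNT preference; fall back to 'none' when unsupported.
var dnt = 'doNotTrack' in window.navigator ?
   window.navigator.doNotTrack : 'none';
// With DNT enabled, cookieExpires: 0 turns the cookie into a session cookie.
ga('create', 'UA-XXXXXXXX-X',
   (dnt === '1') ? {'cookieExpires': 0} : 'auto');

// The above replaces the below call to Google Analytics.
// ga('create', 'UA-XXXXXXXX-X', 'auto');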
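
Browsers have not always agreed on where the DNT preference is exposed or what value it holds. The following is a minimal sketch of a more defensive check, not the code running on this site. It assumes that some older browsers report 'yes' instead of '1', and that some expose the value as window.doNotTrack or navigator.msDoNotTrack. The helper names isDntEnabled and gaCreateFields are made up for illustration.

```javascript
// Hypothetical helper: true when any known DNT property signals opt-out.
// `win` defaults to the global object so the function can be tested with stubs.
function isDntEnabled(win) {
  win = win || (typeof window !== 'undefined' ? window : {});
  var nav = win.navigator || {};
  var candidates = [nav.doNotTrack, win.doNotTrack, nav.msDoNotTrack];
  for (var i = 0; i < candidates.length; i++) {
    var value = String(candidates[i]);
    if (value === '1' || value === 'yes') return true;
  }
  return false;
}

// Hypothetical helper: builds the fields object for ga('create', ...).
function gaCreateFields(dntEnabled) {
  var fields = {cookieDomain: 'auto'};
  if (dntEnabled) fields.cookieExpires = 0; // 0 seconds = session cookie
  return fields;
}

// Usage (assumes the standard analytics.js snippet has defined ga):
// ga('create', 'UA-XXXXXXXX-X', gaCreateFields(isDntEnabled()));
```

Keeping cookieDomain set to 'auto' in both branches is a choice made in this sketch; the snippet above passes only the cookieExpires field when DNT is on.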

I’ve also opted to always enable the “anonymizeIp” option in Google Analytics. I’ve no idea how this option is supposed to work, as the full IP address is always transmitted to Google as a part of fetching the script that then tells Google to only store your partial IP.
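
For reference, anonymizeIp is a field that can be set on the analytics.js tracker before any hits are sent. A minimal sketch, assuming the standard command queue named ga; the stub on the first line only stands in for the real snippet so the example runs on its own:

```javascript
// Stand-in for the analytics.js command queue (assumption: the real
// snippet normally defines `ga`; this stub just records the commands).
var ga = ga || function () { (ga.q = ga.q || []).push(arguments); };

ga('create', 'UA-XXXXXXXX-X', 'auto');
ga('set', 'anonymizeIp', true);   // ask Google to truncate the stored IP address
ga('send', 'pageview');           // hits sent after 'set' carry the flag
```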

You also get tracking-free URLs in the Atom syndication feeds if your feed reader requests the feeds with the DNT header set. Normally, a feed reader would get the feeds with URLs modified to include an extra bit of information saying the link was opened from a feed reader. This extra information is stripped off the link when requested with a DNT-transmitting feed reader. As far as I’ve been able to figure out, no feed reader offers the DNT header as an option, nor does any other feed server vary its responses based on whether the DNT header is present.[2]
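
The server-side logic could look roughly like the sketch below. It is not the code running on this site: the query parameter name `src`, the function name feedLink, and the example URL are hypothetical, and the request headers are assumed to arrive as a plain object with lowercase keys, as Node.js provides them.

```javascript
// Hypothetical tracking parameter appended to feed links.
var FEED_PARAM = 'src';

// Returns the link to embed in the feed plus the response headers to send.
function feedLink(articleUrl, requestHeaders) {
  var url = new URL(articleUrl);
  if (requestHeaders['dnt'] === '1') {
    url.searchParams.delete(FEED_PARAM);      // DNT readers get the clean URL
  } else {
    url.searchParams.set(FEED_PARAM, 'feed'); // others get the feed marker
  }
  // The response differs by request header, so caches must be told.
  return {href: url.toString(), headers: {'Vary': 'DNT'}};
}

// Usage:
// feedLink('https://example.com/article.html', {dnt: '1'}).href
//   → 'https://example.com/article.html'
// feedLink('https://example.com/article.html', {}).href
//   → 'https://example.com/article.html?src=feed'
```

Sending `Vary: DNT` matters because the same feed URL now produces two different documents; without it, a shared cache could hand the tracked version to a reader that asked not to be tracked.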

If you’re thinking “Hold on, why doesn’t he just turn off the third-party tracking altogether when Do Not Track is detected?” I’ve got a fairly good answer for you: This is my website. I write the content and I make the rules. As mentioned above, I’ve tried doing the right thing and using a self-hosted first-party tracking solution called Piwik. The problem is that it just isn’t any good. It slowed down the site enormously and it wasn’t even good at tracking. If another alternative pops up, I’ll be sure to check it out. In the meantime, I’m kind of stuck with using what is available. Server access logs can’t be used anymore because every web browser preloads or even pre-renders one or more pages in addition to the one you’ve actually visited. From the server logs, there is no way to tell what is a real page load and what is just the browser optimistically loading the next page regardless of whether the user will ever see the page or not. It takes a fair bit of JavaScript to determine if a page load is real or just pre-rendering and it would take me years to write a competent tracking solution to rival Google Analytics or even Piwik.
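
As a rough illustration of that last point, the Page Visibility API lets a script hold back its tracking call until the page is actually shown. The sketch below is an outline under stated assumptions rather than a full solution: whenVisible is a made-up helper name, and it assumes the browser keeps a preloaded page’s document.visibilityState at something other than 'visible' until the user sees it.

```javascript
// Hypothetical helper: runs `callback` once, as soon as the document is visible.
// `doc` defaults to the global document; pass a stub for testing.
function whenVisible(callback, doc) {
  doc = doc || document;
  if (doc.visibilityState === 'visible') {
    callback();
    return;
  }
  function onChange() {
    if (doc.visibilityState === 'visible') {
      doc.removeEventListener('visibilitychange', onChange);
      callback();
    }
  }
  doc.addEventListener('visibilitychange', onChange);
}

// Usage (assumes analytics.js is loaded):
// whenVisible(function () { ga('send', 'pageview'); });
```

This only covers the simplest case. Telling real visits from speculative loads reliably across browsers is the part that takes the “fair bit of JavaScript” mentioned above.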

Oh, and I track the usage of the DNT header itself through Google Analytics using the below code. Mostly because I find doing so just too damned ironic.

if (dnt) ga('send', 'event', 'DNT', dnt);

  • [1] Google Analytics usage numbers from BuiltWith.
  • [2] Based on querying the HTTP Archive for any occurrences of “DNT” in the “Vary” headers. For a poorly defined privacy feature that spun up a big controversy, only three really weird websites ever seem to actually use it. Now four, if you count mine.