P2P apps’ connection amnesia makes them less fault-tolerant

Peer-to-peer (P2P) applications discover peer devices using either a centralized tracking server (e.g. Syncthing, Dat, BitTorrent) or a “serverless” Distributed Hash Table (DHT). Serverless solutions like DHT are never truly serverless, though. I’ve previously discussed how DHT clients are overly reliant on centralized bootstrapping/introduction servers and how that creates a single point of failure.

In that earlier article, I argued that DHT clients should cache their DHT peers between sessions. Those peers aren’t guaranteed to be online the next time your client tries to join the network. However, P2P clients typically make hundreds of connections to the DHT swarm, so at least some of the cached peers are likely to still be reachable.

This method would reduce clients’ reliance on bootstrapping servers and help maintain the network in the event of an intentional or unintentional service outage. As a bonus, it could speed up the process of rejoining the DHT swarm.
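As a rough illustration, here is a minimal Python sketch of caching DHT peers between sessions. Everything in it is an assumption made for the example rather than any real client's design: the JSON file format, the `host`/`port`/`last_seen` field names, the 200-peer cap, and the placeholder bootstrap hostname.

```python
import json
from pathlib import Path

# Hypothetical bootstrap node; a real client ships its own list.
BOOTSTRAP_NODES = [("bootstrap.example.org", 6881)]
# Assumed cap: large enough that some cached peers are likely still online.
MAX_CACHED_PEERS = 200


def save_peers(path, peers):
    """Persist the most recently seen peers to disk as JSON on shutdown."""
    newest = sorted(peers, key=lambda p: p["last_seen"], reverse=True)
    Path(path).write_text(json.dumps(newest[:MAX_CACHED_PEERS]))


def load_join_candidates(path):
    """Return cached peers first, with bootstrap nodes as the last resort."""
    try:
        cached = json.loads(Path(path).read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        cached = []  # first run or corrupt cache: rely on bootstrapping
    candidates = [(p["host"], p["port"]) for p in cached]
    return candidates + BOOTSTRAP_NODES
```

The client would walk the returned list in order when rejoining the swarm, so the bootstrapping servers are only contacted when none of the cached peers respond.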

P2P clients that frequently communicate with the same peers in a topic swarm (e.g. social networks, chat, and file syncing apps) are also needlessly dependent on peer discovery systems. For example, your laptop shouldn’t need to rediscover how to connect to your home PC every time it wants to sync folders with it over Syncthing. P2P apps can simply cache the discovered connection information for the peers they frequently communicate with.

Don’t get me wrong: peer discovery systems and their bootstrapping servers are needed to facilitate the initial introductions and connections across the internet. However, there’s no need for P2P clients to rediscover the same known peers every time you start the app.

The cached connection information can go stale over time as devices roam between networks. Most operating systems also rotate their IPv6 addresses daily for privacy reasons. This doesn’t have to be a problem: either peer can initiate the connection, so it’s enough that one of the two peers knows the other’s current connection details. If neither does, the client can still fall back to regular peer discovery. As a bonus, caching peer connection information can speed up peer reconnects between sessions.
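The reconnect logic could be sketched roughly like this in Python. The `reconnect` function and its `try_connect` and `discover` callables are hypothetical stand-ins for whatever the application uses to open a connection and to query its peer discovery system. They are not taken from any of the apps mentioned here.

```python
def reconnect(peer_id, cache, try_connect, discover):
    """Try cached addresses first; fall back to discovery if all are stale.

    `cache` maps a peer ID to a list of (host, port) tuples. `try_connect`
    and `discover` are hypothetical callables supplied by the application.
    """
    for addr in cache.get(peer_id, []):
        if try_connect(addr):
            return addr, "cache"
    # Every cached address failed (or none existed): ask peer discovery.
    for addr in discover(peer_id):
        if try_connect(addr):
            # Refresh the cache so the next session starts with a working address.
            cache[peer_id] = [addr]
            return addr, "discovery"
    return None, "unreachable"
```

With this ordering, a stale cache entry costs only a failed connection attempt before the client falls back to discovery.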

I’ve looked into this aspect of a dozen P2P apps that regularly communicate with the same peers, including Syncthing, Jami (formerly GNU Ring), Resilio Sync (formerly BitTorrent Sync), Dat, Technitium Mesh, RetroShare, git helpers, and others. Of these, only Resilio Sync and RetroShare cache peer connection information between sessions. I’ve raised a proposal for Syncthing to implement this too.

I find it particularly noteworthy that Resilio Sync has implemented peer connection caching. Resilio Sync was developed under the name BitTorrent Sync by BitTorrent Inc before it was spun out as a separate company. The team has a long history of developing P2P software, and its engineers should know a thing or two about operating peer discovery services and building redundancy into distributed networks.

Out of the above list, Resilio Sync is also the only proprietary client. It has a paid subscription service. The software must work to the greatest extent possible, even during a service outage, or the company will lose paying customers. Caching peer connection details is a simple way to make P2P apps more robust, at least for clients that frequently communicate with the same peers.

Developers of P2P applications like file sync, chat, and gossip networks already know that each peer is likely to communicate with the same peers from session to session. For them, implementing peer connection caching is a no-brainer. However, these use cases also exist in more general-purpose P2P applications like Dat, Hypercore, IPFS, and BitTorrent. Such apps can analyze their connection patterns and choose to cache the peers that the client frequently exchanges data with.
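One simple way to analyze connection patterns is to count the number of sessions in which the client exchanged data with each peer, and to cache only the peers that cross a threshold. The Python sketch below is an assumption-laden illustration: the `PeerUsageTracker` class, its method names, and the three-session threshold are all made up for the example.

```python
from collections import Counter


class PeerUsageTracker:
    """Counts the sessions in which data was exchanged with each peer."""

    def __init__(self, min_sessions=3):
        self.min_sessions = min_sessions  # assumed threshold, tune per app
        self.sessions_seen = Counter()
        self._current = set()

    def record_exchange(self, peer_id):
        """Note that data was exchanged with this peer in the current session."""
        self._current.add(peer_id)

    def end_session(self):
        """Credit each peer at most once per session, then reset."""
        self.sessions_seen.update(self._current)
        self._current.clear()

    def peers_to_cache(self):
        """Peers seen in enough sessions to be worth caching."""
        return sorted(
            peer for peer, count in self.sessions_seen.items()
            if count >= self.min_sessions
        )
```

Counting sessions rather than individual transfers keeps a single large download from one stranger from being mistaken for a recurring relationship.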