Understand Distributed Consensus: Raft and Paxos
Understand Distributed Consensus: Raft and Paxos
kairenner-gh/slates
Last update 2 w. agoCreated on the 23rd of March 2026

The Hardest Problem in Distributed Systems

A cluster of servers needs to agree on a sequence of operations — a replicated log — so that all nodes apply the same operations in the same order and arrive at the same state. This must work even when nodes crash, restart, and reconnect, and when network partitions isolate some nodes from others. This is the distributed consensus problem. It sits at the foundation of etcd, ZooKeeper, CockroachDB, and every other system that promises consistent replication across failures.

FLP Impossibility: The Fundamental Limit

In 1985, Fischer, Lynch, and Paterson proved that in an asynchronous network where even a single process might fail (crash silently), no deterministic consensus algorithm can guarantee both safety (never deciding incorrectly) and liveness (eventually deci

The CAP Theorem in Practice

The CAP theorem states that a distributed system can guarantee at most two of three properties: Consistency (every read sees the most recent write), Availability (every request receives a response), and Partition tolerance (the system continues to operate

Raft: Consensus Designed for Understandability

Raft divides consensus into three subproblems: leader election, log replication, and safety. At any time, exactly one node is the leader; all others are followers. Followers that do not receive a heartbeat from the leader within a randomized timeout decla

Minimumcluster size to tolerate 1 failure

Minimumcluster size to tolerate 2 failures

Paxos: The Original Protocol

Paxos, developed by Leslie Lamport, is the theoretical foundation from which most consensus algorithms derive. A Paxos round has two phases. In the Prepare phase, a proposer sends a Prepare(n) message to a quorum of acceptors, asking them to promise not t

Go Deeper: Byzantine Fault Tolerance

Raft and Paxos assume that nodes may crash and stop responding, but that responding nodes always tell the truth. In adversarial environments — blockchains, multi-organization systems, edge infrastructure — a faulty node might send deliberately incorrect or contradictory messages. Handling this worst-case adversary requires Byzantine fault-tolerant consensus, a harder problem with a fundamentally different algorithm and a stricter node count requirement.