CAP theorem — [Notes]
· What is the CAP theorem
∘ ✅ Consistency
∘ ✅ Availability
∘ ✅ Partition Tolerance
∘ Proof — In the presence of a partition, a distributed system can be either consistent or available, but not both.
· Linearizability
· is the CAP theorem actually useful?
· the proof of the CAP theorem is extremely simple
· What happens when there is no network partition
· is there a general tradeoff between consistency and availability?
· What happens when your network gets slow?
· why this is awesome
· TODO
· References
What is the CAP theorem
- The CAP theorem is one of the fundamental theorems in the field of distributed systems, outlining an inherent trade-off in their design.
- It states that it’s impossible for a distributed data store to simultaneously provide more than two of the following three properties:
✅ Consistency
this means that every successful read request will receive the result of the most recent write request.
- Consistency in CAP actually means linearizability, which is a very specific (and very strong) notion of consistency.
- In particular, it has got nothing to do with the C in ACID, even though that C also stands for “consistency”.
✅ Availability
this means that every request receives a non-error response, without any guarantees on whether that reflects the most recent write request.
- It’s not sufficient for some node to be able to handle the request: any non-failing node needs to be able to handle it. Many so-called “highly available” (i.e. low downtime) systems actually do not meet this definition of availability.
✅ Partition Tolerance
this means that the system can continue to operate despite an arbitrary number of messages being dropped by the network between nodes due to a network partition.
- Partition tolerance (terribly misnamed) basically means that you’re communicating over an asynchronous network that may delay or drop messages. The internet and all our data centers have this property, so you don’t really have any choice in this matter.
- The system continues to work despite message loss, partial failure, or an arbitrary number of failing nodes.
- Put differently, the cluster continues to function even if there is a “partition” (communication break) between two nodes: both nodes are up, but they can’t communicate.
- During a partition, each segment of the system can keep serving clients, but the more available the system remains, the less consistent the accessible data can be, and vice versa.
Proof — In the presence of a partition, a distributed system can be either consistent or available, but not both.
Let’s attempt to schematically prove this theorem in a simplistic way. As shown in the figure below, let’s imagine a distributed system consisting of 2 nodes.
✔ This distributed system can act as a plain register, holding the value of a variable, called X.
✔ Now, let’s assume that at some point there is a network failure between the 2 nodes of the system, resulting in a network partition between them.
✔ A user of the system performs a write and then a read (it could also be 2 different users performing the two operations).
✔ We will examine the case where each operation is processed by a different node of the system. In that case, the system has 2 options:
✔ it can either fail one of the operations (breaking the availability property), or it can process both of the operations and return a stale value from the read (breaking the consistency property).
✔ It cannot process both of the operations successfully while also ensuring that the read returns the latest value, which is the one written by the write operation, because the result of the write cannot be propagated from node A to node B due to the network partition.
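To make the argument concrete, here is a minimal sketch (not from the original article) of the two-node register in plain Python; the `TwoNodeRegister` class and the `require_consistency` flag are made up for this illustration. During a partition, each operation either has to fail or risk working with a stale copy.

```python
# Minimal sketch of the two-node register argument (illustrative only; the class
# names and the require_consistency flag are made up for this example).

class Node:
    def __init__(self, name):
        self.name = name
        self.x = None                    # this node's local copy of the register X

class TwoNodeRegister:
    def __init__(self):
        self.a, self.b = Node("A"), Node("B")
        self.partitioned = False         # is the link between A and B broken?

    def write(self, node, value, require_consistency):
        other = self.b if node is self.a else self.a
        if self.partitioned and require_consistency:
            return "error"               # refuse the write: consistent, but not available
        node.x = value
        if not self.partitioned:
            other.x = value              # replication succeeds, both copies agree
        return "ok"                      # during a partition the two copies now diverge

    def read(self, node, require_consistency):
        if self.partitioned and require_consistency:
            return "error"               # cannot confirm this copy is the latest value
        return node.x                    # may be stale if we chose availability

cluster = TwoNodeRegister()
cluster.write(cluster.a, "v1", require_consistency=True)          # no partition yet: fine
cluster.partitioned = True                                        # network partition happens
print(cluster.write(cluster.a, "v2", require_consistency=True))   # error: gave up availability
print(cluster.write(cluster.a, "v2", require_consistency=False))  # ok
print(cluster.read(cluster.b, require_consistency=False))         # v1 (stale): gave up consistency
```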
Linearizability
The formal definition is not entirely straightforward, but the key idea, stated informally, is this:
If operation B started after operation A successfully completed, then operation B must see the system in the same state as it was on completion of operation A, or a newer state.
To make this more tangible, consider an example of a system that is not linearizable. See the following diagram.
This diagram shows Alice and Bob, who are in the same room, both checking their phones to see the outcome of the 2014 football world cup final. Just after the final score is announced, Alice refreshes the page, sees the winner announced, and excitedly tells Bob about it. Bob incredulously hits reload on his own phone, but his request goes to a database replica that is lagging, and so his phone shows that the game is still ongoing.
If Alice and Bob had hit reload at the same time, it wouldn’t have been surprising if they had got two different query results, because they don’t know at exactly what time their respective requests were processed by the server. However, Bob knows that he hit the reload button (initiated his query) after he heard Alice exclaim the final score, and therefore he expects his query result to be at least as recent as Alice’s. The fact that he got a stale query result is a violation of linearizability.
Knowing that Bob’s request happened strictly after Alice’s request (i.e. that they were not concurrent) depends on the fact that Bob heard about Alice’s query result through a separate communication channel (in this case, IRL audio). If Bob hadn’t heard from Alice that the game was over, he wouldn’t have known that the result of his query was stale.
If you’re building a database, you don’t know what kinds of backchannel your clients may have. Thus, if you want to provide linearizable semantics (CAP-consistency) in your database, you need to make it appear as though there is only a single copy of the data, even though there may be copies (replicas, caches) of the data in multiple places.
This is a fairly expensive guarantee to provide because it requires a lot of coordination. Even the CPU in your computer doesn’t provide linearizable access to your local RAM! On modern CPUs, you need to use explicit memory barrier instructions in order to get linearizability. And even testing whether a system provides linearizability is tricky.
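As a rough illustration of the informal rule above, here is a toy sketch (my own, not from the text) that encodes the Alice/Bob timeline as a history of timestamped operations and flags any read that started after a write completed yet observed an older value. A real linearizability checker has to consider all operations and their possible orderings, so treat this strictly as an illustration; the `Op` type and the integer version numbers are assumptions of the sketch.

```python
# Toy check of the informal rule: if operation B starts after operation A
# completes, B must observe A's effect (or something newer). Heavily simplified,
# single register only, newer values are modelled as bigger integers.

from dataclasses import dataclass

@dataclass
class Op:
    kind: str     # "write" or "read"
    value: int    # value written, or value observed by the read
    start: float  # wall-clock start time
    end: float    # wall-clock completion time

def violations(history):
    """Return (read, write) pairs where the read started after the write
    completed but still observed an older value."""
    found = []
    for r in (op for op in history if op.kind == "read"):
        for w in (op for op in history if op.kind == "write"):
            if w.end < r.start and r.value < w.value:
                found.append((r, w))
    return found

# The Alice/Bob scenario: value 1 = "game ongoing", value 2 = "final score posted".
history = [
    Op("write", 2, start=0.0, end=1.0),  # final score written to the database
    Op("read",  2, start=2.0, end=2.5),  # Alice refreshes and sees the result
    Op("read",  1, start=3.0, end=3.5),  # Bob reloads afterwards, hits a lagging replica
]

for read, write in violations(history):
    print(f"read starting at t={read.start} saw value {read.value}, "
          f"but value {write.value} was already written by t={write.end}")
```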
is the CAP theorem actually useful?
This theorem is really important because it established a basic limitation that all distributed systems are subject to. It forced designers of distributed systems to make explicit trade-offs between availability and consistency, and it helps engineers become aware of these properties and choose the right system appropriately.
Some distributed systems claim to be linearizable, like Consul and etcd and zookeeper. These use algorithms like Paxos or Raft. The CAP theorem is a little bit useful for reasoning about these systems — because they’re linearizable, they must not be totally available. You should expect some downtime from those systems in practice. Good! That was useful. Thanks, CAP theorem!
I have an extremely small amount of experience working with linearizable systems, and in my experience, those systems are actually not always available! So this theorem also actually really does match up with what happens in real life, which is cool.
Okay, but what about every other distributed system? For example! What if I’m using a database system that has a primary and replicates writes to a secondary, and then I read from the secondary sometimes?
This system is not linearizable. So the CAP theorem has… uh… nothing to say about this system. But replicated database setups are extremely common and extremely useful! It seems silly to just stop at “well, that’s not linearizable, I have nothing else to say”. “AP” is not a very useful description for this system since it has some interesting consistency properties, and is quite different from some other “AP” systems.
the proof of the CAP theorem is extremely simple
I thought that the CAP theorem was like this complicated deep result in distributed systems theory.
It turns out that (even though it is a somewhat useful thing!) it is a really simple result to prove. Basically:
- you have 2 computers
- suppose those two computers can’t communicate
- suppose furthermore that you want them to act consistently (if you write “hello” to Computer 1, you want the whole system to know the value is “hello”)
Now suppose you ALSO want the system to be available. This is impossible! Computer 2 can’t tell you that the current value is “hello”! There is no possible way it could know that because it can’t communicate with Computer 1!
That’s basically the whole proof. You need to formalize it a bit by defining things clearly, which is why the paper is several pages instead of 2 paragraphs. (definitions are important!) But the core idea is just not complicated. For more, see this illustrated guide and the original paper, but the proof is not fundamentally more complicated than that.
To me this undermines the theorem a little bit — CAP is a useful shorthand for a useful idea, but it’s not really that profound. By contrast, the FLP impossibility theorem shows that it is impossible to build a distributed consensus algorithm that will always terminate. (so there’s some risk that an algorithm like Paxos/Raft will get stuck in an infinite loop). This seems a lot less obvious (and was an open question for a long time).
What happens when there is no network partition
- even during normal operation when no network partition is present, there’s a different trade-off between latency and consistency. In order for the system to guarantee data consistency, it will have to essentially delay write operations until the data have been propagated across the system successfully, thus taking a latency hit.
- In single-master replication (a small sketch of the difference follows below):
➡ a synchronous replication approach would favour consistency over latency
➡ an asynchronous replication approach would benefit from reduced latency at the cost of consistency.
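Here is a minimal sketch of that difference (illustrative only; `Primary`, `Replica` and `NETWORK_DELAY` are made-up names, not a real replication protocol): a synchronous write acknowledges only after the replica has the value and therefore pays the network delay, while an asynchronous write returns immediately and leaves the replica briefly stale.

```python
# Minimal sketch of the latency/consistency trade-off in single-master replication.

import threading
import time

NETWORK_DELAY = 0.05  # seconds of simulated one-way network delay

class Replica:
    def __init__(self):
        self.value = None

class Primary:
    def __init__(self, replica):
        self.value = None
        self.replica = replica

    def _replicate(self, value):
        time.sleep(NETWORK_DELAY)        # the write travels over the network
        self.replica.value = value

    def write_sync(self, value):
        """Consistency first: acknowledge only after the replica has the value."""
        self.value = value
        self._replicate(value)           # caller waits out the network delay

    def write_async(self, value):
        """Latency first: acknowledge immediately, replicate in the background."""
        self.value = value
        threading.Thread(target=self._replicate, args=(value,)).start()

replica = Replica()
primary = Primary(replica)

t0 = time.perf_counter()
primary.write_sync("v1")
print(f"sync write took {time.perf_counter() - t0:.3f}s; replica sees {replica.value}")

t0 = time.perf_counter()
primary.write_async("v2")
print(f"async write took {time.perf_counter() - t0:.3f}s; replica sees {replica.value} (stale)")

time.sleep(2 * NETWORK_DELAY)
print(f"after the delay the replica has caught up: {replica.value}")
```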
There is actually an extension to the CAP theorem, called the PACELC theorem, captured in a separate article.
is there a general tradeoff between consistency and availability?
So, once we’ve learned about the CAP theorem, we might hope that in addition to the specific rule (if you’re linearizable, you can’t be available), there might also be some more general tradeoff (if you’re a little bit more consistent, then you get a little bit less available).
This feels like a totally reasonable thing to hope. The CAP theorem does not actually say anything about this hope, but maybe it’s true anyway!
What happens when your network gets slow?
My favourite thing about this critique of the CAP theorem is that it PROPOSES A USEFUL ALTERNATIVE THEOREM. (the “How operation latency depends on network delay” section of the paper)
Let’s suppose your network gets SLOW. (Kleppmann defines this a little more formally but I’m gonna go with SLOW)
Well!! If you’re using a linearizable system, it means that your reads are slow, and your writes are slow. That sucks!
What if you trade-off and have a system which is a little less consistent? It turns out that if you want “sequential consistency”, you can have slow writes, but fast reads! That sounds a little like our replicated database situation, maybe! (you could imagine that writing to the primary could get slow).
Here is a table from the paper where he tabulates different consistency levels and how fast reads and writes can be under adverse network conditions.
In particular, this means that if your network is totally down, then writing in a linearizable system takes an infinite amount of time (so writing is actually impossible).
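As a rough sketch of the read side of this trade-off (again with made-up `Leader`/`Follower` names and a simulated delay, not the paper’s formal model): a read that must return the latest value has to cross the network, so it pays at least the network delay and never completes if the network is down, while a read that tolerates staleness can be answered from the local copy immediately.

```python
# Rough sketch: read latency under a simulated network delay d, comparing a
# "must return the latest value" read with a "local, possibly stale" read.

import time

NETWORK_DELAY = 0.05  # simulated one-way delay d, in seconds

class Leader:
    def __init__(self):
        self.value = "latest"

class Follower:
    def __init__(self, leader):
        self.leader = leader
        self.value = "stale"           # local copy that may lag behind the leader

    def read_latest(self):
        time.sleep(2 * NETWORK_DELAY)  # round trip to the leader for the latest value
        return self.leader.value

    def read_local(self):
        return self.value              # answered immediately, but possibly stale

follower = Follower(Leader())

t0 = time.perf_counter()
print(follower.read_latest(), f"({time.perf_counter() - t0:.3f}s)")

t0 = time.perf_counter()
print(follower.read_local(), f"({time.perf_counter() - t0:.3f}s)")
```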
why this is awesome
I think this different framing (where we talk about availability in terms of network latency & speed of operations) is really cool, because:
- it actually relates better to my real life (sometimes my network is not totally down, but communication is SLOW! I would love my distributed systems theory to explain what will happen!)
- it lets me describe more systems! I feel almost motivated to learn what “sequential consistency” and “causal consistency” even mean now to see if any of the systems I actually use match that definition.
The last kind of silly reason I think this is awesome is — I spent a really long time feeling bad that I didn’t understand the CAP theorem or know how to apply it. It turns out that the CAP theorem is actually a relatively small theorem which is only useful in a limited number of cases! So it was less that I didn’t understand it and more that it’s actually just not thaaaaaaat useful. Now I don’t feel as bad! And I have some slightly better tools for thinking about distributed systems! Yay.