What are the implications of R + W > N for Cassandra clusters?

The basic problem we are trying to solve is this:

Can a situation occur in which a read doesn't return the most up-to-date value?

Obviously, this is best avoided if possible!

If R+W <= N, then this situation can occur.

A write could send a new value to one group of nodes, while a subsequent read could read from a completely separate group of nodes, and thus miss the new value written.

If R+W > N, then this situation is guaranteed not to occur.

There are N nodes that might hold the value. A write contacts at least W nodes - place a "write" sticker on each of these. A subsequent read contacts at least R nodes - place a "read" sticker on each of these. There are R+W stickers but only N nodes, so at least one node must have both stickers. That is, at least one node participates in both the read and the write, so is able to return the latest write to the read operation.
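The sticker argument can also be checked by brute force for small clusters. The following is a minimal sketch, not anything Cassandra runs: `min_overlap` is a hypothetical helper that tries every possible write set of size W and read set of size R and reports the smallest number of nodes they share.

```python
from itertools import combinations


def min_overlap(n, w, r):
    """Smallest number of nodes shared by any write set of size w
    and any read set of size r, chosen from n replicas."""
    nodes = range(n)
    return min(
        len(set(write_set) & set(read_set))
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )


# R+W > N: every read set shares at least one node with every write set.
print(min_overlap(3, 2, 2))

# R+W <= N: some read set misses the write entirely (overlap of zero).
print(min_overlap(3, 2, 1))
```

For every combination tried here the result equals max(0, R+W-N), which is exactly what the sticker counting predicts: the overlap is at least one precisely when R+W > N.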

R+W >> N is impossible: R+W can never exceed 2N.

The maximum number of nodes that you can read from, or write to, is N (the replication factor, by definition). So the most we can have is R = N and W = N, i.e. R+W = 2N. This corresponds to reading and writing at ConsistencyLevel ALL. That is, you just write to all the nodes and read from all the nodes, nothing fancy happens.


Quorum writes and quorum reads make it possible to detect stale values in a leaderless replication system.

For example, suppose we have 3 replicas A, B, and C (N = 3). C is down during a user update, so the update is accepted only on A and B (W = 2).

By the time the user reads the value, C has come back up. A read served by C alone would return a stale value. To detect this, the read also goes to B (R = 2).

When the user receives the responses from B and C, a version number can be used to determine which value is newer (B has the higher version number).
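The comparison step might look like the sketch below. It assumes each replica returns its value together with an integer version counter. The `Versioned` type and `resolve` function are hypothetical names used only for illustration. Cassandra itself compares write timestamps rather than explicit counters, but the idea is the same.

```python
from typing import NamedTuple


class Versioned(NamedTuple):
    value: str
    version: int  # assumed: a higher number means a newer write


def resolve(responses):
    """Return the newest response and the names of replicas that were stale."""
    newest = max(responses.values(), key=lambda resp: resp.version)
    stale = sorted(
        name for name, resp in responses.items() if resp.version < newest.version
    )
    return newest, stale


# B took part in the write; C was down and still holds the old value.
responses = {"B": Versioned("new", 2), "C": Versioned("old", 1)}
newest, stale = resolve(responses)
print(newest.value, stale)
```

The read returns B's value, and the list of stale replicas tells the reader that C needs to be brought up to date.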

In this scenario, W = 2, R = 2, and N = 3, so R + W = 4 > N, and we are certain that any stale value can be detected.

If instead R + W = 3 = N (for example W = 2, R = 1), it's possible to have written to A and B but read only from C. In this case, we can't detect the stale value.
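To make the difference between the two cases concrete, here is a small sketch for this exact scenario. The replica names come from the example above. `blind_reads` is a hypothetical helper that lists every read set of size R that contains no replica holding the new value.

```python
from itertools import combinations

REPLICAS = ("A", "B", "C")
WRITTEN = {"A", "B"}  # C was down during the write


def blind_reads(r):
    """Read sets of size r that include no replica holding the new value."""
    return [
        set(read_set)
        for read_set in combinations(REPLICAS, r)
        if not WRITTEN & set(read_set)
    ]


print(blind_reads(1))  # R = 1, so R + W = N: reading only C misses the write
print(blind_reads(2))  # R = 2, so R + W > N: no read set can miss it
```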