Contradiction in Lamport's Paxos made simple paper
You missed something in step 7. When C processes accept 100:b
it sets its state to C(100:b,100)
. By accepting a value the node is also promising to not accept earlier values.
Update. I've been thinking about this all month because I knew the above answer was not absolutely correct.
What's more I looked through several proprietary and open-source paxos implementations and they all had the bug submitted by the OP!
So here's the correct answer, when viewed entirely from Paxos Made Simple:
If the proposer receives a response to its prepare requests (numbered n) from a majority of acceptors, then it sends an accept request to each of those acceptors for a proposal numbered n with a value v, where v is the value of the highest-numbered proposal among the responses, or is any value if the responses reported no proposals. (emphasis mine)
In other words, the proposer can only send Accept
messages to acceptors that it has received Promises
from for that ballot number.
So, is this a contradiction in Lamport's paper? Right now, I'm saying yes.
If you look at Lamport's paxos proofs he treats an accept
as a promise
, just as my original answer suggests. But this is not pointed out in Paxos Made Simple. In fact, it appears Lamport took great pains to specify that an accept
was not a promise
.
The problem is when you combine the weaker portions of both variants; as the OP did and several implementations do. Then you run into this catastrophic bug.
There is certainly no problem with broadcasting the accept request to all acceptors. You don't need to restrict it to just the ones that replied to the original prepare request. You've found a rare case of bad wording in Dr Lamport's writing.
There is, however, a bug in your counterexample. Firstly, the notation is defined like this:
X(n:v,m)
to denote the status of acceptor X: proposaln:v
is the largest numbered proposal accepted by X
But then in step 7 node C has state C(100:b,-)
, and then in step 9 it's changed to state C(1:a,-)
. This is not a valid transition: after accepting 1:a
it should remain in state C(100:b,-)
since 100:b
is still the largest numbered proposal accepted by C.
Note that it's perfectly fine that it accepts 1:a
after 100:b
, essentially because the network is asynchronous so all messages can be delayed or reordered without breaking anything, so the rest of the world can't tell which proposal was accepted first anyway.
NECROBUMP. Even with the weaker portion of both variants, there is no inconsistency. Let's look at step 9 in the example in the question:
"The state is A(-:-,100) B(100:b,100) C(1:a,-). The chosen proposal is abandon, Paxos fails"
However, at this point all we have is an indeterminate value, since there is no majority accepted value (we must eventually choose 'b' since b was accepted by a majority during step 6.)
In order to continue the protocol, we need new ballots and eventually some newer ballot will be accepted. That ballot must have the value 'b',
PROOF: C will respond with (100, 'b') on any prepare requests since the highest-numbered ballot it accepted is (100, 'b') even if it last accepted a ballot (1, 'a'). B will also respond with (100, 'b'). Hence it is no longer possible to get a majority ballot with any value but 'b'.
Lamport's language is that an acceptor will respond with "The proposal with the highest number less than n that it has accepted, if any"
The accepted answer confuses "highest numbered" with "latest accepted," since the example shows that an acceptor may accept values in decreasing numbered order. In order to completely align with Lamport's protocol, it is necessary for C to remember that it responded to (100, 'b') even if the "latest" accept it has made is (1, 'a').
(That being said I would not be surprised if many implementations don't do this correctly, and hence are vulnerable to this issue.)