Can anyone explain the deletion of Left-Lean-Red-Black tree clearly?
There is an interesting comment about the Sedgwich implementation and in particular its delete method from a Harvard Comp Sci professor. Left-Leaning Red-Black Trees Considered Harmful was written in 2013 (the Sedgwich pdf you referenced above is dated 2008):
Tricky writing
Sedgewick’s paper is tricky. As of 2013, the insert section presents 2–3–4 trees as the default and describes 2–3 trees as a variant. The delete implementation, however, only works for 2–3 trees. If you implement the default variant of insert and the only variant of delete, your tree won’t work. The text doesn’t highlight the switch from 2–3–4 to 2–3: not kind.
The most recent version I could find of the Sedgwich code, which contains a 2-3 implementation, is dated April 2014. It is on his Algorithms book site at RedBlackBST.java
Ok I am going to try this, and maybe the other good people of SO can help out. You know how one way of thinking of red nodes is as indicators of
- where there there imbalance/new nodes in the tree, and
- how much imbalance there is.
This is why all new nodes are red. When the nodes (locally) balance out, they undergo a color flip, and the redness is passed up to the parent, and now the parent may look imbalanced relative to its sibling.
As an illustration, consider a situation where you are adding nodes from larger to smaller. You start with node Z which is now root and is black. You add node Y, which is red and is a left child of Z. You add a red X as a child of Z, but now you have two successive reds, so you rotate right, recolor, and you have a balanced, all black (no imbalance/"new nodes"!) tree rooted at Y [first drawing]. Now you add W and V, in that order. At first they are both red [second drawing], but immediately V/X/W are rotated right, and color flipped, so that only X is red [third drawing]. This is important: X being red indicates that left subtree of Y is unbalanced by 2 nodes, or, in other words, there are two new nodes in the left subtree. So the height of the red links is the count of new, potentially unbalanced nodes: there are 2^height of new nodes in the red subtree.
Note how when adding nodes, the redness is always passed up: in color flip, two red children become black (=locally balanced) while coloring their parent red. Essentially what the deletion does, is reverse this process. Just like a new node is red, we always also want to delete a red node. If the node isn't red, then we want to make it red first. This can be done by a color flip (incidentally, this is why color flip in the code on page 3 is actually color-neutral). So if the child we want to delete is black, we can make it red by color-flipping its parent. Now the child is guaranteed to be red.
The next problem to deal with is the fact that when we start the deletion we don't know if the target node to be deleted is red or not. One strategy would be to find out first. However, according to my reading of your first reference, the strategy chosen there is to ensure that the deleted node can be made red, by "pushing" a red node down in front of the search node as we are searching down the tree for the node to be deleted. This may create unnecessary red nodes that fixUp()
procedure will resolve on the way back up the tree. fixUp()
presumably maintains the usual LLRBT invariants: "no successive red nodes" and "no right red nodes."
Not sure if that helps, or if we need to get into more detailed examination of code.