How to create a completely immutable tree hierarchy? Construction chicken and egg

Eric Lippert recently blogged about this problem. See his blog post "Persistence, Facades and Roslyn's Red-Green Trees". Here's an excerpt:

We actually do the impossible by keeping two parse trees. The "green" tree is immutable, persistent, has no parent references, is built "bottom-up", and every node tracks its width but not its absolute position. When an edit happens we rebuild only the portions of the green tree that were affected by the edit, which is typically about O(log n) of the total parse nodes in the tree.

The "red" tree is an immutable facade that is built around the green tree; it is built "top-down" on demand and thrown away on every edit. It computes parent references by manufacturing them on demand as you descend through the tree from the top. It manufactures absolute positions by computing them from the widths, again, as you descend.
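To make the excerpt concrete, here is a minimal Java sketch of the red/green idea. It is not Roslyn's code (Roslyn is written in C# and is far more elaborate); the names `GreenNode`, `RedNode`, `kind` and `child` are made up for illustration. Green nodes are built bottom-up and store only their width; red nodes are created on the way down and carry the parent reference and the absolute position.

```java
import java.util.List;

// Green node: immutable, no parent reference, knows its width only.
final class GreenNode {
    final String kind;
    final int width;
    final List<GreenNode> children;

    // Leaf: the width is given directly (for example, a token length).
    GreenNode(String kind, int width) {
        this.kind = kind;
        this.width = width;
        this.children = List.of();
    }

    // Interior node: built bottom-up from children that already exist.
    GreenNode(String kind, List<GreenNode> children) {
        this.kind = kind;
        this.children = List.copyOf(children);
        int total = 0;
        for (GreenNode c : this.children) {
            total += c.width;
        }
        this.width = total;
    }
}

// Red node: immutable facade, manufactured top-down on demand.
final class RedNode {
    final GreenNode green;
    final RedNode parent;  // null for the root
    final int position;    // absolute offset, computed from widths

    private RedNode(GreenNode green, RedNode parent, int position) {
        this.green = green;
        this.parent = parent;
        this.position = position;
    }

    static RedNode root(GreenNode green) {
        return new RedNode(green, null, 0);
    }

    // Creates the facade for one child; nothing is cached in this sketch.
    RedNode child(int index) {
        int offset = position;
        for (int i = 0; i < index; i++) {
            offset += green.children.get(i).width;
        }
        return new RedNode(green.children.get(index), this, offset);
    }
}
```

Because the parent is passed in when a red node is created, no node ever has to be modified after construction, which is what sidesteps the chicken-and-egg problem.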


Two thoughts:

  1. Use some sort of tree factory. You could describe the tree using mutable structures, then have a factory that would assemble an immutable tree. Internally, the factory would have access to the fields of the different nodes and so could rewire internal pointers as necessary, but the produced tree would be immutable.

  2. Build an immutable wrapper around a mutable tree. That is, construct the tree using mutable nodes, then wrap it in a class that exposes only a read-only view of the tree. This is similar to (1), but doesn't have an explicit factory.
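A rough Java sketch of the factory idea in (1) might look like the following. All names here (`NodeSpec`, `TreeNode`, `TreeNode.Factory`) are hypothetical. The factory is nested inside the node class so that it can write the private `parent` and `children` fields while wiring the tree; callers only ever see getters.

```java
import java.util.ArrayList;
import java.util.List;

// Mutable description of the tree, used only as input to the factory.
final class NodeSpec {
    final String name;
    final List<NodeSpec> children = new ArrayList<>();

    NodeSpec(String name) {
        this.name = name;
    }

    NodeSpec add(NodeSpec child) {
        children.add(child);
        return this;
    }
}

// Immutable to callers: no setters, private constructor.
final class TreeNode {
    private final String name;
    private TreeNode parent;                      // written only by Factory
    private List<TreeNode> children = List.of();  // written only by Factory

    private TreeNode(String name) {
        this.name = name;
    }

    String name() { return name; }
    TreeNode parent() { return parent; }
    List<TreeNode> children() { return children; }

    // Nested, so it may touch TreeNode's private fields during assembly.
    static final class Factory {
        private Factory() {}

        static TreeNode build(NodeSpec spec) {
            return build(spec, null);
        }

        private static TreeNode build(NodeSpec spec, TreeNode parent) {
            TreeNode node = new TreeNode(spec.name);
            node.parent = parent;
            List<TreeNode> kids = new ArrayList<>();
            for (NodeSpec childSpec : spec.children) {
                kids.add(build(childSpec, node));
            }
            node.children = List.copyOf(kids);  // unmodifiable copy
            return node;
        }
    }
}
```

A tree would then be assembled with something like `TreeNode.Factory.build(new NodeSpec("root").add(new NodeSpec("a")))`. One caveat with this sketch: `parent` and `children` are not `final`, so they don't get Java's final-field publication guarantees, and a tree shared across threads should be safely published.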

Hope this helps!


You've correctly stated your problem as one of chicken and egg. Another way of restating the problem that might suggest a solution is that you want to grow a tree (root, trunk, leaves and all) all at once.

Once you accept that the computer can only process things step by step, a series of possible solutions emerges:

  1. Have a look at how Clojure creates immutable data structures. In Clojure's case, each operation on a tree (such as adding a node) returns a new tree.

  2. Make tree creation atomic. You can define a serialized format and then deserialize the tree from it. Since the deserialization code is internal, you do not have to expose any mutating methods.

  3. Just before the factory returns a constructed tree, lock it down with a flag so that any later mutation attempt fails. This is analogous to an atomic operation.

  4. Use package-private methods to construct the tree. This way the mutation methods on nodes can't be accessed from other packages.

  5. Create nodes on the fly when they are accessed. This means your internal tree representation can never be changed, since changing the nodes has no effect on your tree structure.
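As an illustration of option 1, here is a small Java sketch in which every "modification" returns a new tree and leaves the old one untouched. It only shows the general copy-the-path idea, not Clojure's actual implementation, and the names `PNode`, `withChild` and `withChildAt` are invented for this example. Nodes hold no parent references, so untouched subtrees can be shared between versions.

```java
import java.util.ArrayList;
import java.util.List;

// Immutable node: all fields final, children stored as an unmodifiable list.
final class PNode {
    final String name;
    final List<PNode> children;

    PNode(String name, List<PNode> children) {
        this.name = name;
        this.children = List.copyOf(children);
    }

    static PNode leaf(String name) {
        return new PNode(name, List.of());
    }

    // Returns a new node with one extra child; this node is not changed.
    PNode withChild(PNode child) {
        List<PNode> copy = new ArrayList<>(children);
        copy.add(child);
        return new PNode(name, copy);
    }

    // Returns a new node with the child at index replaced.
    PNode withChildAt(int index, PNode replacement) {
        List<PNode> copy = new ArrayList<>(children);
        copy.set(index, replacement);
        return new PNode(name, copy);
    }
}
```

To change something deep in the tree, you rebuild only the nodes on the path from the root to the change:

```java
PNode v1 = PNode.leaf("root")
        .withChild(PNode.leaf("a"))
        .withChild(PNode.leaf("b"));
// Add "c" under "b": a new "b" and a new root are created, "a" is reused.
PNode v2 = v1.withChildAt(1, v1.children.get(1).withChild(PNode.leaf("c")));
```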