Why is it important that a binary tree be balanced?

Imagine a tree that looks like this:

  A
   \
    B
     \
      C
       \
        D
         \
          E

This is a valid binary tree, but now most operations are O(n) instead of O(lg n).


The balance of a binary tree is governed by the property called skewness. If a tree is more skewed, then the time complexity to access an element of a the binary tree increases. Say a tree

                1
               / \
              2   3
              \    \
               7    4
                     \
                      5
                       \
                        6

The above is also a binary tree, but right skewed. It has 7 elements, so an ideal binary tree require O(log 7) = 3 lookups. But you need to go one more level deep = 4 lookups in worst case. So the skewness here is a constant 1. But consider if the tree has thousands of nodes. The skewness will be even more considerable in that case. So it is important to keep the binary tree balanced.

But again the skewness is the topic of debate as the probablity analysis of a random binary tree shows that the average depth of a random binary tree with n elements is 4.3 log n . So it is really the matter of balancing vs the skewness.

One more interesting thing, computer scientists have even found an advantage in the skewness and proposed a skewed datastructure called skew heap


To ensure log(n) search time, you need to divide the total number of down level nodes by 2 at each branch. For example, if you have a linear tree, never branching from root to the leaf node, then the search time will be linear as in a linked list.