Explain why this binary tree traversal algorithm has O(NlogN) time complexity?
If we encounter an unbalanced node, we get an early return of false so this is the optimal case. The "worst case" for this algorithm to handle is a completely balanced tree, since we get no early returns of false. For the sake of this example, let's use a perfect binary tree with n nodes.
The first call would trigger getHeight() on each node so ~n nodes are visited. Total work for root level is O(n).
The next two calls (root.left.isBalanced() and root.right.isBalanced()) would trigger getHeight() on subsequent nodes but each one only calls it on ~1/2 n nodes. Total work for 1 height is also O(n).
The next 4 calls would call getHeight on n/4 nodes each. So total work for 2 height is also O(n).
If you see the pattern, the total work for each level of the tree is O(n), so total work for all levels is O(n) * levels in a perfect tree, which comes out to O(nlogn).
The getHeight definitely has a linear complexity. It just visits every element in the subtree, so it is O(k)
where k
is the number of nodes in the subtree.
Now regarding the isBalanced. First it computes the height (that is linear as we have seen earlier). But then if we are not so lucky we have to compute the isBalanced 2 more times: for the left and for the right subtrees. In the worst case we will be performing the linear computation for log N times.
You may study the Master Theorem that describes more generic cases.
In this particular case the parameters for the theorem are: a = b = 2
and there is constant overhead on dividing the problem into subproblems.
The worst case complexity of this algorithm happens in the case of balanced binary search tree since otherwise we return early. Consider the following balanced binary search tree
isBalanced
function goes through all the nodes once(including the null children of leaf nodes). For each of these nodes it calls getHeight
to calculate the height of the left and right child. So getHeight
requires work proportional to size of subtree rooted on that node.
For null children of leaves (there are 16
such nodes) it requires constant amount of work. For the leaf nodes (1, 3, 5, 7...)
we need double the work but our node is reduced by half (i.e. we have 8
nodes). One level above we need four times the work but our node is halved again.
In general if we have N
nodes thus total amount of work is roughly
N + N/2*2 + N/4*4 + ... + N/N * 1
Every term of the sum is equal to N
. How many terms are there? That's just the height of tree i.e. lg(N)
since we reduce N
by 2
until it reaches 1
. So total complexity is O(N*lg(N))