How can we calculate, for every element in an array, the number of elements to the right that are greater than that element?
Quick summary of the problem statement: Given an array A
which contains N
integers, construct an array X
such that for every i
, X[i] =
the number of elements in A
that have an index greater than i
and are also greater than A[i]
.
One way to solve this problem would be to use a binary search tree. Start by iterating from the last to the first element, adding each element to the set as we iterate. Every time we are at an element e
, use the binary search tree's find()
operation to find how many elements are greater than e
in the current tree.
Perhaps your first thought would be to use a std::multiset
(not std::set
because we may have duplicate elements!), which is a self-balancing binary search tree that offers O(logN)
insertion and O(logN)
element finding. This seems like it would work for this algorithm, but it actually wouldn't. The reason is because when you call std::multiset::find()
, it returns an iterator to the element in the set. Finding how many elements in the set are actually greater than the element would take O(N)
time, as to find the distance from the iterator to the end of the set would require incrementing it repeatedly.
To solve this problem, we use an "indexed multiset", which is a slightly modified binary search tree such that we can find the index of an element in the multiset in O(logN)
time while still supporting O(logN)
insertion. Here's my code demonstrating this data structure:
#include <iostream>
#include <vector>
#include <ext/pb_ds/assoc_container.hpp>
using namespace std;
using namespace __gnu_pbds;
// I know this is kind of messy, but it's the general way to get a C++ indexed
// multiset without using an external library
typedef tree <int, null_type, less_equal <int>, rb_tree_tag,
tree_order_statistics_node_update> indexed_set;
int main()
{
int A_size;
cin >> A_size;
vector <int> A(A_size);
for(int i = 0; i < A_size; ++i){
cin >> A[i];
}
// Input Done
indexed_set nums;
vector <int> X(A_size);
for(int i = A_size - 1; i >= 0; --i){
// order_of_key returns the first index that A[i] would be at in a sorted list
// with the same elements as nums.
X[i] = nums.size() - nums.order_of_key(A[i]);
nums.insert(A[i]);
}
for(int item : X){
cout << item << " ";
}
cout << "\n";
return 0;
}
So, overall, the general strategy would be to
- Iterate from the last element to the first element.
- For every element, check in
nums
to see how many elements are greater than the current element. (O(logN)
) - Then, insert the current element and continue to iterate. (
O(logN)
) Clearly, the total time complexity of this algorithm isO(NlogN)
and the space complexity isO(N)
.
A quick summary of the observations and insights of this method:
INSIGHT: If we iterate from the last to the first element (not the first to the last), the indexed-set will only contain elements to the right of the current element at any given iteration, which is exactly what we want. This saves us time because we don't need to worry about inserting all the elements at the beginning then removing them one by one if we were to iterate from left to right.
OBSERVATION: A
std::set
wouldn't suffice for the binary search tree in this algorithm because although it providesO(logN)
finding an element, calculating the elements position in the set requires a worst case ofO(N)
time. An indexed-set, however, provides this "position-finding" operation inO(logN)
time, as well as insertion.
Telescope has first mentioned (in the comments) that you can used a Binary tree to achieve that. However, you can do it also with the following alternative approach:
- Use an AVL tree;
- Each node should store the element and the number of elements on its right sub-tree;
- Iterate the array from the end to the beginning;
- add to the tree and update the size on the nodes accordingly.
- While adding compare the current element against the root; If this element is greater then the root than it is greater than all the elements to the sub-tree. In this case take the size from the node and add to the corresponded position on the array X;
- If it is not greater then the root then processed to the appropriated sub-tree. And apply the aforementioned logic.
The time complexity will be N times inserting into the tree. Hence, O(n log(n))
. And the space complexity will be naturally O(N)
.