What algorithm used to find the nth sorted subarray of an unordered array?

The algorithm you are looking for is Selection Algorithm, which lets you find k-th order statistics in linear time. The algorithm is quite complex, but the standard C++ library conveniently provides an implementation of it.

The algorithm for finding k-th sorted interval that the interviewers had in mind went like this:

  • Find b=(k-1)*y-th order statistics in O(N)
  • Find e=k*y-th order statistics in O(N)
  • There will be y numbers between b and e. Store them in a separate array of size y. This operation takes O(N)
  • Sort the array of size y for O(y * log2y) cost.

The overall cost is O(N+N+N+y * log2y), i.e. O(N+y * log2y)


You can combine std::nth_element and std::sort for this:

std::vector<int> vec = muchData();
// Fix those bound iterators as needed
auto lower = vec.begin() + k*y;
auto upper = lower + y;

// put right element at lower and partition vector by it
std::nth_element(vec.begin(), lower, vec.end());
// Same for upper, but don't mess up lower
std::nth_element(lower + 1, upper - 1, vec.end());
// Now sort the subarray
std::sort(lower, upper);

[lower, upper) is now the k-th sorted subarray of length y, with the desired complexity on average.

To be checked for special cases like y = 1 before real world use, but this is the general idea.

Tags:

Algorithm

C++