What's the practical difference between std::nth_element and std::sort?

It's perfectly valid for std::nth_element to sort the entire range for fulfilling the documented semantic - however, doing so will fail at meeting the required complexity (linear). The key point is that it may do so, but it doesn't have to.

This means that std::nth_element can bail out early - as soon as it can tell what the n'th element of your range is going to be, it can stop. For instance, for a range

[9,3,6,2,1,7,8,5,4,0]

asking it to give you the fourth element may yield something like

[2,0,1,3,8,5,6,9,7,4]

The list was partially sorted, just good enough to be able to tell that the fourth element in order will be 3.

Hence, if you want to answer 'which number is the fourth-smallest' or 'which are the four smallest' numbers then std::nth_element is your friend.

If you want to get the four smallest numbers in order you may want to consider using std::partial_sort.


The implementation of std::nth_element looks as follows:

void _Nth_element(_RanIt _First, _RanIt _Nth, _RanIt _Last, _Pr _Pred)
{
    for (; _ISORT_MAX < _Last - _First; )
        {   // divide and conquer, ordering partition containing Nth
        pair<_RanIt, _RanIt> _Mid =
            _Unguarded_partition(_First, _Last, _Pred);

        if (_Mid.second <= _Nth)
            _First = _Mid.second;
        else if (_Mid.first <= _Nth)
            return; // Nth inside fat pivot, done
        else
            _Last = _Mid.first;
        }

    _Insertion_sort(_First, _Last, _Pred);  // sort any remainder
}

where ISORT_MAX defined as 32.

So if your sequence is shoter than 32 elements it just performs InsertionSort on it. Therefore your short sequence is full sorted.


std::sort sorts all the elements. std::nth_elenemt doesn't. It just puts the nth element in the nth positions, with smaller or equal elements on one side and larger or equal elements on the other. It is used if you want to find the nth element (obviously) or if you want the n smallest or largest elements. A full sort satisfies these requirements.

So why not just perform a full sort and get the nth element? Because std::nth_element has the requirement of having O(N) complexity, whereas std::sort is O(Nlog(N)). std::sort cannot satisfy the complexity requirement of std::nth_element. If you do not need complete sorting of the range, it is advantageous to use it.

As for your example, when I run similar code on GCC 4.7, I get the expected results:

  for ( int i = 0; i < 10; i++ )
    myvector.push_back(rand()%32); // make the numbers small

  cout << myvector << "\n";
// nth_element around the 4th element
  nth_element (myvector.begin(), myvector.begin()+4, myvector.end());
  cout << myvector << "\n";
  std::sort(myvector.begin(), myvector.end());
  cout << myvector << "\n";

produces

{ 7, 6, 9, 19, 17, 31, 10, 12, 9, 13 }
{ 9, 6, 9, 7, 10, 12, 13, 31, 17, 19 }
{ 6, 7, 9, 9, 10, 12, 13, 17, 19, 31 }
               ^

where I've used a custom made ostream operator<< to print out the results.

Tags:

C++