Efficiency of postincrement v.s. preincrement in C++
One point of data from my experience.
Changing a post-increment to a pre-increment of a std::map::iterator
in for
loops resulted in noticeable savings in a core algorithm at my work.
In general, when icrementing an iterator that is a class, i.e. it is not a pointer, you should notice savings when using the pre-increment operator. The reason for it is that the pre-increment operator function changes the object in place while the post increment operator function usually involves creation of a temporary object.
A pre-increment operator is usually implemented as:
typename& typename::operator++()
{
// Change state
...
// Return the object
return *this;
}
while a post-increment operator is usually implemented as:
typename typename::operator++(int)
{
// Create a temporary object that is a copy of the current object.
typename temp(*this):
// Change state of the current object
...
// Return the temporary object.
return temp;
}
pre-increment introduces a data dependency into your code -- the CPU must wait for the increment operation to be completed before its value can be used in the expression."
Is this true?
It is mostly true - although perhaps overly strict. Pre increment doesn't necessarily introduce a data dependency - but it can.
A trivial example for exposition:
a = b++ * 2;
Here, the increment can be executed in parallel with the multiplication. The operands of both the increment and the multiplication are immediately available and do not depend on the result of either operation.
Another example:
a = ++b * 2;
Here, the multiplication must be executed after the increment, because one of the operands of the multiplication depends on the result of the increment.
Of course, these statements do slightly different things, so the compiler might not always be able to transform the program from one form to the other while keeping the semantics the same - which is why using the post-increment might make a slight difference in performance.
A practical example, using a loop:
for(int i= 0; arr[i++];)
count++;
for(int i=-1; arr[++i];) // more typically: (int i=0; arr[i]; ++i;)
count++;
One might think that the latter is necessarily faster if they reason that "post-increment makes a copy" - which would have been very true in the case of non-fundamental types. However, due to the data dependency (and because int
is a fundamental type with no overload function for increment operators), the former can theoretically be more efficient. Whether it actually is depends on the CPU architecture, and the ability of the optimizer.
For what it's worth - in a trivial program, on x86 arch, using g++ compiler with optimization enabled, the above loops had identical assembly output, so they are perfectly equivalent in that case.
Rules of thumb:
If the counter is a fundamental type and the result of increment is not used, then it makes no difference whether you use post/pre-increment.
If the counter is not a fundamental type and the result of the increment is not used and optimizations are disabled, then pre-increment may be more efficient. With optimizations enabled, there is usually no difference.
If the counter is a fundamental type and the result of increment is used, then post-increment can theoretically be marginally more efficient - in some CPU architecture - in some context - using some compiler.
If the counter is not a fundamental type and the result of the increment is used, then pre-increment is typically faster than post-increment. Also, see R Sahu's answer regarding this case.