Repeat a block of code a fixed number of times

It would be entirely possible to have a repeat(x) as part of the language, but there isn't one. The design of C and C++ does somewhat follow what processors can do, and I'm not familiar with a single processor (I've worked with about 10 different processor architectures) that can do a "loop this many times" without some sort of "check if we reached the number".

So you will have to write some code that checks how many times you've repeated something (or how many times are left to do: there is an x86 instruction called "loop" that does just that; it decrements a counter and, if the counter is not zero, jumps back to the beginning of the loop).

If the compiler then wishes to "unroll" a loop because it has a constant number of iterations, and it decides "unrolling this is faster" [compilers decide these sorts of things all the time, and often get it right], then the compiler may well do so. But you still have to write code that "checks".

If you want the syntactic nicety of being able to write repeat(x) {} then you could use a macro.

Something like:

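#include <iostream>

// Note: the macro declares a hidden loop variable named i that counts down.
// The argument is parenthesized, and "> 0" keeps a negative count from looping.
#define repeat(x) for (int i = (x); i-- > 0;)

int main()
{
    repeat(10)
    {
        std::cout << i << std::endl; // prints 9 down to 0
    }

    return 0;
}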

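The implementation here also compares the counter against zero in the for loop rather than using the less-than operator against an upper bound. That may be slightly faster on some targets, although an optimizing compiler will often generate equivalent code either way.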

Attempting to speed up the loop with some special construct (including manually copying and pasting the code) is ill-advised. Don't do it; it would probably "un-optimize" the execution speed instead.

In every C++ implementation I've encountered (MSVC 6.0, 2003, 2005, 2010, various versions of GCC, various versions of Diab), there is absolutely zero (sorry, I didn't stress that enough: ZERO) time involved in allocating a loop counter variable, assuming any other variables are allocated for the function in which the loop counter is declared. For a simple loop that makes no function calls, the loop counter may never even make it out to memory; it may be held in a single CPU register for its entire lifetime. Even if it is stored in memory, it lives on the runtime stack, and space for it (and any other local variables) is claimed all at once in a single operation, which takes the same time regardless of how many variables are allocated on the stack. Stack allocations are CHEAP CHEAP CHEAP, as opposed to heap allocations.

Example loop counter variable allocation on the stack:

for (int i=0; i<50; ++i) {
    ....
}

Another example loop counter variable allocation on the stack:

int i = 0;
for (; i<50; ++i) {
    ....
}

Example loop counter variable allocated on the heap (don't do this; it's stupid):

int* ip = new int;
for (*ip=0; *ip<50; ++(*ip)) {
    ....
}
delete ip;

Now to address the issue of attempting to optimize your loop by manually copying & pasting instead of using a loop & counter:

What you're considering doing is a manual form of loop unrolling. Loop unrolling is an optimization that compilers sometimes use to reduce the overhead involved in a loop. A compiler can unroll a loop completely only if the number of iterations is known at compile time (i.e. the number of iterations is a constant, even if the constant involves computation based on other constants); when the count is known only at run time, it can still unroll partially and handle the leftover iterations separately. Even with a constant count, the compiler may determine that it is worthwhile to unroll the loop, but often it won't unroll it completely. For instance, in your example, the compiler may determine that it would be a speed advantage to unroll the loop from 50 iterations to only 10 iterations with 5 copies of the loop body. The loop variable would still be there, but instead of doing 50 comparisons of the loop counter, the code now only has to do the comparison 10 times.

It's a tradeoff, because the 5 copies of the loop body take up 5 times as much space in the instruction cache, which means that loading those extra copies of the same instructions forces the cache to evict (throw out) that many instructions that were already in the cache and that you might have wanted to stay there. Also, loading those 4 extra copies of the loop body instructions from main memory takes much, much longer than simply grabbing the already-loaded instructions from the cache in the case where the loop isn't unrolled at all.

So all in all, it's often more advantageous to keep a single copy of the loop body and leave the loop logic in place (i.e. don't do any manual loop unrolling at all).
