How to force OpenMP to run iterations in a specific order
You can set the size of the iteration blocks each thread gets to 1 with the schedule clause, e.g. schedule(static,1). With 3 threads, the first one would process iterations 0, 3, 6, 9 and so on, the second would process iterations 1, 4, 7, 10 and so on, and the third would process iterations 2, 5, 8, 11 and so on. You still need to synchronise somewhere, since there is no guarantee that the threads execute their iterations at the same time or at the same speed. Note that a barrier may not be placed inside a worksharing loop, so to synchronise after each round of iterations you would have to distribute the iterations by hand inside a parallel region and place the barrier there.
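The round-robin assignment described above can be checked with a minimal sketch. The function name record_owners and its parameters are made up for illustration, and the serial fallback is only there so the code still builds when OpenMP is not enabled.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback so the sketch still builds without OpenMP support. */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Records in owner[i] the id of the thread that ran iteration i.
   Returns the team size the runtime actually provided, which may be
   smaller than requested_threads. */
int record_owners(int *owner, int n, int requested_threads)
{
    assert(owner != NULL && n >= 0);
    int team_size = 1;
    #pragma omp parallel num_threads(requested_threads)
    {
        /* One thread stores the real team size; implicit barrier follows. */
        #pragma omp single
        team_size = omp_get_num_threads();

        /* Chunk size 1: iteration i is assigned to thread i % team_size. */
        #pragma omp for schedule(static, 1)
        for (int i = 0; i < n; i++)
            owner[i] = omp_get_thread_num();
    }
    return team_size;
}
```

Calling record_owners(owner, 12, 3) on a runtime that grants all three threads should leave owner holding 0, 1, 2, 0, 1, 2 and so on. This only shows which thread owns which iteration; it says nothing about when each iteration runs.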
Another solution is to use the OpenMP tasking construct. With it you can run a big loop in one thread, generating computational tasks. You can put checks for the existence of the output file inside this loop and create new tasks only if needed (e.g. the output file does not exist):
#pragma omp parallel
{
    ...
    #pragma omp single
    for (part = 0; part < P->Parts; part++)
    {
        if (!output_file_exists(part))
            // firstprivate: each task gets its own copy of part
            #pragma omp task firstprivate(part)
            {
                ... computation for that part ...
            }
    }
    #pragma omp taskwait
    ...
}
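For reference, here is a self-contained sketch of the same tasking pattern. Everything file-related is replaced by stand-ins: the already_done array plays the role of "the output file exists", and output_exists, compute_part and process_missing_parts are hypothetical names, not part of the original program.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the file check: already_done[part] plays the
   role of "the output file for this part exists". */
static bool output_exists(const bool *already_done, int part)
{
    return already_done[part];
}

/* Hypothetical placeholder for the real per-part computation. */
static int compute_part(int part)
{
    return part * part;
}

/* Creates one task for every part that has no output yet.
   Returns the number of tasks created. */
int process_missing_parts(const bool *already_done, int *result, int parts)
{
    assert(parts >= 0);
    int created = 0;
    #pragma omp parallel
    {
        #pragma omp single
        {
            for (int part = 0; part < parts; part++) {
                if (output_exists(already_done, part))
                    continue;        /* nothing to do for this part */
                created++;           /* only the single thread touches this */
                #pragma omp task firstprivate(part) shared(result)
                result[part] = compute_part(part);
            }
        }   /* implicit barrier: all tasks have finished past this point */
    }
    return created;
}
```

One thread walks the loop and creates tasks, while the other threads in the team pick them up as they become available. The tasks may run in any order, so this fits the case where parts are independent and only the "skip what is already done" logic matters.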
Hope I've understood your problem correctly.
If we want OpenMP threads to execute in order, we must use the ordered clause. However, we must be careful. The following will print the i's (and thread ids) in order (i from 0 to 19, tid from 0 to omp_get_num_threads() - 1):
#pragma omp parallel
#pragma omp for ordered
for (i = 0; i < 20; i++)
    #pragma omp ordered
    printf("i=%d - tid=%d\n", i, omp_get_thread_num());
Output (on my 8-core Intel x86_64 machine):
i=0 - tid=0
i=1 - tid=0
i=2 - tid=0
i=3 - tid=1
i=4 - tid=1
i=5 - tid=1
i=6 - tid=2
i=7 - tid=2
i=8 - tid=2
i=9 - tid=3
i=10 - tid=3
i=11 - tid=3
i=12 - tid=4
i=13 - tid=4
i=14 - tid=5
i=15 - tid=5
i=16 - tid=6
i=17 - tid=6
i=18 - tid=7
i=19 - tid=7
But notice:
#pragma omp parallel
#pragma omp for ordered
for (i = 0; i < 20; i++)
{
    // code outside the ordered region runs concurrently, so this
    // statement is NOT printed in order!
    printf("other i=%d - tid=%d\n", i, omp_get_thread_num());
    #pragma omp ordered
    // these are printed in order
    printf("i=%d - tid=%d\n", i, omp_get_thread_num());
}
Output:
other i=16 - tid=6
other i=18 - tid=7
other i=12 - tid=4
other i=0 - tid=0
i=0 - tid=0
other i=1 - tid=0
i=1 - tid=0
other i=2 - tid=0
i=2 - tid=0
other i=3 - tid=1
other i=6 - tid=2
other i=14 - tid=5
i=3 - tid=1
other i=4 - tid=1
i=4 - tid=1
other i=5 - tid=1
i=5 - tid=1
i=6 - tid=2
other i=7 - tid=2
i=7 - tid=2
other i=8 - tid=2
i=8 - tid=2
other i=9 - tid=3
i=9 - tid=3
other i=10 - tid=3
i=10 - tid=3
other i=11 - tid=3
i=11 - tid=3
i=12 - tid=4
other i=13 - tid=4
i=13 - tid=4
i=14 - tid=5
other i=15 - tid=5
i=15 - tid=5
i=16 - tid=6
other i=17 - tid=6
i=17 - tid=6
i=18 - tid=7
other i=19 - tid=7
i=19 - tid=7
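The output above suggests a common pattern: do the expensive work outside the ordered region and keep only the order-sensitive step inside it. A minimal sketch, where heavy_work and compute_then_commit_in_order are made-up names and the chunk size of 1 is an assumption chosen so that threads do not wait on a whole block of earlier iterations:

```c
#include <assert.h>

/* Hypothetical stand-in for expensive per-iteration work. */
static int heavy_work(int i)
{
    return 2 * i + 1;
}

/* Computes values in parallel, then appends each one to out[] in
   iteration order. Returns the number of entries written. */
int compute_then_commit_in_order(int *out, int n)
{
    assert(n >= 0);
    int next = 0;   /* next free slot in out[]; only touched inside ordered */
    #pragma omp parallel for ordered schedule(static, 1)
    for (int i = 0; i < n; i++) {
        int value = heavy_work(i);   /* runs concurrently, in no fixed order */
        #pragma omp ordered
        {
            out[next] = value;       /* runs strictly in order i = 0, 1, 2, ... */
            next++;
        }
    }
    return next;
}
```

Because the append happens inside the ordered region, out[k] always holds the result of iteration k, even though the calls to heavy_work overlap.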
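Lastly, note that the array below ends up correctly filled, but not because the writes happen in order: each iteration writes only its own element, and since the loop contains no ordered region, the ordered clause serialises nothing here:
// threads filling up array
int Arr[20] = {0};
#pragma omp parallel for ordered
for (i = 0; i < 20; i++)
    Arr[i] = i;
printf("\n\n");
// check that every element holds the expected value
// (this says nothing about the order in which the writes happened)
for (i = 0; i < 20; i++)
    printf("Arr[%d]=%d\n", i, Arr[i]);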
Output:
Arr[0]=0
Arr[1]=1
Arr[2]=2
Arr[3]=3
Arr[4]=4
Arr[5]=5
Arr[6]=6
Arr[7]=7
Arr[8]=8
Arr[9]=9
Arr[10]=10
Arr[11]=11
Arr[12]=12
Arr[13]=13
Arr[14]=14
Arr[15]=15
Arr[16]=16
Arr[17]=17
Arr[18]=18
Arr[19]=19
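The final contents of an array cannot reveal the order in which its elements were written. One way to observe that order, sketched here under the assumption that a shared ticket counter is acceptable overhead, is to have each iteration take a ticket number just before it writes. The names fill_and_record_order, ticket and counter are made up for this example.

```c
#include <assert.h>

/* Fills arr[i] = i in parallel and records in ticket[i] the position
   (0 = first) at which iteration i reached its write. */
void fill_and_record_order(int *arr, int *ticket, int n)
{
    assert(n >= 0);
    int counter = 0;   /* shared ticket dispenser */
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        int my_ticket;
        /* Atomically read the counter and increment it. */
        #pragma omp atomic capture
        my_ticket = counter++;
        arr[i] = i;
        ticket[i] = my_ticket;
    }
}
```

After the call, arr is always 0, 1, 2 and so on, while ticket is some permutation of 0 to n - 1 that will typically differ between runs and thread counts. If the tickets must come out as 0, 1, 2 and so on, the write has to go inside an ordered region.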