Difference between std::lock_guard and #pragma omp critical
The critical section serves the same purpose as acquiring a lock (and will probably use a lock internally).
std::mutex is a standard C++ feature, whereas #pragma omp critical is an OpenMP extension and is not defined by the C++ standard.
The critical section names are global to the entire program (regardless of module boundaries). So if critical sections in multiple modules share the same name, no two of them can execute at the same time. If the name is omitted, a default name is assumed (docs).
I would prefer standard C++ unless there is a good reason to use the other (after measuring both).
Not directly related to the question, but there is another problem with this loop: the lock is acquired on every loop iteration, which degrades performance significantly (see also this answer).
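One common way to avoid taking the lock on every iteration is to accumulate into a thread-private variable and enter the critical section once per thread. The following is only a minimal sketch of that idea; the function name sum_once_per_thread and the vector input are hypothetical and not taken from the question.

```cpp
#include <vector>

// Hypothetical helper: sums the values while entering the critical
// section once per thread instead of once per loop iteration.
long sum_once_per_thread(const std::vector<int>& values) {
    const int n = static_cast<int>(values.size());
    long total = 0;

    #pragma omp parallel
    {
        // Declared inside the parallel region, so each thread has its own copy.
        long local = 0;

        // The iterations are split among the threads; no lock is needed here
        // because each thread only touches its own 'local'.
        #pragma omp for nowait
        for (int i = 0; i < n; ++i) {
            local += values[i];
        }

        // One synchronized update per thread.
        #pragma omp critical
        total += local;
    }
    return total;
}
```

If the code is compiled without OpenMP support, the pragmas are ignored and the function simply runs serially with the same result.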
From cppreference.com, about lock_guard, one can read:
The class lock_guard is a mutex wrapper that provides a convenient RAII-style mechanism for owning a mutex for the duration of a scoped block.
and from the OpenMP standard, about critical, one can read:
The critical construct restricts execution of the associated structured block to a single thread at a time.
So both mechanisms provide a means to deal with the same problem, i.e., ensuring mutual exclusion for a block of code.
Are they equally good or does one of them have some drawbacks?
Both are coarse-grained locking mechanisms; however, by default, the OpenMP critical is even coarser-grained since:
All critical constructs without a name are considered to have the same unspecified name.
Therefore, if no name is specified, all critical regions use the same global lock, which is semantically the same as using lock_guard with the same mutex everywhere. Nonetheless, one can specify a name along with the critical pragma:
An optional name may be used to identify the critical construct.
#pragma omp critical(name)
Specifying the name on a critical is semantically similar to passing a specific mutex to the guard: std::lock_guard<std::mutex> lock(name);
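To make the analogy concrete, here is a minimal side-by-side sketch. The struct Counts, the function names, and the critical names update_a and update_b are hypothetical. The second function mirrors the pattern the question asks about (a lock_guard inside an OpenMP loop); note that each counter is protected by exactly one mechanism throughout.

```cpp
#include <mutex>

// Hypothetical result type, used only for illustration.
struct Counts { int a; int b; };

// Two named critical sections: updates to 'a' and 'b' use different
// locks, so a thread updating 'a' does not block one updating 'b'.
Counts count_with_named_critical(int iterations) {
    int a = 0, b = 0;
    #pragma omp parallel for
    for (int i = 0; i < iterations; ++i) {
        #pragma omp critical(update_a)
        ++a;
        #pragma omp critical(update_b)
        ++b;
    }
    return {a, b};
}

// Roughly the same structure with one std::mutex per counter.
Counts count_with_lock_guard(int iterations) {
    int a = 0, b = 0;
    std::mutex mutex_a, mutex_b;
    #pragma omp parallel for
    for (int i = 0; i < iterations; ++i) {
        {
            std::lock_guard<std::mutex> lock(mutex_a);  // released at scope exit
            ++a;
        }
        {
            std::lock_guard<std::mutex> lock(mutex_b);
            ++b;
        }
    }
    return {a, b};
}
```

One difference the sketch makes visible: a critical name is a program-wide identifier fixed at compile time, while a mutex is an ordinary object with its own scope and lifetime.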
Worth noting that OpenMP also offers explicit locking mechanisms such as omp_lock_t (some details in this SO thread).
Nevertheless, whenever possible you should aim for a finer-grained synchronization mechanism than a critical region, namely a reduction, atomics, or even data redundancy. For instance, in your code snippet, the most performant approach would have been to use the reduction clause, like so:
#pragma omp parallel for reduction(+:someVar)
for (int i = 0; i < 1000; i++)
{
    ++someVar;
}
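For comparison, the atomics alternative mentioned above could look roughly like the first function below, with a complete reduction version next to it. This is a sketch only; the function names and the iterations parameter are hypothetical, and which variant is faster should be measured rather than assumed.

```cpp
// Hypothetical wrapper: each increment is a single atomic update,
// which is usually cheaper than entering a critical section.
int count_with_atomic(int iterations) {
    int someVar = 0;
    #pragma omp parallel for
    for (int i = 0; i < iterations; i++) {
        #pragma omp atomic
        ++someVar;
    }
    return someVar;
}

// Hypothetical wrapper: each thread increments a private copy of
// someVar, and the copies are added together when the loop ends.
int count_with_reduction(int iterations) {
    int someVar = 0;
    #pragma omp parallel for reduction(+:someVar)
    for (int i = 0; i < iterations; i++) {
        ++someVar;
    }
    return someVar;
}
```

With the reduction there is no synchronization inside the loop body at all, which is why it tends to scale better than either atomics or a critical section for this kind of accumulation.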
- When to use a mutex instead of #pragma omp critical?
IMO this should never be a consideration, first because, as pointed out by none other than Michael Klemm:
One thing that should be noted: "#pragma omp critical" can only interact with other "critical" constructs. You cannot mix OpenMP locks (lock API or "critical" constructs) with C++ locks like std::mutex. So, if there's code that is protected using std::mutex (or std::lock_guard on top), then other OpenMP code that should be mutually exclusive also needs to use std::mutex (and vice versa).
and furthermore, as Gilles pointed out (an opinion I share):
As a matter of principle, mixing two different parallelism models is a bad idea. So if you use OpenMP parallelism, avoid using the C++ one as interactions between the two might be unexpected.