Does using a lock have better performance than using a local (single application) semaphore?

lock(obj) is shorthand for Monitor.Enter(obj) paired with Monitor.Exit(obj) in a try/finally block. A lock is basically a unary semaphore. If you have N instances of the same resource, you use a semaphore initialized to N. A lock is mainly used to ensure that a code section is not executed by two threads at the same time.
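To make the comparison concrete, here is a minimal sketch of roughly what a lock statement expands to (the C# 4 and later pattern), next to a counting semaphore initialized to N. The class, field, and method names are made up for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LockExpansionDemo
{
    private static readonly object Gate = new object();
    private static int _counter;

    // Roughly what `lock (Gate) { _counter++; }` compiles to.
    public static int IncrementWithMonitor()
    {
        bool lockTaken = false;
        try
        {
            Monitor.Enter(Gate, ref lockTaken);
            return ++_counter;
        }
        finally
        {
            if (lockTaken) Monitor.Exit(Gate);
        }
    }

    // A semaphore initialized to n admits at most n holders at once.
    // Returns the highest number of simultaneous holders observed.
    public static int MaxConcurrentHolders(int n, int workers)
    {
        var stats = new object();
        int current = 0, max = 0;
        using (var pool = new SemaphoreSlim(n, n))
        {
            var tasks = new Task[workers];
            for (int i = 0; i < workers; i++)
            {
                tasks[i] = Task.Run(() =>
                {
                    pool.Wait();
                    try
                    {
                        lock (stats) { current++; if (current > max) max = current; }
                        Thread.Sleep(10); // pretend to use the resource
                        lock (stats) { current--; }
                    }
                    finally
                    {
                        pool.Release();
                    }
                });
            }
            Task.WaitAll(tasks);
        }
        return max;
    }

    public static void Main()
    {
        Console.WriteLine(IncrementWithMonitor());
        Console.WriteLine(MaxConcurrentHolders(2, 8)); // never more than 2
    }
}
```

One difference worth keeping in mind: Monitor is owned by a thread and is reentrant, while a semaphore has no owner and can be released by any thread.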

So a lock can be implemented using a semaphore with an initial value of 1. I would guess that Monitor.Enter performs better here, but I have no hard data on that; a test would help. Here is an SO thread that discusses performance.
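As a rough sketch of that idea, a SemaphoreSlim with an initial and maximum count of 1 can guard a critical section the way lock does. SemaphoreLock and its Run method are hypothetical names, not part of the framework.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: mutual exclusion built on a semaphore with count 1.
public sealed class SemaphoreLock : IDisposable
{
    private readonly SemaphoreSlim _semaphore = new SemaphoreSlim(1, 1);

    public void Run(Action criticalSection)
    {
        _semaphore.Wait();
        try
        {
            criticalSection();
        }
        finally
        {
            _semaphore.Release(); // always release, even if the body throws
        }
    }

    public void Dispose()
    {
        _semaphore.Dispose();
    }
}

public static class SemaphoreLockDemo
{
    public static int CountWithSemaphoreLock(int workers, int iterations)
    {
        int count = 0;
        using (var gate = new SemaphoreLock())
        {
            var tasks = new Task[workers];
            for (int w = 0; w < workers; w++)
            {
                tasks[w] = Task.Run(() =>
                {
                    for (int i = 0; i < iterations; i++)
                        gate.Run(() => count++);
                });
            }
            Task.WaitAll(tasks);
        }
        return count;
    }

    public static void Main()
    {
        Console.WriteLine(CountWithSemaphoreLock(4, 100000)); // prints 400000
    }
}
```

Unlike lock, this version is not reentrant: calling Run from inside Run on the same instance would deadlock.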

For your problem (producer/consumer), a blocking queue would be the solution. I suggest this very good SO thread.
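For illustration, a minimal producer/consumer sketch using BlockingCollection<T> from System.Collections.Concurrent could look like this. The bounded capacity of 100 and the summing consumer are arbitrary choices for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ProducerConsumerDemo
{
    // Produces 1..itemCount on the calling thread and sums them on a consumer task.
    public static int Run(int itemCount)
    {
        int sum = 0;
        using (var queue = new BlockingCollection<int>(boundedCapacity: 100))
        {
            var consumer = Task.Run(() =>
            {
                // Blocks while the queue is empty; ends after CompleteAdding().
                foreach (int item in queue.GetConsumingEnumerable())
                    sum += item;
            });

            for (int i = 1; i <= itemCount; i++)
                queue.Add(i); // blocks while the queue is full

            queue.CompleteAdding();
            consumer.Wait();
        }
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(Run(1000)); // prints 500500
    }
}
```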

Here is another good source of information about Reusable Parallel Data Structures.


TL;DR: I just ran my own benchmark, and in my setup lock ran almost twice as fast as SemaphoreSlim(1).

Specs:

  • .NET Core 2.1.5
  • Windows 10
  • 2 physical cores (4 logical) @2.5 GHz

The test:

I ran 2, 4, and 6 Tasks in parallel, each performing 1M iterations of acquiring the lock, doing a trivial operation, and releasing it. The code looks as follows:

await semaphoreSlim1.WaitAsync();
// other case: lock(obj) {...}

if(1 + 1 == 2)
{
    count++;
}        

semaphoreSlim1.Release();

Results:

In each case, lock ran almost twice as fast as SemaphoreSlim(1) (e.g. 205ms vs 390ms, using 6 parallel tasks).

Please note: I do not claim that lock is faster on every other setup.
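For anyone who wants to repeat the measurement, a harness along these lines should be close to the test described above. It is a reconstruction under assumptions, not the exact code that was run: the class and method names, the Stopwatch timing, and the try/finally around the critical section are my additions.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class LockVsSemaphoreBenchmark
{
    private static readonly object Gate = new object();
    private static readonly SemaphoreSlim Semaphore = new SemaphoreSlim(1, 1);
    private static int _count;

    public static int RunWithLock(int tasks, int iterations)
    {
        _count = 0;
        Task[] work = Enumerable.Range(0, tasks).Select(_ => Task.Run(() =>
        {
            for (int i = 0; i < iterations; i++)
            {
                lock (Gate) { _count++; }
            }
        })).ToArray();
        Task.WaitAll(work);
        return _count;
    }

    public static int RunWithSemaphore(int tasks, int iterations)
    {
        _count = 0;
        Task[] work = Enumerable.Range(0, tasks).Select(_ => Task.Run(async () =>
        {
            for (int i = 0; i < iterations; i++)
            {
                await Semaphore.WaitAsync();
                try { _count++; }
                finally { Semaphore.Release(); }
            }
        })).ToArray();
        Task.WaitAll(work);
        return _count;
    }

    public static void Main()
    {
        const int iterations = 1000000;
        foreach (int tasks in new[] { 2, 4, 6 })
        {
            var sw = Stopwatch.StartNew();
            RunWithLock(tasks, iterations);
            long lockMs = sw.ElapsedMilliseconds;

            sw.Restart();
            RunWithSemaphore(tasks, iterations);
            long semaphoreMs = sw.ElapsedMilliseconds;

            Console.WriteLine("{0} tasks: lock {1} ms, SemaphoreSlim {2} ms",
                tasks, lockMs, semaphoreMs);
        }
    }
}
```

Timings from a single Stopwatch run are noisy, so it is worth repeating each case a few times and discarding the first (JIT warm-up) run before drawing conclusions.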


In general: If your consumer thread processes each data item quickly, then the kernel-mode transition needed to block on the semaphore will incur a (possibly significant) bit of overhead relative to the work being done. In that case a user-mode wrapper which spins for a while before waiting on the semaphore will avoid some of that overhead.
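A spin-then-wait wrapper of that kind might be sketched as follows. The Acquire helper and the maxSpins default are assumptions for the example. Note that SemaphoreSlim already spins briefly on its own before blocking, so this mainly illustrates the pattern rather than something you would need to add on top of it.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SpinThenWaitDemo
{
    // Tries to take the semaphore without blocking for up to maxSpins attempts,
    // then falls back to a blocking wait.
    // Returns true if it was acquired during the spin phase.
    public static bool Acquire(SemaphoreSlim semaphore, int maxSpins = 100)
    {
        var spinner = new SpinWait();
        for (int i = 0; i < maxSpins; i++)
        {
            if (semaphore.Wait(0)) return true; // zero timeout: never blocks
            spinner.SpinOnce();
        }
        semaphore.Wait(); // give up spinning and block
        return false;
    }

    public static void Main()
    {
        using (var semaphore = new SemaphoreSlim(1, 1))
        {
            // Uncontended: acquired on the first attempt.
            Console.WriteLine(Acquire(semaphore));

            // Contended: another task releases the semaphore shortly afterwards.
            var releaser = Task.Run(() => { Thread.Sleep(50); semaphore.Release(); });
            Acquire(semaphore);
            releaser.Wait();
            Console.WriteLine(semaphore.CurrentCount);
        }
    }
}
```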

A monitor (with mutual exclusion + condition variable) may or may not implement spinning. That MSDN article's implementation didn't, so in this case there's no real difference in performance. Anyway, you're still going to have to lock in order to dequeue items, unless you're using a lock-free queue.
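As a minimal sketch of the monitor approach (mutual exclusion plus a condition variable), a blocking queue can be built from lock, Monitor.Wait, and Monitor.Pulse. MonitorQueue<T> is a hypothetical name, and this version does no spinning of its own.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public sealed class MonitorQueue<T>
{
    private readonly Queue<T> _items = new Queue<T>();
    private readonly object _gate = new object();

    public void Enqueue(T item)
    {
        lock (_gate)
        {
            _items.Enqueue(item);
            Monitor.Pulse(_gate); // wake one waiting consumer, if any
        }
    }

    public T Dequeue()
    {
        lock (_gate)
        {
            // Wait releases the lock while blocked and reacquires it before returning.
            // The loop re-checks the condition in case another consumer got there first.
            while (_items.Count == 0)
                Monitor.Wait(_gate);
            return _items.Dequeue();
        }
    }
}

public static class MonitorQueueDemo
{
    public static int SumThroughQueue(int itemCount)
    {
        var queue = new MonitorQueue<int>();
        var consumer = Task.Run(() =>
        {
            int sum = 0;
            for (int i = 0; i < itemCount; i++)
                sum += queue.Dequeue();
            return sum;
        });

        for (int i = 1; i <= itemCount; i++)
            queue.Enqueue(i);

        return consumer.Result;
    }

    public static void Main()
    {
        Console.WriteLine(SumThroughQueue(3)); // prints 6
    }
}
```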