Apple - How can I have 1805 threads when I only have 4 virtual CPUs?
Scheduling
Your 1,805 threads do not run simultaneously. They trade off. One core runs a bit of the thread, then sets it aside to execute a bit of another thread. The other cores do the same. Round and round, threads execute a little bit at a time, not all at once.
A major responsibility of the operating system (Darwin and macOS) is the scheduling of which thread is to be executed on which core for how long.
Many threads have no work to do, and so are left dormant and unscheduled. Likewise, many threads may be waiting on some resource such as data to be retrieved from storage, or a network connection to be completed, or data to be loaded from a database. With almost nothing to do but check the status of the awaited resource, such threads are scheduled quite briefly if at all.
The app programmer can assist this scheduling operation by sleeping her thread for a certain amount of time when she knows the wait for the external resource will take some time. And if running a “tight” loop that is CPU-intensive with no cause to wait on external resources, the programmer can insert a call to volunteer to be set aside briefly so as to not hog the core and thereby allow other threads to execute.
For more detail, see the Wikipedia page for multithreading.
Simultaneous Multi-Threading
As for your linked Question, threads there are indeed the same as here.
One issue there is the overhead cost of switching between threads when scheduled by the OS. There is a significant cost in time to unload the current thread’s instructions and data from the core and then load next scheduled thread ’s instructions and data. Part of the operating system’s job is to try to be smart in scheduling the threads as to optimize around this overhead cost.
Some CPU makers have developed technology to cut this time so as to make switching between a pair of threads much faster. Intel calls their technology Hyper-Threading. Known generically as simultaneous multi-threading (SMT).
While the pair of threads does not actually execute simultaneously, the switching is so smooth and fast that both threads appear to be virtually simultaneous. This works so well that the each core presents itself as a pair of virtual cores to the OS. So a SMT enabled CPU with four physical cores, for example, will present itself to the OS as an eight-core CPU.
Despite this optimization, there is still some overhead to switching between such virtual cores. Too many CPU-intensive threads all clamoring for execution time to be scheduled on a core can make the system inefficient, with no one thread getting much work done. Like three balls on a playground being shared between nine kids, versus sharing between nine hundred kids where no one kid really gets any serious playtime with a ball.
So there is an option in the CPU firmware where a sysadmin can throw a switch on the machine to disable SMT if she decides it would benefit her users running an app that is unusually CPU-bound with very few opportunities for pausing.
In such a case we come back to your original Question: In this special situation you would indeed want to constrain the operations to have no more of these hyper-active threads than you have physical cores. But let me repeat: this is an extremely unusual situation that might occur in something like a specialized scientific data-crunching project but would almost never apply to common business/corporate/enterprise scenarios.
In the old days - memory wasn't virtualized or protected and any code could write anywhere. In those days a one thread to one CPU design made sense. In the decades since that, memory first was protected and then virtualized. Think of threads as virtual cores - kind of a promise that at some time when your data and code was ready, that thread gets pushed (or scheduled as the PHD engineers and mathmeticians that do research on scheduling algorithms call it) onto an actual CPU to do actual work.
- https://people.eecs.berkeley.edu/%7Ercs/research/interactive_latency.html
Now - due to the magnitudes of difference in timing - the CPU and cache operate so fast compared to getting data from storage or the network - that thousands of threads could come and go while one thread is waiting for www.google.com to deliver a packet or two of data so that's the reason why you see so many more threads than actual CPU.
If you convert the thread operations which happen on the black / blue time scale and convert them to one second = 1 ns, things we care about are more like disk IO take 100 microseconds are like 4 days and a 200 ms internet round trip is a 20 year delay if you're counting seconds on the CPU time scale. Like many powers of ten exercises, in almost all cases - the CPU sits idle for "months" waiting for meaningful work from a very, very slow external world.
Nothing seems amiss in the image you posted so perhaps we're misunderstanding what you are getting at by wondering about threads.
If you right click (control click) on the word threads in the header row at top, add the status of the app and you will see most threads are likely idle, sleeping, not running at any given moment.