Why is a single thread spread across CPUs?
I think wierob has described the point fairly well.
Here is an older article discussing processor affinity settings with a quad-core QX6800 (the link points to the second page of that article).
If you do not force process affinity to a core, do you lose performance?
- While the Windows scheduler has to take affinity into account to avoid cache thrashing, the processor design itself also addresses this.
- The Intel QX6800 quad-core (since I referred to it earlier in this answer) has 8 MB of L2 cache in total (2 x 4 MB, each half shared by a pair of cores).
It should be noted that while you may have chosen to run just this one single-threaded process on the system, the OS itself would have several other tasks running which also need to be scheduled. The scheduler balances all this activity across the available processor pool (or cores).
Going forward, with the Nehalem architecture and NUMA, processors across multiple sockets will also be better able to deal with memory-access thrashing.
Here is a quick picture from an ArsTechnica page on NUMA.
If Nehalem and i7 interest you, I have some more links at this answer.
The scheduler just executes the next thread that is ready for execution on a "free" core/CPU.
You can assign a process to a specific CPU via the Windows task manager.
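The same restriction can also be applied from code instead of Task Manager. The following is only a minimal sketch: the helper names `affinity_mask` and `pin_current_process` are made up for illustration, and it assumes either Linux (where Python exposes `os.sched_setaffinity`) or Windows (where it calls the Win32 function `SetProcessAffinityMask` through `ctypes`).

```python
import ctypes
import os
import sys


def affinity_mask(cores):
    """Build an affinity bitmask: bit i set means core i is allowed."""
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError("core indices must be non-negative")
        mask |= 1 << core
    return mask


def pin_current_process(cores):
    """Restrict the current process to the given cores.

    Hypothetical helper: returns True on success and False on platforms
    this sketch does not handle. On Linux an OSError is raised if the
    requested cores are not available to the process.
    """
    if hasattr(os, "sched_setaffinity"):  # Linux
        os.sched_setaffinity(0, set(cores))  # pid 0 = calling process
        return True
    if sys.platform == "win32":
        kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
        kernel32.GetCurrentProcess.restype = ctypes.c_void_p
        kernel32.SetProcessAffinityMask.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
        kernel32.SetProcessAffinityMask.restype = ctypes.c_int
        handle = kernel32.GetCurrentProcess()
        return bool(kernel32.SetProcessAffinityMask(handle, affinity_mask(cores)))
    return False
```

Calling `pin_current_process([0])` would then correspond to ticking only "CPU 0" in the Task Manager affinity dialog; the mask for that case is `affinity_mask([0]) == 1`, and for cores 0 and 2 it is `5`.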
Having 4 cores at 25% means that 4 threads are executed simultaneously. Whereas, one core at x% means that only one thread is executed. So the former is more efficient in some cases.
But during its execution the cache of the CPU is filled with data accessed by the thread. So if the thread gets executed on another CPU, it will experience more cache misses, which are costly, since the data is not in the cache of this CPU.
What does your thread do? If the thread "sleeps" for a very short time, the core it was executed on before might be occupied by another thread, and thus your thread is executed on the next available core. What happens if you specify only one core to be used by your process (e.g. via Task Manager)?
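To make the "next available core" effect concrete, here is a toy simulation. It is not how the Windows scheduler actually works; it assumes a deliberately simple model in which, in time slice t, one background task occupies core t % num_cores, and our thread keeps its previous core if that core is free and otherwise moves to the next one. The function names are made up for this illustration.

```python
def simulate(num_cores, num_slices, pinned_core=None):
    """Return the core our thread ran on in each slice (None = had to wait)."""
    history = []
    last = 0  # core the thread ran on most recently
    for t in range(num_slices):
        busy = t % num_cores  # core taken by the background task (toy assumption)
        if pinned_core is not None:
            # With affinity set, the thread may only use its one core,
            # so it waits whenever that core is occupied.
            core = None if busy == pinned_core else pinned_core
        elif last != busy:
            core = last  # previous core is free: stay, cache stays warm
        else:
            core = (last + 1) % num_cores  # previous core taken: migrate
        history.append(core)
        if core is not None:
            last = core
    return history


def utilization(history, num_cores):
    """Percentage of all slices that our thread spent on each core."""
    counts = [0] * num_cores
    for core in history:
        if core is not None:
            counts[core] += 1
    return [100.0 * c / len(history) for c in counts]
```

In this model `simulate(4, 8)` gives `[1, 2, 3, 0, 1, 2, 3, 0]`, so `utilization(...)` reports 25% on each of the four cores even though only one thread is running. With `pinned_core=0` the thread never migrates but has to wait in two of the eight slices, giving 75% on core 0 and 0% elsewhere.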