Are GPU/CUDA cores SIMD ones?

CUDA "cores" can be thought of as SIMD lanes.

First, let's recall that the term "CUDA core" is NVIDIA marketing-speak. These are not cores in the same sense that a CPU has cores. Similarly, "CUDA threads" are not the same as the threads we know on CPUs.

The closest equivalent of a CPU core on a GPU is a streaming multiprocessor (SM): it has its own instruction schedulers/dispatchers, its own L1 cache, its own shared memory, etc. It is CUDA thread blocks, rather than warps, that are assigned to a GPU "core", i.e. to a streaming multiprocessor. Within an SM, warps are selected to have an instruction scheduled for the entire warp. From a CUDA perspective those are 32 separate threads which are instruction-locked; but that's really no different from saying that a warp is like a single thread which only executes 32-lane-wide SIMD instructions. Of course this isn't a perfect analogy, but I feel it's pretty sound.

One thing you don't quite / don't always have with CPU SIMD lanes is a mask of which lanes are actively executing: inactive lanes don't produce the effects of active lanes, such as setting register values or performing memory writes.
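To make the lane-masking point concrete, here's a minimal sketch (kernel and variable names are my own, not from the answer above) of a branch inside a single warp. Both sides of the branch are issued for the whole warp; lanes that don't take a side are masked off, so their writes simply don't happen.

```cuda
#include <cstdio>

// Hypothetical kernel: a branch that splits a 32-thread warp in half.
// The hardware issues both paths warp-wide, masking off inactive lanes,
// so each lane's store only takes effect on the path it actually took.
__global__ void branchy(int *out)
{
    int lane = threadIdx.x % 32;      // lane index within the warp
    if (lane < 16) {
        out[threadIdx.x] = 1;         // lanes 0-15 active, 16-31 masked
    } else {
        out[threadIdx.x] = 2;         // lanes 16-31 active, 0-15 masked
    }
}

int main()
{
    int *d_out, h_out[32];
    cudaMalloc(&d_out, 32 * sizeof(int));
    branchy<<<1, 32>>>(d_out);        // one block of exactly one warp
    cudaMemcpy(h_out, d_out, 32 * sizeof(int), cudaMemcpyDeviceToHost);
    for (int i = 0; i < 32; ++i) printf("%d ", h_out[i]);
    printf("\n");
    cudaFree(d_out);
    return 0;
}
```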

I hope this helps you make intuitive sense of things.


Each warp is assigned to only one core (is that true?).

No, it's not true. A warp is a logical assembly of 32 threads of execution. To execute a single instruction from a single warp, the warp scheduler must usually schedule 32 execution units (or "cores", although the definition of a "core" is somewhat loose).

Cores are in fact scalar processors, not vector processors. 32 cores (or execution units) are marshalled by the warp scheduler to execute a single instruction, across 32 threads, which is where the "SIMT" moniker comes from.
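As an illustration of that SIMT model (this is a generic sketch, not code from the answer), each thread runs the same scalar code on its own element, and the hardware issues each instruction once per 32-thread warp across 32 execution units:

```cuda
// Minimal SIMT sketch: scalar per-thread code, executed warp-wide.
__global__ void saxpy(float a, const float *x, float *y, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // per-thread element index
    if (i < n)                                      // out-of-range threads are masked off
        y[i] = a * x[i] + y[i];                     // scalar op, issued once per warp
}

// Example launch: saxpy<<<(n + 255) / 256, 256>>>(2.0f, d_x, d_y, n);
// each 256-thread block executes as 8 warps of 32 threads.
```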

Tags: cuda, gpgpu, simd, gpu