Difference between global and device functions

I will explain it with an example:

main()
{
    // Your main function. Executed by CPU
}

__global__ void calledFromCpuForGPU(...)
{
  //This function is called by CPU and suppose to be executed on GPU
}

__device__ void calledFromGPUforGPU(...)
{
  // This function is called by GPU and suppose to be executed on GPU
}

i.e. when we want a host(CPU) function to call a device(GPU) function, then 'global' is used. Read this: "https://code.google.com/p/stanford-cs193g-sp2010/wiki/TutorialGlobalFunctions"

And when we want a device(GPU) function (rather kernel) to call another kernel function we use 'device'. Read this "https://code.google.com/p/stanford-cs193g-sp2010/wiki/TutorialDeviceFunctions"

This should be enough to understand the difference.


  1. __global__ - Runs on the GPU, called from the CPU or the GPU*. Executed with <<<dim3>>> arguments.
  2. __device__ - Runs on the GPU, called from the GPU. Can be used with variabiles too.
  3. __host__ - Runs on the CPU, called from the CPU.

*) __global__ functions can be called from other __global__ functions starting
compute capability 3.5.


Differences between __device__ and __global__ functions are:

__device__ functions can be called only from the device, and it is executed only in the device.

__global__ functions can be called from the host, and it is executed in the device.

Therefore, you call __device__ functions from kernels functions, and you don't have to set the kernel settings. You can also "overload" a function, e.g : you can declare void foo(void) and __device__ foo (void), then one is executed on the host and can only be called from a host function. The other is executed on the device and can only be called from a device or kernel function.

You can also visit the following link: http://code.google.com/p/stanford-cs193g-sp2010/wiki/TutorialDeviceFunctions, it was useful for me.


Global functions are also called "kernels". It's the functions that you may call from the host side using CUDA kernel call semantics (<<<...>>>).

Device functions can only be called from other device or global functions. __device__ functions cannot be called from host code.

Tags:

Cuda