Disassemble an OpenCL kernel?

I know that this is an old question, but in case someone comes looking here for disassembling a AMD GPU kernel, you can do the following in linux:

export GPU_DUMP_DEVICE_KERNEL=3

This make any kernel that is compiled on your machine dump the assembled code to a file in the same directory.

Source: http://dis.unal.edu.co/~gjhernandezp/TOS/GPU/ATI_Stream_SDK_OpenCL_Programming_Guide.pdf

Sections 4.2.1 and 4.2.2


The simplest solution, in my experience, is to use clangs OpenCL C compiler and emit SPIR. It even works on Godbolt's compiler explorer: https://godbolt.org/z/_JbXPb

Clang can also emit ptx (https://godbolt.org/z/4ARMqM) and amdhsa (https://godbolt.org/z/TduTZQ), but it may not correspond to the ptx and amdhsa assembly generated by the respective driver at runtime.


If you're using NVIDIA's OpenCL implementation for their GPUs, you can do the followings to disassemble an OpenCL kernel:

  1. Use clGetEventProfilingInfo() to dump the ptx code to a file, say ptxfile.ptx. Please refer to the OpenCL specification to have more details on this function.

  2. Use nvcc to compile ptx to cubin file, for example: nvcc -cubin -arch=sm_20 ptxfile.ptx will compile ptxfile.ptx onto a compute capability 2.0 device.

  3. Use cuobjdump to disassemble the cubin file into GPU instructions. For example: cuobjdump -sass ptxfile.cubin

Hope this helps.