Disassemble an OpenCL kernel?
I know this is an old question, but in case someone comes here looking to disassemble an AMD GPU kernel, you can do the following on Linux:
export GPU_DUMP_DEVICE_KERNEL=3
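This makes the runtime dump the code of any kernel compiled by a program started from that shell to files in the current working directory. According to the guide below, the value 3 saves both the intermediate IL and the disassembled ISA.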
Source: http://dis.unal.edu.co/~gjhernandezp/TOS/GPU/ATI_Stream_SDK_OpenCL_Programming_Guide.pdf
Sections 4.2.1 and 4.2.2
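As a minimal sketch of how this fits into a session: the script below sets the variable, runs a program, and lists the files that appeared during the run. APP and its default ./my_opencl_app are hypothetical placeholders for your own OpenCL executable, and the names of the dumped files depend on the driver, so the script only lists whatever is new rather than assuming a naming scheme.

```shell
# Ask the AMD OpenCL runtime to dump kernel code (value taken from the answer above).
export GPU_DUMP_DEVICE_KERNEL=3

# APP is a hypothetical placeholder: point it at your own OpenCL program.
APP="${APP:-./my_opencl_app}"

if [ -x "$APP" ]; then
    # Marker file so we can tell which files were written during the run.
    marker="$(mktemp)"
    "$APP"
    # Show regular files in the current directory newer than the marker.
    find . -maxdepth 1 -type f -newer "$marker"
    rm -f "$marker"
else
    echo "No executable at $APP; set APP to your OpenCL program." >&2
fi
```

The variable only needs to be visible to the process that compiles the kernel, so prefixing a single command (GPU_DUMP_DEVICE_KERNEL=3 ./my_opencl_app) should work equally well.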
The simplest solution, in my experience, is to use Clang's OpenCL C compiler and emit SPIR. It even works on Godbolt's Compiler Explorer: https://godbolt.org/z/_JbXPb
Clang can also emit PTX (https://godbolt.org/z/4ARMqM) and AMD assembly for the amdhsa target (https://godbolt.org/z/TduTZQ), but the output may not match the PTX and amdhsa assembly that the respective driver generates at runtime.
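For running the same thing locally, here is a rough sketch of the command lines. The kernel vadd.cl, the helper emit, the output file names, the exact target triples, and the gfx900 CPU are my own assumptions and not necessarily what the Godbolt links use. Your Clang build also needs the NVPTX and AMDGPU backends enabled for the last two commands, and -nogpulib (which skips linking the ROCm device libraries) may not be accepted by older Clang versions.

```shell
# Write a tiny kernel to compile (hypothetical example kernel).
cat > vadd.cl <<'EOF'
__kernel void vadd(__global const float *a,
                   __global const float *b,
                   __global float *c)
{
    size_t i = get_global_id(0);
    c[i] = a[i] + b[i];
}
EOF

# emit <target-triple> <output-file> [extra clang flags...]
# Compiles vadd.cl for the given target and reports, without aborting, on failure.
emit() {
    target="$1"; out="$2"; shift 2
    clang -x cl -cl-std=CL1.2 -Xclang -finclude-default-header -O2 \
          --target="$target" "$@" -S vadd.cl -o "$out" \
        || echo "clang could not emit $out (is that backend built in?)" >&2
}

if command -v clang >/dev/null 2>&1; then
    # SPIR: textual LLVM IR for the 64-bit SPIR target.
    emit spir64-unknown-unknown vadd.spir.ll -emit-llvm
    # PTX for NVIDIA GPUs.
    emit nvptx64-unknown-nvcl vadd.ptx
    # AMD GCN assembly for the amdhsa target; gfx900 is an arbitrary choice.
    emit amdgcn-amd-amdhsa vadd.s -mcpu=gfx900 -nogpulib
else
    echo "clang not found; only vadd.cl was written." >&2
fi
```

Because this goes through Clang rather than the vendor driver, treat the result as an approximation of what runs on the device, for the reason given above.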
If you're using NVIDIA's OpenCL implementation for their GPUs, you can disassemble an OpenCL kernel as follows:
1. Use clGetProgramInfo() to dump the PTX code to a file, say ptxfile.ptx. On NVIDIA's implementation, the program "binary" returned for CL_PROGRAM_BINARIES is the PTX text. Please refer to the OpenCL specification for more details on this function.
2. Use nvcc to compile the PTX to a cubin file. For example:
nvcc -cubin -arch=sm_20 ptxfile.ptx
will compile ptxfile.ptx for a compute capability 2.0 device.
3. Use cuobjdump to disassemble the cubin file into GPU instructions. For example:
cuobjdump -sass ptxfile.cubin
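Steps 2 and 3 can be chained in a small script. This is only a sketch: it assumes the PTX from step 1 is already on disk, and the variable names PTX_FILE and SM_ARCH and the .sass output file are my own. SM_ARCH is overridable because CUDA 9 and later no longer accept sm_20, so on a current toolkit you would pass the architecture of your own GPU instead.

```shell
# Input PTX dumped in step 1, and the target architecture (both overridable).
PTX_FILE="${PTX_FILE:-ptxfile.ptx}"
SM_ARCH="${SM_ARCH:-sm_20}"

# Derive output names from the input: ptxfile.ptx -> ptxfile.cubin, ptxfile.sass
CUBIN="${PTX_FILE%.ptx}.cubin"
SASS="${PTX_FILE%.ptx}.sass"

if command -v nvcc >/dev/null 2>&1 && command -v cuobjdump >/dev/null 2>&1 \
   && [ -f "$PTX_FILE" ]; then
    # Assemble the PTX into a cubin, then disassemble it to SASS.
    nvcc -cubin -arch="$SM_ARCH" "$PTX_FILE" -o "$CUBIN" \
        && cuobjdump -sass "$CUBIN" > "$SASS" \
        && echo "SASS written to $SASS"
else
    echo "Need nvcc, cuobjdump and $PTX_FILE; nothing done." >&2
fi
```

For example, SM_ARCH=sm_70 PTX_FILE=mykernel.ptx sh disasm.sh would target a compute capability 7.0 device, where disasm.sh is whatever name you save the script under.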
Hope this helps.