How to debug the Linux kernel with GDB and QEMU?

I'd try:

(gdb) target remote localhost:1234
(gdb) continue

Using the '-s' option makes qemu listen on port tcp::1234, which you can connect to as localhost:1234 if you are on the same machine. Qemu's '-S' option makes Qemu stop execution until you give the continue command.

Best thing would probably be to have a look at a decent GDB tutorial to get along with what you are doing. This one looks quite nice.


Step-by-step procedure tested on Ubuntu 16.10 host

To get started from scratch quickly I've made a minimal fully automated QEMU + Buildroot example at: https://github.com/cirosantilli/linux-kernel-module-cheat/blob/c7bbc6029af7f4fab0a23a380d1607df0b2a3701/gdb-step-debugging.md Major steps are covered below.

First get a root filesystem rootfs.cpio.gz. If you need one, consider:

  • a minimal init-only executable image: https://unix.stackexchange.com/questions/122717/custom-linux-distro-that-runs-just-one-program-nothing-else/238579#238579
  • a Busybox interactive system: https://unix.stackexchange.com/questions/2692/what-is-the-smallest-possible-linux-implementation/203902#203902

Then on the Linux kernel:

git checkout v4.15
make mrproper
make x86_64_defconfig
cat <<EOF >.config-fragment
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_KERNEL=y
CONFIG_GDB_SCRIPTS=y
EOF
./scripts/kconfig/merge_config.sh .config .config-fragment
make -j"$(nproc)"
qemu-system-x86_64 -kernel arch/x86/boot/bzImage \
                   -initrd rootfs.cpio.gz -S -s \
                   -append nokaslr

On another terminal, from inside the Linux kernel tree, supposing you want to start debugging from start_kernel:

gdb \
    -ex "add-auto-load-safe-path $(pwd)" \
    -ex "file vmlinux" \
    -ex 'set arch i386:x86-64:intel' \
    -ex 'target remote localhost:1234' \
    -ex 'break start_kernel' \
    -ex 'continue' \
    -ex 'disconnect' \
    -ex 'set arch i386:x86-64' \
    -ex 'target remote localhost:1234'

and we are done!!

For kernel modules see: How to debug Linux kernel modules with QEMU?

For Ubuntu 14.04, GDB 7.7.1, hbreak was needed, break software breakpoints were ignored. Not the case anymore in 16.10. See also: https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/901944

The messy disconnect and what come after it are to work around the error:

Remote 'g' packet reply is too long: 000000000000000017d11000008ef4810120008000000000fdfb8b07000000000d352828000000004040010000000000903fe081ffffffff883fe081ffffffff00000000000e0000ffffffffffe0ffffffffffff07ffffffffffffffff9fffff17d11000008ef4810000000000800000fffffffff8ffffffffff0000ffffffff2ddbf481ffffffff4600000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000007f0300000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000801f0000

Related threads:

  • https://sourceware.org/bugzilla/show_bug.cgi?id=13984 might be a GDB bug
  • Remote 'g' packet reply is too long
  • http://wiki.osdev.org/QEMU_and_GDB_in_long_mode osdev.org is as usual an awesome source for these problems
  • https://lists.nongnu.org/archive/html/qemu-discuss/2014-10/msg00069.html
  • nokaslr: https://unix.stackexchange.com/questions/397939/turning-off-kaslr-to-debug-linux-kernel-using-qemu-and-gdb/421287#421287

Known limitations:

  • the Linux kernel does not support (and does not even compile without patches) with -O0: How to de-optimize the Linux kernel to and compile it with -O0?
  • GDB 7.11 will blow your memory on some types of tab completion, even after the max-completions fix: Tab completion interrupt for large binaries Likely some corner case which was not covered in that patch. So an ulimit -Sv 500000 is a wise action before debugging. Blew up specifically when I tab completed file<tab> for the filename argument of sys_execve as in: https://stackoverflow.com/a/42290593/895245

See also:

  • https://github.com/torvalds/linux/blob/v4.9/Documentation/dev-tools/gdb-kernel-debugging.rst official Linux kernel "documentation"
  • Linux kernel live debugging, how it's done and what tools are used?

BjoernID's answer did not really work for me. After the first continuation, no breakpoint is reached and on interrupt, I would see lines such as:

0x0000000000000000 in ?? ()
(gdb) break rapl_pmu_init
Breakpoint 1 at 0xffffffff816631e7
(gdb) c
Continuing.
^CRemote 'g' packet reply is too long: 08793000000000002988d582000000002019[..]

I guess this has something to do with different CPU modes (real mode in BIOS vs. long mode when Linux has booted). Anyway, the solution is to run QEMU first without waiting (i.e. without -S):

qemu-system-x86_64 -enable-kvm -kernel arch/x86/boot/bzImage -cpu SandyBridge -s

In my case, I needed to break at something during boot, so after some deciseconds, I ran the gdb command. If you have more time (e.g. you need to debug a module that is loaded manually), then the timing doesn't really matter.

gdb allows you to specify commands that should be run when started. This makes automation a bit easier. To connect to QEMU (which should now already be started), break on a function and continue execution, use:

gdb -ex 'target remote localhost:1234' -ex 'break rapl_pmu_init' -ex c ./vmlinux