Why is kworker consuming so many resources on Linux 3.0.0-12-server?
I found this thread on lkml that answers your question a little. (It seems even Linus himself was puzzled as to how to find out the origin of those threads.)
Basically, there are two ways of doing this:
$ echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event
$ cat /sys/kernel/debug/tracing/trace_pipe > out.txt
(wait a few secs)
For this you will need ftrace to be compiled in your kernel, and to enable it with:
mount -t debugfs nodev /sys/kernel/debug
More information on the function tracer facilities of Linux is available in the ftrace.txt documentation.
This will output what threads are all doing, and is useful for tracing multiple small jobs.
cat /proc/THE_OFFENDING_KWORKER/stack
This will output the stack of a single thread doing a lot of work. It may allow you to find out what caused this specific thread to hog the CPU (for example). THE_OFFENDING_KWORKER
is the pid of the kworker in the process list.
The solution is: I don't know how to find the cause. Nobody told me so far.
But talking with the BTRFS developers revealed a bug in the btrfs drivers when writing many many small files in a very short time period. This is an issue on kernels from 3.0 upto 3.1. Maybe it gets fixed in 3.2.
In the meantime I got a patch for the current kernel that solved the problem.
I had a similar issue; looking at the kworker's thread stack:
while true ; do clear ; grep -n ^ /proc/24910/stack | sort -rn | cut -d: -f 2- ; sleep 1 ; done
[<ffffffffffffffff>] 0xffffffffffffffff
[<ffffffff810908f0>] kthread+0x0/0xe0
[<ffffffff81576432>] ret_from_fork+0x42/0x70
[<ffffffff810908f0>] kthread+0x0/0xe0
[<ffffffff810909b1>] kthread+0xc1/0xe0
[<ffffffff8108b520>] worker_thread+0x0/0x550
[<ffffffff8108b573>] worker_thread+0x53/0x550
[<ffffffff8108aa4b>] process_one_work+0x14b/0x420
[<ffffffff81405a3d>] rpm_idle+0x1ad/0x220
[<ffffffff8140509d>] __rpm_callback+0x2d/0xb0
[<ffffffffa01aef16>] usb_runtime_idle+0x26/0x30 [usbcore]
[<ffffffffa01aeef0>] usb_runtime_idle+0x0/0x30 [usbcore]
[<ffffffff8140686c>] __pm_runtime_suspend+0x5c/0x90
[<ffffffff81405b19>] __pm_runtime_idle+0x69/0x90
[<ffffffff81405295>] rpm_suspend+0x105/0x620
[<ffffffff8140513f>] rpm_callback+0x1f/0x70
[<ffffffff8140509d>] __rpm_callback+0x2d/0xb0
[<ffffffffa01aee50>] usb_runtime_suspend+0x0/0x80 [usbcore]
[<ffffffffa01aee7e>] usb_runtime_suspend+0x2e/0x80 [usbcore]
[<ffffffffa01adc4f>] usb_suspend_both+0xef/0x1f0 [usbcore]
[<ffffffffa01adb06>] usb_resume_interface.isra.6+0xa6/0x100 [usbcore]
[<ffffffffa01a0c63>] hub_resume+0x23/0x60 [usbcore]
[<ffffffffa01a0636>] hub_activate+0xc6/0x5c0 [usbcore]
[<ffffffffa01a9d3f>] usb_kill_urb+0x3f/0xa0 [usbcore]
[<ffffffffa019d249>] hub_port_status+0xd9/0x120 [usbcore]
[<ffffffff81088a4f>] __queue_work+0x12f/0x340
[<ffffffff810888b6>] insert_work+0x46/0xb0
[<ffffffffa01aa6d4>] usb_control_msg+0xc4/0x110 [usbcore]
[<ffffffffa01aa55a>] usb_start_wait_urb+0x9a/0x150 [usbcore]
[<ffffffff810a36f7>] update_curr+0xd7/0x120
I figured it's the usb modules. I had previously on another machine blithely rmmod'd all usb and [uex]hci modules are realised that I had turned off the keyboard (not even ctrl-shift-sysrq-U !), so I ended up doing this:
MODS="uvcvideo ohci_hcd ehci_hcd xhci_hcd ohci_pci ehci_pci xhci_pci usbcore"
( echo $MODS $MODS | xargs -n 1 rmmod -v ; sleep 3 ; echo $MODS | xargs -n 1 modprobe -v ; )
root@hp:~# ( echo $MODS $MODS | xargs -n 1 rmmod -v ; sleep 3 ; echo $MODS | xargs -n 1 modprobe -v ; )
rmmod: ERROR: Module ohci_hcd is in use by: ohci_pci
rmmod: ERROR: Module ehci_hcd is in use by: ehci_pci
rmmod: ERROR: Module xhci_hcd is in use by: xhci_pci
rmmod: ERROR: Module uvcvideo is not currently loaded
rmmod: ERROR: Module ohci_pci is not currently loaded
rmmod: ERROR: Module ehci_pci is not currently loaded
rmmod: ERROR: Module xhci_pci is not currently loaded
insmod /lib/modules/4.1.0-2-amd64/kernel/drivers/media/usb/uvc/uvcvideo.ko
insmod /lib/modules/4.1.0-2-amd64/kernel/drivers/usb/host/ehci-hcd.ko
insmod /lib/modules/4.1.0-2-amd64/kernel/drivers/usb/host/ohci-hcd.ko
insmod /lib/modules/4.1.0-2-amd64/kernel/drivers/usb/host/xhci-hcd.ko
insmod /lib/modules/4.1.0-2-amd64/kernel/drivers/usb/host/ehci-pci.ko
insmod /lib/modules/4.1.0-2-amd64/kernel/drivers/usb/host/ohci-pci.ko
insmod /lib/modules/4.1.0-2-amd64/kernel/drivers/usb/host/xhci-pci.ko
did the trick:
grep -n ^ /proc/24910/stack | sort -rn | cut -d: -f 2-
[<ffffffffffffffff>] 0xffffffffffffffff
[<ffffffff810908f0>] kthread+0x0/0xe0
[<ffffffff81576432>] ret_from_fork+0x42/0x70
[<ffffffff810908f0>] kthread+0x0/0xe0
[<ffffffff810909b1>] kthread+0xc1/0xe0
[<ffffffff8108b520>] worker_thread+0x0/0x550
[<ffffffff8108b5ec>] worker_thread+0xcc/0x550
So my chief suspect is this gadget: RTL8723B* WIFI+Bluetooth Module. I'm wondering now if the power-management code realises it's the same device if it tries to e.g. powerdown an unused BT adapter.
context:
root@hp:~# lsusb
Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 002 Device 002: ID 0c45:651b Microdia
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 003 Device 002: ID 0bda:b001 Realtek Semiconductor Corp.
Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 009 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 008 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 007 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 006 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
root@hp:~# lsmod | grep usb
btusb 45056 0
btbcm 16384 1 btusb
btintel 16384 1 btusb
bluetooth 438272 5 bnep,btbcm,btusb,btintel
usbcore 200704 8 btusb,uvcvideo,ohci_hcd,ohci_pci,ehci_hcd,ehci_pci,xhci_hcd,xhci_pci
usb_common 16384 1 usbcore
root@hp:~# lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux stable-updates (sid)
Release: stable-updates
Codename: sid
root@hp:~# uname -a
Linux hp 4.1.0-2-amd64 #1 SMP Debian 4.1.6-1 (2015-08-23) x86_64 GNU/Linux
root@hp:~# dmesg | tail -n 20
[97865.088740] usb 2-4: SerialNumber: HP Webcam
[97865.091557] uvcvideo: Found UVC 1.00 device HP Webcam (0c45:651b)
[97865.105948] input: HP Webcam as /devices/pci0000:00/0000:00:13.2/usb2/2-4/2-4:1.0/input/input17
[97865.189817] usb 3-3: new full-speed USB device number 2 using ohci-pci
[97865.350981] usb 3-3: No LPM exit latency info found, disabling LPM.
[97865.368958] usb 3-3: New USB device found, idVendor=0bda, idProduct=b001
[97865.368969] usb 3-3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[97865.368976] usb 3-3: Product: Bluetooth Radio
[97865.368981] usb 3-3: Manufacturer: Realtek
[97865.368985] usb 3-3: SerialNumber: 00e04c000001
[97865.375859] Bluetooth: hci0: rtl: examining hci_ver=06 hci_rev=000b lmp_ver=06 lmp_subver=8723
[97865.375867] Bluetooth: hci0: rtl: loading rtl_bt/rtl8723b_fw.bin
[97865.375896] usb 3-3: firmware: failed to load rtl_bt/rtl8723b_fw.bin (-2)
[97865.375902] usb 3-3: Direct firmware load for rtl_bt/rtl8723b_fw.bin failed with error -2
[97865.375907] Bluetooth: hci0: Failed to load rtl_bt/rtl8723b_fw.bin
[97865.397812] Bluetooth: hci0: rtl: examining hci_ver=06 hci_rev=000b lmp_ver=06 lmp_subver=8723
[97865.397821] Bluetooth: hci0: rtl: loading rtl_bt/rtl8723b_fw.bin
[97865.397850] usb 3-3: firmware: failed to load rtl_bt/rtl8723b_fw.bin (-2)
[97865.397856] usb 3-3: Direct firmware load for rtl_bt/rtl8723b_fw.bin failed with error -2
[97865.397861] Bluetooth: hci0: Failed to load rtl_bt/rtl8723b_fw.bin