XIO: fatal IO error 11 caused by 32-bit libxcb

I just had a program that behaved exactly like this, with exactly the same error message. If the counter rollover were the cause, I would expect the program to process 2^32 events before crashing.

The program was structured so that a worker thread had its own X connection, separate from the X thread's, which it used to send messages to the X thread to update the window.

In the end I traced the problem to the function that sends redraw events to the window. It was being called from multiple threads without a mutex, and since X calls on the same X connection are not re-entrant, the program crashed with this exact error. I put a mutex around the function and have had no problems since.
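For anyone wanting to try the same fix, here is a minimal sketch of the pattern, not my actual code. The names `x_lock`, `send_redraw_request` and `run_workers` are made up for illustration, and the counter increment is only a placeholder for the real Xlib calls (for example `XSendEvent` followed by `XFlush` on the shared `Display`), so the snippet builds without X11.

```c
#include <pthread.h>

/* One lock guarding every use of the shared X connection (hypothetical name). */
static pthread_mutex_t x_lock = PTHREAD_MUTEX_INITIALIZER;

/* Placeholder for the state behind the shared connection. */
static long requests_sent = 0;

/* Hypothetical stand-in for the function that sends a redraw event.
   In a real program the body would contain the Xlib calls on the
   shared Display, e.g. XSendEvent() followed by XFlush(). */
void send_redraw_request(void) {
    pthread_mutex_lock(&x_lock);
    requests_sent++;                 /* placeholder for the Xlib calls */
    pthread_mutex_unlock(&x_lock);
}

/* Worker thread: calls the protected function a given number of times. */
static void *worker(void *arg) {
    int iterations = *(int *)arg;
    for (int i = 0; i < iterations; i++)
        send_redraw_request();
    return NULL;
}

/* Starts up to 16 workers, waits for them, and returns the total number
   of requests recorded. With the mutex in place no update is lost. */
long run_workers(int nthreads, int iterations) {
    pthread_t threads[16];
    if (nthreads > 16)
        nthreads = 16;
    requests_sent = 0;
    for (int i = 0; i < nthreads; i++)
        pthread_create(&threads[i], NULL, worker, &iterations);
    for (int i = 0; i < nthreads; i++)
        pthread_join(threads[i], NULL);
    return requests_sent;
}
```

Calling `run_workers(4, 10000)` should return 40000 every time. The point is that every thread goes through the same lock before touching the connection, so only one of them is inside the X calls at any moment.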


OK, I finally found the cause (thanks to someone at National Instruments), a better diagnostic and a workaround.

The bug is in many versions of libxcb and is a 32-bit counter rollover problem that has been known for a few years: https://bugs.freedesktop.org/show_bug.cgi?id=71338
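To illustrate what a 32-bit counter rollover looks like in general, here is a small sketch. It is not libxcb's code; the helper names `advance32`, `advance64` and `naive_is_newer` are hypothetical and only show the arithmetic.

```c
#include <stdint.h>

/* Advance a 32-bit counter by n steps. Unsigned arithmetic wraps
   modulo 2^32, so the result starts again from 0 after UINT32_MAX. */
uint32_t advance32(uint32_t counter, uint64_t n) {
    return (uint32_t)(counter + n);
}

/* The same operation on a 64-bit counter, which does not wrap at 2^32. */
uint64_t advance64(uint64_t counter, uint64_t n) {
    return counter + n;
}

/* Naive "is a more recent than b" comparison. It gives the wrong answer
   once the 32-bit counter has wrapped, because the newer value is smaller. */
int naive_is_newer(uint32_t a, uint32_t b) {
    return a > b;
}
```

Here `advance32(UINT32_MAX, 1)` is 0 while `advance64(UINT32_MAX, 1)` is 4294967296, and `naive_is_newer(0, UINT32_MAX)` is false even though 0 is the later value. Any code that compares such counters without allowing for the wrap can misbehave after 2^32 steps.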

Not all versions of libxcb are affected: libxcb-1.9-5 has it, libxcb-1.5-1 doesn't. According to the bug report, 64-bit OSes shouldn't be affected, but I managed to trigger it on at least one version.

Which brings me to a better diagnostic. The following program will crash in less than 15 minutes on affected libraries (better than the entire week it previously took):

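// Compile with: gcc test.c -lX11 && time ./a.out
#include <X11/Xlib.h>

int main(void) {
    Display *d = XOpenDisplay(NULL);  // connect to the display named by $DISPLAY
    if (!d)
        return 1;                     // could not open the display
    for (;;)
        XNoOp(d);                     // issue no-op requests as fast as possible
}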

And one final thing: the above program, compiled and run on a 64-bit system, works fine. Compiled and run on an old 32-bit system, it also works fine. But if I transfer the 32-bit binary to the 64-bit system, it crashes after a few minutes.
