How can I monitor what's being put into the standard out buffer and break when a specific string is deposited in the pipe?

This question might be a good starting point: how can I put a breakpoint on "something is printed to the terminal" in gdb?

So you could at least break whenever something is written to stdout. The method basically involves setting a breakpoint on the write syscall with a condition that the first argument is 1 (i.e. STDOUT). In the comments, there is also a hint as to how you could inspect the string parameter of the write call as well.

x86 32-bit mode

I came up with the following and tested it with gdb 7.0.1-debian. It seems to work quite well. $esp + 8 contains a pointer to the memory location of the string passed to write, so you first cast $esp + 8 to a pointer to int, dereference it to read the address stored there, and then cast that value to a pointer to char. $esp + 4 contains the file descriptor to write to (1 for STDOUT).

(gdb) break write if 1 == *(int*)($esp + 4) && strcmp((char*)*(int*)($esp + 8), "your string") == 0

x86 64-bit mode

If your process is running in x86-64 mode, then the parameters are passed through the scratch registers %rdi and %rsi:

(gdb) break write if 1 == $rdi && strcmp((char*)($rsi), "your string") == 0

Note that one level of indirection is removed since we're using scratch registers rather than variables on the stack.
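
To make the indirection concrete, here is a small C model of what sits at $esp when a 32-bit cdecl write is entered. The struct and the stack_word helper are hypothetical names for illustration only; they mimic what the gdb expression *(int*)($esp + N) reads and are not part of gdb or glibc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Model of the 32-bit stack at the first instruction of write() (cdecl).
 * Each slot is 4 bytes wide, so the offsets match the gdb conditions above. */
struct write_entry_stack32 {
    uint32_t return_address; /* at $esp                                   */
    uint32_t fd;             /* at $esp + 4                               */
    uint32_t buf;            /* at $esp + 8: 32-bit address of the bytes  */
    uint32_t count;          /* at $esp + 12                              */
};

/* Read one 32-bit word at esp + offset, like *(int*)($esp + offset) in gdb. */
uint32_t stack_word(const unsigned char *esp, size_t offset)
{
    uint32_t value;

    assert(esp != NULL);
    memcpy(&value, esp + offset, sizeof value);
    return value;
}
```

On x86-64 there is no such memory read for the first arguments: $rdi and $rsi already hold the file descriptor and the buffer address.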

Variants

Functions other than strcmp can be used in the above snippets:

  • strncmp is useful if you want to match the first n characters of the string being written
  • strstr can be used to find matches within a string, since you can't always be certain that the string you're looking for is at the beginning of the string being written through the write function.
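
One caveat with all three functions: they expect NUL-terminated strings, while write receives a byte count and its buffer need not be terminated. The following is a minimal sketch of a length-bounded search that respects the count. buffer_contains is a hypothetical helper, not a libc function; to call it from a breakpoint condition it would have to be compiled into the program being debugged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the first `count` bytes of `buf` contain `needle`, else 0.
 * Unlike strstr, this never reads past `count`, so it is safe for write()
 * buffers that carry no terminating NUL. */
int buffer_contains(const void *buf, size_t count, const char *needle)
{
    const char *p = buf;
    size_t n;

    assert(buf != NULL && needle != NULL);
    n = strlen(needle);
    if (n == 0)
        return 1;               /* the empty string matches anywhere */
    if (n > count)
        return 0;               /* needle cannot fit in the buffer */
    for (size_t i = 0; i + n <= count; i++) {
        if (memcmp(p + i, needle, n) == 0)
            return 1;
    }
    return 0;
}
```

On x86-64 the count is the third argument of write, so it is available in $rdx at the breakpoint.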

Edit: I enjoyed this question and finding its subsequent answer. I decided to do a blog post about it.


catch + strstr condition

The cool thing about this method is that it does not depend on glibc write being used: it traces the actual system call.

Furthermore, it is more resilient to printf() buffering, as it might even catch strings that are printed across multiple printf() calls.

x86_64 version:

define stdout
    catch syscall write
    commands
        printf "rsi = %s\n", $rsi
        bt
    end
    condition $bpnum $rdi == 1 && strstr((char *)$rsi, "$arg0") != NULL
end
stdout qwer

Test program:

#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main() {
    write(STDOUT_FILENO, "asdf1", 5);
    write(STDOUT_FILENO, "qwer1", 5);
    write(STDOUT_FILENO, "zxcv1", 5);
    write(STDOUT_FILENO, "qwer2", 5);
    printf("as");
    printf("df");
    printf("qw");
    printf("er");
    printf("zx");
    printf("cv");
    fflush(stdout);
    return EXIT_SUCCESS;
}

Outcome: breaks at:

  • qwer1
  • qwer2
  • fflush. The previous printf calls didn't actually print anything; they were buffered! The write syscall only happened on the fflush.

Notes:

  • $bpnum thanks to Tromey at: https://sourceware.org/bugzilla/show_bug.cgi?id=18727
  • rdi: first argument of the syscall; for write it is the file descriptor, and 1 is stdout
  • rsi: second argument of the syscall; for write it points to the buffer
  • strstr: standard C function, searches for a substring and returns NULL if none is found

Tested in Ubuntu 17.10, gdb 8.0.1.
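
To illustrate the point about not depending on the glibc write wrapper, here is a minimal Linux-only sketch of a program path that reaches the kernel without ever calling the write symbol. raw_write is a hypothetical name; syscall() and SYS_write come from glibc and the kernel headers.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Issue the write system call directly, bypassing the glibc write() wrapper.
 * A breakpoint on the `write` symbol does not see this call, whereas
 * `catch syscall write` stops on it. */
long raw_write(int fd, const void *buf, size_t count)
{
    assert(buf != NULL || count == 0);
    return syscall(SYS_write, fd, buf, count);
}
```

A call such as raw_write(STDOUT_FILENO, "qwer3", 5) added to the test program should be caught by the stdout command above, but not by a conditional breakpoint on write.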

strace

Another option if you are feeling interactive:

setarch "$(uname -m)" -R strace -i ./stdout.out |& grep '\] write'

Sample output:

[00007ffff7b00870] write(1, "a\nb\n", 4a

Now copy that address and paste it into:

setarch "$(uname -m)" -R strace -i ./stdout.out |& grep -E '\] write\(1, "a'

The advantage of this method is that you can use the usual UNIX tools to manipulate strace output, and it does not require deep GDB-fu.

Explanation:

  • -i makes strace output RIP
  • setarch -R disables ASLR for the process via a personality system call (see: How to debug with strace -i when everytime address is different). GDB already does that by default, so there is no need to do it again there.

Anthony's answer is awesome. Following his answer, I tried out another solution on Windows (x86-64). I know this question is about GDB on Linux, but I think this solution is a useful supplement, and it might be helpful for others.

Solution on Windows

On Linux, a call to printf results in a call to the write API, and because Linux is open source, we can debug inside that API. Windows is different: it provides its own API, WriteFile. Because Windows is a commercial, closed-source OS, we cannot easily set source-level breakpoints inside its APIs.

However, part of the Visual C++ runtime source code ships with Visual Studio, so we can find the place in that source where WriteFile is finally called and set a breakpoint there. After debugging the sample code, I found that printf ends up calling _write_nolock, which in turn calls WriteFile. The function is located in:

your_VS_folder\VC\crt\src\write.c

The prototype is:

/* now define version that doesn't lock/unlock, validate fh */
int __cdecl _write_nolock (
        int fh,
        const void *buf,
        unsigned cnt
        )

Compared to the write API on Linux:

#include <unistd.h>

ssize_t write(int fd, const void *buf, size_t count); 

They take exactly the same parameters, so we can set a conditional breakpoint in _write_nolock by following the solutions above, with only a few differences in detail.

Portable Solution for Both Win32 and x64

Fortunately, Visual Studio lets us use parameter names directly when setting a breakpoint condition, on both Win32 and x64, so the condition is easy to write:

  1. Add a breakpoint in _write_nolock.

    NOTICE: There is a small difference between Win32 and x64. On Win32 we can simply use the function name as the breakpoint location. That does not work on x64, because at the function's entry point the parameters are not yet initialized, so the parameter names cannot be used in the breakpoint condition.

    Fortunately there is a workaround: set the breakpoint at a location inside the function rather than on the function name, e.g. the first line of the function, where the parameters are already initialized. (That is, use filename + line number to set the breakpoint, or open the file and set a breakpoint on the first line of the function body rather than on its entry.)

  2. Restrict the condition:

    fh == 1 && strstr((char *)buf, "Hello World") != 0
    

NOTICE: There is still a problem here. I tested two different ways of writing to stdout: printf and std::cout. printf passes the whole string to _write_nolock at once. std::cout, however, passes it one character at a time, which means _write_nolock is called strlen("your string") times. In that case the condition is never satisfied.
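
To match a string that arrives in pieces, the comparison has to remember the tail of earlier calls. The following is a minimal sketch of such a matcher, assuming it is compiled into the program being debugged (for example, called from a logging hook). It is not something the debugger condition can do by itself, and all names in it are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NEEDLE_MAX 64

/* Keeps the last needle_len bytes seen across successive chunks. */
struct stream_matcher {
    char needle[NEEDLE_MAX];
    char window[NEEDLE_MAX];
    size_t needle_len;
    size_t filled;
};

void matcher_init(struct stream_matcher *m, const char *needle)
{
    size_t n = strlen(needle);

    assert(n > 0 && n <= NEEDLE_MAX);
    memcpy(m->needle, needle, n);
    m->needle_len = n;
    m->filled = 0;
}

/* Feed one chunk (one write call). Returns 1 if the needle was completed
 * inside this chunk, possibly starting in an earlier one; otherwise 0. */
int matcher_feed(struct stream_matcher *m, const void *buf, size_t count)
{
    const char *p = buf;
    int found = 0;

    for (size_t i = 0; i < count; i++) {
        if (m->filled == m->needle_len) {
            /* Window is full: drop its oldest byte. */
            memmove(m->window, m->window + 1, m->needle_len - 1);
            m->filled--;
        }
        m->window[m->filled++] = p[i];
        if (m->filled == m->needle_len &&
            memcmp(m->window, m->needle, m->needle_len) == 0)
            found = 1;
    }
    return found;
}
```

With this in place, a breakpoint on the line that observes a return value of 1 fires on the call that delivers the last character, even when std::cout sends one character per call.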

Win32 Solution

Of course, we could also use the method Anthony provided and write the breakpoint condition in terms of registers.

For a Win32 program, the solution is almost the same as with GDB on Linux. You might notice the __cdecl specifier in the prototype of _write_nolock. This calling convention means:

  • Argument-passing order is Right to left.
  • Calling function pops the arguments from the stack.
  • Name-decoration convention: Underscore character (_) is prefixed to names.
  • No case translation performed.

There is a description here. And there is an example which is used to show the registers and stacks on Microsoft's website. The result could be found here.

Then it is very easy to set the condition of breakpoints:

  1. Set a breakpoint in _write_nolock.
  2. Restrict the condition:

    *(int *)($esp + 4) == 1 && strstr(*(char **)($esp + 8), "Hello") != 0
    

This is the same method as on Linux. The first condition makes sure the string is written to stdout; the second matches the specified string.

x64 Solution

Two important changes from x86 to x64 are the 64-bit addressing capability and a flat set of 16 64-bit general-purpose registers. With more registers available, x64 uses only a __fastcall-style calling convention: the first four integer arguments are passed in registers, and arguments five and higher are passed on the stack.

You could refer to the Parameter Passing page on Microsoft's website. The four registers (in order left to right) are RCX, RDX, R8 and R9. So it is very easy to restrict the condition:

  1. Set a breakpoint in _write_nolock.

    NOTICE: Unlike in the portable solution above, the breakpoint can be placed on the function itself rather than on its first line, because the registers are already initialized at the function's entry.

  2. Restrict condition:

    $rcx == 1 && strstr((char *)$rdx, "Hello") != 0
    

The reason we need a cast and a dereference with esp is that $esp accesses the ESP register, which for all intents and purposes is a void*. The registers used here, by contrast, hold the parameter values directly, so the extra level of indirection is no longer needed.

Post

I also enjoyed this question very much, so I translated Anthony's post into Chinese and added my answer to it as a supplement. The post can be found here. Thanks to @anthony-arnold for his permission.