How can I monitor what's being put into the standard out buffer and break when a specific string is deposited in the pipe?
This question might be a good starting point: how can I put a breakpoint on "something is printed to the terminal" in gdb?
So you could at least break whenever something is written to stdout. The method basically involves setting a breakpoint on the write syscall with a condition that the first argument is 1 (i.e. STDOUT). In the comments, there is also a hint as to how you could inspect the string parameter of the write call as well.
x86 32-bit mode
I came up with the following and tested it with gdb 7.0.1-debian. It seems to work quite well. $esp + 8 contains a pointer to the memory location of the string passed to write, so you first read that slot as an integer with *(int*)($esp + 8), then cast the value to a pointer to char. $esp + 4 contains the file descriptor to write to (1 for STDOUT).
(gdb) break write if 1 == *(int*)($esp + 4) && strcmp((char*)*(int*)($esp + 8), "your string") == 0
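To make the pointer arithmetic in that condition easier to follow, here is a minimal C sketch that models the stack at the entry of write under the 32-bit cdecl convention. The names word_t, entry_stack and condition_holds are hypothetical and exist only for illustration. Slots are uintptr_t-sized so the sketch also runs on a 64-bit host, where the byte offsets are 8 and 16 rather than the 4 and 8 seen on x86-32.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* One stack slot. On 32-bit x86 this is 4 bytes wide. */
typedef uintptr_t word_t;

/* Hypothetical model of the stack right after `call write` (cdecl). */
struct entry_stack {
    word_t return_address; /* slot 0: $esp + 0 */
    word_t fd;             /* slot 1: $esp + 4 on x86-32 */
    word_t buf;            /* slot 2: $esp + 8 on x86-32 */
    word_t count;          /* slot 3: $esp + 12 on x86-32 */
};

/* Mirrors the breakpoint condition: read slot 1 as the descriptor,
 * read slot 2 as an integer, then treat that integer as a char pointer. */
int condition_holds(const void *esp, const char *wanted)
{
    const unsigned char *sp = (const unsigned char *)esp;
    word_t fd, buf;

    assert(esp != NULL && wanted != NULL);
    memcpy(&fd, sp + 1 * sizeof(word_t), sizeof fd);
    memcpy(&buf, sp + 2 * sizeof(word_t), sizeof buf);
    return fd == 1 && strcmp((const char *)buf, wanted) == 0;
}
```

The extra dereference in the gdb expression corresponds to the memcpy out of the slot: the register only says where the arguments live, not what they are.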
x86 64-bit mode
If your process is running in x86-64 mode, then the parameters are passed through scratch registers %rdi and %rsi:
(gdb) break write if 1 == $rdi && strcmp((char*)($rsi), "your string") == 0
Note that one level of indirection is removed since we're using scratch registers rather than variables on the stack.
Variants
Functions other than strcmp can be used in the above snippets:
- strncmp is useful if you want to match only the first n characters of the string being written.
- strstr can be used to find matches within a string, since you can't always be certain that the string you're looking for is at the beginning of the string being written through the write function.
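One caveat worth keeping in mind with all three functions: they expect NUL-terminated strings, while write receives a pointer plus an explicit byte count and the buffer need not be terminated. The following is a minimal sketch, not part of the original answer, of length-aware counterparts to strcmp, strncmp and strstr. The helper names buffer_equals, buffer_starts_with and buffer_contains are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Exact match, in the spirit of strcmp: the payload is exactly the needle. */
int buffer_equals(const void *buf, size_t count, const char *needle)
{
    assert(buf != NULL && needle != NULL);
    return strlen(needle) == count && memcmp(buf, needle, count) == 0;
}

/* Prefix match, in the spirit of strncmp: the payload starts with the needle. */
int buffer_starts_with(const void *buf, size_t count, const char *needle)
{
    size_t n;

    assert(buf != NULL && needle != NULL);
    n = strlen(needle);
    return n <= count && memcmp(buf, needle, n) == 0;
}

/* Substring match, in the spirit of strstr, but never reading past count. */
int buffer_contains(const void *buf, size_t count, const char *needle)
{
    const unsigned char *p = (const unsigned char *)buf;
    size_t n, i;

    assert(buf != NULL && needle != NULL);
    n = strlen(needle);
    if (n > count)
        return 0;
    for (i = 0; i + n <= count; i++) {
        if (memcmp(p + i, needle, n) == 0)
            return 1;
    }
    return 0;
}
```

For a payload of "qwer1" with a count of 5 and the needle "qwer", the exact match fails while the prefix and substring matches succeed, which is why strncmp or strstr is usually the more practical choice in the breakpoint condition.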
Edit: I enjoyed this question and finding its subsequent answer. I decided to do a blog post about it.
catch + strstr condition
The cool thing about this method is that it does not depend on glibc write being used: it traces the actual system call.
Furthermore, it is more resilient to printf() buffering, as it might even catch strings that are printed across multiple printf() calls.
x86_64 version:
define stdout
catch syscall write
commands
printf "rsi = %s\n", $rsi
bt
end
condition $bpnum $rdi == 1 && strstr((char *)$rsi, "$arg0") != NULL
end
stdout qwer
Test program:
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    write(STDOUT_FILENO, "asdf1", 5);
    write(STDOUT_FILENO, "qwer1", 5);
    write(STDOUT_FILENO, "zxcv1", 5);
    write(STDOUT_FILENO, "qwer2", 5);
    printf("as");
    printf("df");
    printf("qw");
    printf("er");
    printf("zx");
    printf("cv");
    fflush(stdout);
    return EXIT_SUCCESS;
}
Outcome: breaks at:
- qwer1
- qwer2
- fflush. The previous printf calls didn't actually print anything, they were buffered! The write syscall only happened on the fflush.
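The buffering effect can also be observed outside the debugger. The sketch below is an illustration under assumptions rather than part of the original answer: it uses a temporary file as a stand-in for stdout, forces full buffering, and compares the file size seen by the kernel before and after fflush. The names file_size and demo_buffering are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>

/* Size of the file behind f as the kernel sees it, or -1 on error. */
static long file_size(FILE *f)
{
    struct stat st;

    if (fstat(fileno(f), &st) != 0)
        return -1;
    return (long)st.st_size;
}

/* Writes "qw" and "er" through stdio and reports the kernel-visible
 * file size before and after fflush. Returns 0 on success, -1 on error. */
int demo_buffering(long *before, long *after)
{
    FILE *f = tmpfile();

    if (f == NULL)
        return -1;
    /* Force full buffering so short writes stay in user space. */
    setvbuf(f, NULL, _IOFBF, BUFSIZ);
    fprintf(f, "qw");
    fprintf(f, "er");
    *before = file_size(f); /* data is still in the stdio buffer */
    fflush(f);              /* this is where the write syscall happens */
    *after = file_size(f);  /* all four bytes have reached the file */
    fclose(f);
    return 0;
}
```

With this setup the size is expected to be 0 before the flush and 4 after it, which matches what the catchpoint shows: the syscall carries the concatenated text, not the individual printf arguments.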
Notes:
- $bpnum: thanks to Tromey at: https://sourceware.org/bugzilla/show_bug.cgi?id=18727
- rdi: register that holds the first argument of a Linux system call on x86_64; for write it is the file descriptor, and 1 is stdout
- rsi: second argument of the syscall; for write it points to the buffer
- strstr: standard C function call, searches for submatches, returns NULL if none is found
Tested in Ubuntu 17.10, gdb 8.0.1.
strace
Another option if you are feeling interactive:
setarch "$(uname -m)" -R strace -i ./stdout.out |& grep '\] write'
Sample output:
[00007ffff7b00870] write(1, "a\nb\n", 4a
Now copy that address and paste it into:
setarch "$(uname -m)" -R strace -i ./stdout.out |& grep -E '\] write\(1, "a'
The advantage of this method is that you can use the usual UNIX tools to manipulate strace output, and it does not require deep GDB-fu.
Explanation:
- -i makes strace output RIP
- setarch -R disables ASLR for a process with a personality system call: How to debug with strace -i when everytime address is different. GDB already does that by default, so no need to do it again.
Anthony's answer is awesome. Following his answer, I tried out another solution on Windows (x86-64 Windows). I know this question is about GDB on Linux, but I think this solution is a useful supplement for this kind of question. It might be helpful for others.
Solution on Windows
In Linux, a call to printf results in a call to the write API. Because Linux is an open source OS, we can debug inside the API. On Windows the API is different: it provides its own API, WriteFile. Because Windows is a commercial, closed-source OS, we cannot set source-level breakpoints inside that API.
However, some of the VC runtime source code ships with Visual Studio, so we can find where that source finally calls the WriteFile API and set a breakpoint there. After debugging the sample code, I found that printf results in a call to _write_nolock, in which WriteFile is called. The function is located in:
your_VS_folder\VC\crt\src\write.c
The prototype is:
/* now define version that doesn't lock/unlock, validate fh */
int __cdecl _write_nolock (
        int fh,
        const void *buf,
        unsigned cnt
        )
Compare this with the write API on Linux:
#include <unistd.h>
ssize_t write(int fd, const void *buf, size_t count);
They take the same parameters in the same order. So we can simply set a conditional breakpoint in _write_nolock, following the solutions above, with only some differences in detail.
Portable Solution for Both Win32 and x64
Fortunately, Visual Studio lets us use parameter names directly when setting a breakpoint condition on both Win32 and x64, so the condition is easy to write:
- Add a breakpoint in _write_nolock.
  NOTICE: There is a small difference between Win32 and x64. On Win32 we can use the function name as the breakpoint location. That does not work on x64, because at the function entry the parameters are not yet initialized, so the parameter names cannot be used in the condition.
  Fortunately there is a workaround: set the breakpoint at a location inside the function rather than at the function name, e.g. the first line of the function, where the parameters are already initialized. (That is, use filename + line number to set the breakpoint, or open the file and set the breakpoint on the first line of the function rather than on its entry.)
- Restrict the condition:
  fh == 1 && strstr((char *)buf, "Hello World") != 0
NOTICE: there is still a problem here. I tested two different ways of writing to stdout: printf and std::cout. printf passes the whole string to _write_nolock at once. However, std::cout passes it character by character, which means the function is called strlen("your string") times. In that case the condition can never be satisfied.
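A condition that looks at one call in isolation cannot see text that arrives one character at a time. The following is a sketch of one possible way around that, not something the answer above tested: keep a sliding window of the most recent bytes across calls and compare it against the target after every byte. The type stream_matcher, the functions matcher_init and matcher_feed, and the limit MAX_NEEDLE are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NEEDLE 64 /* hypothetical upper bound on the target length */

struct stream_matcher {
    char needle[MAX_NEEDLE]; /* target text, stored without a terminator */
    size_t needle_len;
    char window[MAX_NEEDLE]; /* the most recent needle_len bytes seen */
    size_t window_len;
};

void matcher_init(struct stream_matcher *m, const char *needle)
{
    size_t n = strlen(needle);

    assert(n > 0 && n < MAX_NEEDLE);
    memcpy(m->needle, needle, n);
    m->needle_len = n;
    m->window_len = 0;
}

/* Feed the payload of one write-style call. Returns 1 if the target was
 * completed somewhere in this payload, even if it began in earlier calls. */
int matcher_feed(struct stream_matcher *m, const void *buf, size_t count)
{
    const char *p = (const char *)buf;
    int found = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        /* Drop the oldest byte once the window is full. */
        if (m->window_len == m->needle_len) {
            memmove(m->window, m->window + 1, m->needle_len - 1);
            m->window_len--;
        }
        m->window[m->window_len++] = p[i];
        if (m->window_len == m->needle_len &&
            memcmp(m->window, m->needle, m->needle_len) == 0)
            found = 1;
    }
    return found;
}
```

Feeding "H", "e", "l", "l", "o" as five separate calls makes the last call return 1. State like this cannot be expressed in a plain breakpoint condition, so it would have to live in a helper compiled into the program being debugged or in a debugger script.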
Win32 Solution
Of course we could use the same method that Anthony provided: set the breakpoint condition using registers.
For a Win32 program, the solution is almost the same as with GDB on Linux. You might notice the __cdecl decoration in the prototype of _write_nolock. This calling convention means:
- Argument-passing order is Right to left.
- Calling function pops the arguments from the stack.
- Name-decoration convention: Underscore character (_) is prefixed to names.
- No case translation performed.
There is a description here, and Microsoft's website has an example that shows the registers and the stack. The result can be found here.
Then it is very easy to set the breakpoint condition:
- Set a breakpoint in _write_nolock.
- Restrict the condition:
  *(int *)($esp + 4) == 1 && strstr(*(char **)($esp + 8), "Hello") != 0
This is the same method as on Linux. The first condition makes sure the string is written to stdout. The second one matches the specified string.
x64 Solution
Two important changes from x86 to x64 are the 64-bit addressing capability and a flat set of 16 64-bit registers for general use. With more registers available, x64 uses only __fastcall as the calling convention. The first four integer arguments are passed in registers; arguments five and higher are passed on the stack.
You can refer to the Parameter Passing page on Microsoft's website. The four registers (in order, left to right) are RCX, RDX, R8 and R9. So it is very easy to restrict the condition:
- Set a breakpoint in _write_nolock.
  NOTICE: unlike the portable solution above, here we can set the breakpoint location to the function itself rather than to its first line, because all the registers are already initialized at the entry.
- Restrict the condition:
  $rcx == 1 && strstr((char *)$rdx, "Hello") != 0
The reason we need the cast and dereference with esp is that $esp accesses the ESP register, which for all intents and purposes is a void*. The registers here, in contrast, directly store the parameter values, so another level of indirection is not needed.
Post
I also enjoyed this question very much, so I translated Anthony's post into Chinese and added my answer to it as a supplement. The post can be found here. Thanks to @anthony-arnold for the permission.