Difficulty understanding logic in disassembled binary bomb phase 3
The function makes a modified copy of a string from static storage, into a malloced buffer.
This looks weird: the malloc size depends on strlen+1, but the memcpy size is a compile-time constant? Your decompilation apparently shows that the address was a string literal, so it seems that's fine.
Probably that missed optimization happened because of a custom string_length() function that was maybe only defined in another .c file (and the bomb was compiled without link-time optimization for cross-file inlining). So size_t len = string_length("some string literal"); is not a compile-time constant, and the compiler emitted a call to it instead of being able to use the known constant length of the string.
But probably they used strcpy in the source, and the compiler did inline that as a rep movs. Since it's apparently copying from a string literal, the length is a compile-time constant and it can optimize away that part of the work that strcpy normally has to do. Normally if you've already calculated the length it's better to use memcpy instead of making strcpy calculate it again on the fly, but in this case it actually helped the compiler make better code for that part than if they'd passed the return value of string_length to a memcpy, again because string_length couldn't inline and optimize away.
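A guess at what the original source might have looked like, as a minimal sketch. The function name make_modified_copy, the PLACEHOLDER text, and the body of string_length are all made up for illustration; the real literal sits at 0x804a388 in the binary and is not reproduced here. The one-character patch is the one the disassembly shows at <+48>..<+52>.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the bomb's helper. In the real bomb it is
   assumed to live in another .c file, so the compiler cannot see
   through it at the call site and must emit a real call. */
size_t string_length(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Placeholder text only, NOT the string from the binary. */
static const char PLACEHOLDER[] = "placeholder literal, not the real one";

/* Runtime length for malloc, strcpy for the copy (which a compiler can
   expand inline with a constant size when the source is a literal),
   then the one-character patch. Caller frees the result. */
char *make_modified_copy(void)
{
    char *buf = malloc(string_length(PLACEHOLDER) + 1);
    if (buf == NULL)
        return NULL;
    strcpy(buf, PLACEHOLDER);
    buf[16] = buf[17];
    return buf;
}
```

Swapping the strcpy for memcpy(buf, PLACEHOLDER, len + 1) with a runtime len would force the copy size to be a runtime value too, which is the case the paragraph above says the compiler handled worse.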
<+0>: push %edi // push value in edi to stack
<+1>: push %esi // push value of esi to stack
<+2>: sub $0x14,%esp // grow stack by 0x14 (move stack ptr -0x14 bytes)
Comments like that are redundant; the instruction itself already says that. This is saving two call-preserved registers so the function can use them internally and restore them later.
Your comment on the sub is better; yes, growing the stack is the higher-level semantic meaning here. This function reserves some space for locals (and for function args to be stored with mov instead of pushed).
The rep movsd copies 0x13 * 4 bytes, incrementing ESI and EDI to point past the end of the copied region. So another movsd instruction would copy another 4 bytes contiguous with the previous copy.
The code actually copies another 2 bytes, but instead of using movsw, it uses a movzw word load and a mov store. This makes a total of 78 bytes copied.
...
# at this point EAX = malloc return value which I'll call buf
<+28>: mov $0x804a388,%esi # copy src = a string literal in .rodata?
<+33>: mov $0x13,%ecx
<+38>: mov %eax,%edi # copy dst = buf
<+40>: rep movsl %ds:(%esi),%es:(%edi) # memcpy 76 bytes and advance ESI, EDI
<+42>: movzwl (%esi),%edx
<+45>: mov %dx,(%edi) # copy another 2 bytes (not moving ESI or EDI)
# final effect: 78-byte memcpy
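In C terms, the sequence above amounts to two back-to-back copies. This is only a sketch of the net effect, not what the compiler was given; copy78 and its parameter names are hypothetical.

```c
#include <string.h>

/* Hypothetical model of <+28>..<+45>. Both buffers must be at least
   78 bytes long and must not overlap. */
void copy78(unsigned char *dst, const unsigned char *src)
{
    /* rep movsl with ECX = 0x13: 19 dwords = 76 bytes */
    memcpy(dst, src, 0x13 * 4);

    /* movzwl (%esi),%edx ; mov %dx,(%edi): 2 more bytes at offset 76,
       which is where ESI and EDI point after the rep movsl */
    memcpy(dst + 76, src + 76, 2);
}
```

The result is byte-for-byte the same as memcpy(dst, src, 78).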
On some (but not all) CPUs it would have been efficient to just use rep movsb or rep movsw with appropriate counts, but that's not what the compiler chose in this case. movzx (AT&T movz) is a good way to do narrow loads without partial-register penalties. That's why compilers use it: they can write a full register even though they're only going to read the low 8 or 16 bits of that reg with a store instruction.
After that copy of a string literal into buf, we have a byte load/store that copies a character within buf. Remember, at this point EAX is still pointing at buf, the malloc return value. So it's making a modified copy of the string literal.
<+48>: movzbl 0x11(%eax),%edx
<+52>: mov %dl,0x10(%eax) # buf[16] = buf[17]
Perhaps if the source hadn't defeated constant-propagation, with a high enough optimization level the compiler might have just put the final string into .rodata where you could find it, trivializing this bomb phase. :P
Then it stores pointers as stack args for string compare.
<+55>: mov %eax,0x4(%esp) # 2nd arg slot = EAX = buf
<+59>: mov 0x20(%esp),%eax # function arg = user input?
<+63>: mov %eax,(%esp) # first arg slot = our incoming stack arg
<+66>: call 0x80490ca <strings_not_equal>
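A sketch of what that call likely corresponds to in C. The body of strings_not_equal is an assumption based only on its name (nonzero when the strings differ), and phase_passes is a hypothetical wrapper, not a function from the bomb.

```c
#include <string.h>

/* Assumed behavior, inferred from the name: nonzero if a and b differ. */
int strings_not_equal(const char *a, const char *b)
{
    return strcmp(a, b) != 0;
}

/* Hypothetical wrapper: returns 1 if the phase would be defused.
   Argument order matches the stack stores at <+55>..<+63>:
   (%esp) = user input, 0x4(%esp) = buf. */
int phase_passes(const char *input, const char *buf)
{
    return !strings_not_equal(input, buf);
}
```

So the input has to match the patched copy in buf exactly, not the original literal in static storage.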
How to "cheat": looking at the runtime result with GDB
Some bomb labs only let you run the bomb online, on a test server, which would record explosions. You couldn't run it under GDB, only use static disassembly (like objdump -drwC -Mintel). So the test server could record how many failed attempts you had, e.g. CS 3330 at cs.virginia.edu, which I found with Google, where full credit requires fewer than 20 explosions.
Using GDB to examine memory / registers part way through a function makes this vastly easier than only working from static analysis, in fact trivializing this function where the single input is only checked at the very end. E.g. just look at what other arg is being passed to strings_not_equal. (Especially if you use GDB's jump or set $pc = ... commands to skip past the bomb explosion checks.)
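Set a breakpoint or single-step to the store at <+55>, before the load at <+59> overwrites EAX with the incoming arg. Use p (char*)$eax to treat EAX as a char* and show you the (0-terminated) C string starting at that address. At that point EAX holds the address of the buffer, as you can see from the store to the stack. If you've already stepped to just before the call to strings_not_equal, read the second arg slot instead with p *(char**)($esp+4).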
Copy/paste that string result and you're done.
Other phases with multiple numeric inputs typically aren't this easy to cheese with a debugger and do require at least some math, but linked-list phases that require you to have a sequence of numbers in the right order for list traversal also become trivial if you know how to use a debugger to set registers to make compares succeed as you get to them.
rep movsl copies 32-bit longwords from address %esi to address %edi, incrementing both by 4 each time, a number of times equal to %ecx. Think of it as memcpy(edi, esi, ecx*4).
See https://felixcloutier.com/x86/movs:movsb:movsw:movsd:movsq (it's movsd in Intel notation).
So this is copying 19*4=76 bytes.
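To make the register side effects explicit, here is a small C model of the instruction. It is a sketch under one assumption: the direction flag is clear (DF=0), which the i386 System V ABI requires on function entry. The struct and function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical register file holding only what rep movsl touches. */
struct regs {
    const unsigned char *esi;  /* source pointer */
    unsigned char *edi;        /* destination pointer */
    uint32_t ecx;              /* remaining dword count */
};

/* Model of rep movsl with DF=0: repeat a 4-byte copy ECX times,
   advancing both pointers by 4 and counting ECX down to zero. */
void rep_movsl(struct regs *r)
{
    while (r->ecx != 0) {
        memcpy(r->edi, r->esi, 4);  /* one movsl */
        r->esi += 4;
        r->edi += 4;
        r->ecx--;
    }
}
```

Starting with ecx = 0x13, the loop leaves esi and edi 76 bytes past where they started and ecx at 0, which is why the 2-byte tail copy in the disassembly can use (%esi) and (%edi) directly.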