What does the bracket in `movl (%eax), %eax` mean?
They're move instructions, moving data from one place to another - in these cases, from memory into a register:
register_eax = *(unsigned long *)register_eax;
Your other example is something like:
register_eax = *(unsigned long *)(register_ebp + 8);
http://web.archive.org/web/20080215230650/http://sig9.com/articles/att-syntax is a quick introduction to Unix (AT&T) asm syntax. I found it by googling "at&t asm syntax".
The post is "AT&T Assembly Syntax" by vivek (http://web.archive.org/web/20080228052132/http://sig9.com/blog/vivek), 2003-09-01. Here is the main information about AT&T syntax from it:
For example, the general format of a basic data movement instruction in INTEL-syntax is,
mnemonic destination, source
whereas, in the case of AT&T, the general format is
mnemonic source, destination
(I remember this order by thinking of AT&T asm as the genuine Unix asm: it is the "right" one, and data flows to the right. Intel syntax was based on some incorrect masm docs, which are clearly not right for the Unix world: they are "left", and data flows to the left.)
All register names of the IA-32 architecture must be prefixed by a '%' sign, e.g. %al, %bx, %ds, %cr0 etc.
All literal values must be prefixed by a '$' sign. For example,
mov $100, %bx
mov $'A', %al
The first instruction moves the value 100 into the register BX and the second one moves the numerical value of the ASCII character A into the AL register.
In the AT&T Syntax, memory is referenced in the following way,
segment-override:signed-offset(base,index,scale)
parts of which can be omitted depending on the address you want. For example:
%es:100(%eax,%ebx,2)
Please note that the offsets and the scale should not be prefixed by '$'. A few more examples, with their equivalent NASM syntax, should make things clearer:
GAS memory operand      NASM memory operand
------------------      -------------------
100                     [100]
%es:100                 [es:100]
(%eax)                  [eax]
(%eax,%ebx)             [eax+ebx]
(%ecx,%ebx,2)           [ecx+ebx*2]
(,%ebx,2)               [ebx*2]
-10(%eax)               [eax-10]
%ds:-10(%ebp)           [ds:ebp-10]

Example instructions,
mov %ax, 100
mov %eax, -100(%eax)
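As the NASM column shows, every form of offset(base,index,scale) comes down to the same sum: base + index*scale + offset. The C sketch below spells that out. The function name `effective_address` is made up for illustration, an omitted base or index is modelled by passing 0 (and an omitted scale by passing 1), and the segment override is deliberately left out, since the segment base is applied separately by the CPU.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of any assembler or library): computes the
   32-bit address that the AT&T operand offset(base,index,scale) denotes. */
uint32_t effective_address(int32_t offset, uint32_t base,
                           uint32_t index, uint32_t scale)
{
    /* x86 can only encode the scale factors 1, 2, 4 and 8. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);

    /* Unsigned arithmetic wraps modulo 2^32, like 32-bit address math,
       so a negative offset such as -10 simply subtracts. */
    return base + index * scale + (uint32_t)offset;
}
```

With EAX = 0x1000, `effective_address(-10, 0x1000, 0, 1)` models -10(%eax) and yields 0x0FF6; `effective_address(0, 0, 3, 2)` models (,%ebx,2) with EBX = 3 and yields 6.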
Operand Sizes. At times, especially when moving literal values to memory, it becomes necessary to specify the size-of-transfer or the operand-size. For example the instruction,
mov $10, 100
only specifies that the value 10 is to be moved to the memory offset 100, but not the transfer size. In NASM this is done by adding the casting keyword byte/word/dword etc. to any of the operands. In AT&T syntax, this is done by adding a suffix - b/w/l - to the instruction. For example,
movb $10, %es:(%eax)
moves a byte value 10 to the memory location [es:eax], whereas,
movl $10, %es:(%eax)
moves a long value (dword) 10 to the same place.
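To see why the suffix matters, it can help to think of each suffix as a store of a different width. The following is only a rough C model under that assumption: `store_b`, `store_w` and `store_l` are hypothetical names invented for this sketch, standing in for movb, movw and movl with an immediate source and a memory destination.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers, named after the AT&T size suffixes. Each one writes
   the same value but touches a different number of bytes at addr. */

/* Models movb $value, (addr): writes exactly 1 byte. */
void store_b(uint8_t *addr, uint8_t value)
{
    assert(addr != NULL);
    memcpy(addr, &value, sizeof value);
}

/* Models movw $value, (addr): writes exactly 2 bytes. */
void store_w(uint8_t *addr, uint16_t value)
{
    assert(addr != NULL);
    memcpy(addr, &value, sizeof value);
}

/* Models movl $value, (addr): writes exactly 4 bytes. */
void store_l(uint8_t *addr, uint32_t value)
{
    assert(addr != NULL);
    memcpy(addr, &value, sizeof value);
}
```

Starting from four bytes that all hold 0xFF, `store_b(buf, 10)` changes only the first byte, while `store_l(buf, 10)` overwrites all four. That difference is exactly what a plain `mov $10, 100` leaves unspecified.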
The jmp, call, ret, etc., instructions transfer the control from one part of a program to another. They can be classified as control transfers to the same code segment (near) or to different code segments (far). The possible types of branch addressing are - relative offset (label), register, memory operand, and segment-offset pointers.
Relative offsets are specified using labels, as shown below.
label1:
.
.
jmp label1
Branch addressing using registers or memory operands must be prefixed by a '*'. To specify "far" control transfers, an 'l' must be prefixed, as in 'ljmp', 'lcall', etc. For example,
GAS syntax          NASM syntax
==========          ===========
jmp *100            jmp near [100]
call *100           call near [100]
jmp *%eax           jmp near eax
call *%ecx          call near ecx
jmp *(%eax)         jmp near [eax]
call *(%ebx)        call near [ebx]
ljmp *100           jmp far [100]
lcall *100          call far [100]
ljmp *(%eax)        jmp far [eax]
lcall *(%ebx)       call far [ebx]
ret                 retn
lret                retf
lret $0x100         retf 0x100
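The difference between `call *%ecx` and `call *(%ebx)` is one level of indirection: in the first the register holds the target address, in the second the register points at a memory slot that holds it. A loose C analogy, assuming function pointers stand in for code addresses (all names here are hypothetical and only for illustration):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function type standing in for "some code address". */
typedef int (*handler_fn)(int);

/* Example target, made up for this sketch. */
int add_one(int x) { return x + 1; }

/* Analogue of call *%ecx: the target address is the value itself. */
int call_via_register(handler_fn target, int arg)
{
    assert(target != NULL);
    return target(arg);
}

/* Analogue of call *(%ebx): the value is the address of a memory slot,
   and the target address is loaded from that slot first. */
int call_via_memory(const handler_fn *slot, int arg)
{
    assert(slot != NULL && *slot != NULL);
    return (*slot)(arg);
}
```

`call_via_register(add_one, 41)` returns 42. Given `handler_fn table[1] = { add_one };`, `call_via_memory(table, 41)` reaches the same function through the table entry, which is how jump tables are typically dispatched.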
Segment-offset pointers are specified using the following format:
jmp $segment, $offset
He also recommends the GNU as (gas) docs: http://web.archive.org/web/20080313132324/http://sourceware.org/binutils/docs-2.16/as/index.html
%eax is register EAX;
(%eax) is the memory location whose address is contained in the register EAX;
8(%eax) is the memory location whose address is the value of EAX plus 8.
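In C terms, the parentheses play the role of a pointer dereference. Here is a small sketch of the two memory forms, assuming 32-bit loads as in `movl`; the helper names `operand_indirect` and `operand_displaced` are invented for illustration, and the pointer argument stands in for the value held in EAX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* (%eax): the 32-bit value stored at the address held in the register. */
uint32_t operand_indirect(const void *eax)
{
    uint32_t value;
    assert(eax != NULL);
    memcpy(&value, eax, sizeof value);  /* load 4 bytes from that address */
    return value;
}

/* 8(%eax), generalised: the 32-bit value stored at (address in the register
   plus a byte displacement). The displacement counts bytes, not elements. */
uint32_t operand_displaced(const void *eax, ptrdiff_t displacement)
{
    assert(eax != NULL);
    return operand_indirect((const unsigned char *)eax + displacement);
}
```

For `uint32_t words[3] = {11, 22, 33};`, `operand_indirect(words)` reads 11, and `operand_displaced(words, 8)` reads 33, because 8 bytes past the start is the third 4-byte element.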