ben-sb@home:~$

Travel the Dunes with OCaml

This post is a write up of the Travel the Dunes with OCaml challenge from the rev category of University of Maryland Capture The Flag 2024. The challenge provides a file, ocaml_executable and states that it is “a flag checker compiled and assembled from OCaml using ocamlopt”.

Loading it into Binary Ninja and jumping to the start function we see a libc entry point:

_start function

We follow this to main, which calls some functions that appear to be related to OCaml initialisation.

main function

These functions call a lot of other uninteresting functions which are just setting up the OCaml program. I won’t bore you with the details, but after some jumping about we end up at the camlMain__entry function:

camlMain__entry function

This function looks interesting. It first calls printf (or the OCaml equivalent) with some values from the data section of the program. Examining the according entries in tells us it prints

===============================================
OCaml Flag Checker (v 1.0.0)
===============================================
Make a guess:

It then calls camlStdlib__input_line_323, which, as the name suggests, reads a line from stdin. Next it performs some bitwise operations on the result, then depending on the result of these operations, it either calls camlMain__check_input_306 with the provided input, or prints something and exits. Checking the data section once again tells us that the else branch prints Incorrect length!. This suggests that the bitwise operations were calculating the length of the input string, and we can deduce the correct length to be 123 characters, since the result is doubled then compared against 0xf6 = 246.

The camlMain__check_input_306 sounds promising:

camlMain__check_input_306 function

It has a loop with a counter arg2 which, if we check the arguments in the call of the function, starts at 1. The important logic is the comparison rax + rax + 1 == rax_3, which determines whether it prints something then continues with the loop, or prints something then exits. Checking the data section tells us that it first prints “Make a guess:”, then if the condition is true it prints “- Match!”. Otherwise it prints “- Incorrect Match! Terminating program.” and exits.

Examining the comparison, we see that the variable rax is computed as zx.q(*(arg1 + (arg2 s>> 1))). Consulting the Binary Ninja IL Docs tells us that s>> is simply a signed shift and zx.q means convert the operand to a quadword, filling the extended bits with zeros. So this expression is just *(arg1 + (arg2 / 2)). This makes sense as arg1 was the input provided and arg2 is the loop counter which is incremented by 2 at the end of each iteration, so rax will hold the nth char code of the input.

The other variable used in the comparison, rax_3, is the result of the call to camlStdlib__List__nth_289, which sounds like it will access the nth element of some list. The program is looking for rax + rax + 1 == rax_3, so we want (2 * flag[n]) + 1 to match the nth element of this list. After spending some time statically analysing the list functions I couldn’t find the list being referred to, however we now have enough information to solve the problem dynamically.

Dynamic analysis

We can debug the program with GDB, add a breakpoint on the comparison then just read out the correct flag character. Looking at the relevant part of the disassembly (rather than the high level IL) will aid us to find the exact address to break on:

camlMain__check_input_306 disassembly

The program stores rax + rax + 1 in rsp + 0x10, and stores the nth item of the list in rsp + 0x8. Later it moves rsp + 0x8 into rax and rsp + 0x10 into rbx, before comparing rax and rbx and jumping based on the result. Putting a breakpoint on address 0x3ffc4 (the cmp instruction) will allow us to read the correct flag char code out of rax.

However if we try to add a breakpoint at 0x3ffc4 we will get an error, as this is address is an offset from the base address of the program. To find the base address we can debug the program with gdb and run start, which will run and break on the start function. Then we run info proc mappings to view the address space assigned to the program.

process address space

The start address of the first entry is the base address, 0x555555554000 in this case. We can then compute the desired breakpoint address as 0x555555554000 + 0x3ffc4.

We add this breakpoint and allow the program to run, and it shortly asks us for a guess at the flag. Remembering that the program validates the length is 123, we can provide it with any 123 character string. Once it hits the breakpoint we can read the value of rax and convert it back to a char code via (rax - 1) / 2. For the first character this gives us the character U. I repeated this process to obtain the rest of the flag:

UMDCTF{rvgxbvhm89tc83oc3t84mo90m83dt4m_s3R10Us1YwhO@ctu@11y_usEs_0c@m1???_al7of37d3otdxlsdt8nysfln8y43fsg7xuvnsf}

An automated solution

It would be much nicer if we could automate the debugging process and didn’t have to manually read the rax register for each character. To do this we can write a Python script using the gdb API.

As noted before, the transformed character of the input we provided, rax + rax + 1 is stored in rbx during our comparison. Therefore we can breakpoint as normal, read the correct value out of rax and store it in rbx to make the program think we were correct, and continue onto the next character. This looks like:

code = gdb.parse_and_eval('$rax')
gdb.execute(f'set $rbx = {code}')
gdb.execute('continue')

As before we can provide the provide the program with any 123 length string. The full script is

import gdb

flag_chars = []

def handle_brk(e):
    if isinstance(e, gdb.BreakpointEvent):
        code = gdb.parse_and_eval('$rax')
        gdb.execute(f'set $rbx = {code}')
        flag_chars.append(chr((code - 1)/2))
        gdb.execute('continue')

gdb.events.stop.connect(handle_brk)

# base address + offset
addr = 0x555555554000 + 0x3ffc4
gdb.Breakpoint(f'*{addr}')

# random guess at the flag
input = 'a' * 123
gdb.execute(f'run < <(echo "{input}")')

print(''.join(flag_chars))

Running gdb with this script loaded (via naming it ocaml_executable-gdb.py or creating a .gdbinit file with source script.py) results in all 123 characters being validated successfully and the correct flag printed.

Epilogue

The source code for the challenge can be found here.