ben-sb@home:~$

Template Virtualisation Obfuscation

Today I’ll be discussing an interesting challenge from the rev category of University of Maryland Capture The Flag 2024. The challenge is named Typecheck and is accompanied by the cryptic description “My C++ code won’t type check. Can you fix that for me?”.

Two files are provided: main.cpp and templates.cpp. The main file looks like:

#include "templates.hpp"

// flag is 60 chars
using flag_t = int_list_t <'f','l','a','g','g','o','e','s','h','e','r','e'>;
using prog_t = int_list_t <2, 0, 3, 2, 2, 47, 1, 0, 3, 12, 2, /* many more */ >;

int main() {
    vm_t<nil_t, prog_t, flag_t> b;

    b = (nil_t)b;
}

The variable names suggest that vm_t is a virtual machine, prog_t is the VM’s bytecode and flag_t is the flag to be checked. However these are types, which is strange.

templates.cpp contains templates for these types, as well as many more. The vm_t template refers to a struct template vm, which consists of a base template with an empty struct and several specialised templates.

template <typename S, typename IT, typename In>
struct vm;

template <typename S, typename In>
struct vm<S, nil_t, In> {
    using type = S;
};

template <typename S, typename R, typename In>
struct vm<S, cons<V<0>, R>, In>  {
    using type = typename vm<A_t<S>, R, In>::type;
};

template <typename S, typename R, typename In>
struct vm<S, cons<V<1>, R>, In>  {
    using type = typename vm<M_t<S>, R, In>::type;
};

template <typename S, int PV, typename R, typename In>
struct vm<S, cons<V<2>, cons<V<PV>, R>>, In>  {
    using type = typename vm<P_t<PV, S>, R, In>::type;
};

template <typename S, int N, typename R, typename In>
struct vm<S, cons<V<3>, cons<V<N>, R>>, In> {
    using type = typename vm<cons<V<g<N, In>::value>, S>, R, In>::type;
};

template <typename S, int PV, typename R, typename In>
struct vm<S, cons<V<4>, cons<V<PV>, R>>, In>  {
    using type = typename vm<T_t<PV, S>, R, In>::type;
};

template <typename S, typename IT, typename In>
using vm_t = typename vm<S, IT, In>::type;

These look like recursive/nested templates which refer to each other. There are multiple specialised versions of the vm template, so the one which fits best will be chosen.

To tell which of them this will be we need to look at the definitions of cons and V.

template<class T, class U>
struct cons {
    using car = T;
    using cdr = U;
};

template <int v> struct V { static const constexpr int value = v ; };

So cons is a struct with two fields, and V is a struct just containing an int value. These templates are essentially just wrappers around two consecutive values and an integer value respectively.

Now looking at one of the specialised vm templates:

template <typename S, typename R, typename In>
struct vm<S, cons<V<0>, R>, In>  {
    using type = typename vm<A_t<S>, R, In>::type;
};

We can tell that this will match a template with the first parameter being some unconstrained type S, and the second being a cons struct template with the first element being the int 0 and the second being the unconstrained type R. The last template parameter is another unconstrained type In.

Recalling that the second parameter to vm_t in main.cpp was the bytecode, we can deduce that this template is matched when the bytecode starts with 0, followed by anything else. This suggests that 0 is our opcode, but what is the operation?

Looking at the using-declaration, we can see this template forwards us on to another vm template, but changes some of the parameters. Specifically it removes 0 from the start of the bytecode parameter, and replaces S with A_t<S>.

The declaration of A_t is:

template <typename S>
struct A;

template <int a, int b, typename rest>
struct A<cons<V<a>, cons<V<b>, rest>>> {
    using type = cons<V<a + b>, rest>;
};

template <typename S>
using A_t = typename A<S>::type;

This tells us that it takes the first two elements of S and replaces them with their sum. We can now deduce that this is the add operation, and S is the VM’s stack. Note that the first element of S is actually the top of the stack.

We can do the same analysis for the rest of the specialised vm templates. The only slightly different one is opcode 3, which recursively accesses characters of the provided flag until it reaches the nth/desired character, and pushes it to the stack. We get the following instruction set:

Opcode Instruction Description
0 add Adds the top two elements of the stack
1 mul Multiplies the top two elements of the stack
2 push Pushes the next value from the bytecode onto the stack
3 push_flag_char With the next value from the bytecode as n, pushes the nth character of the flag to the stack
4 cmp Tests if the top of the stack is equal to the next value from the bytecode

So a very simple architecture with 5 opcodes.

Disassembler

Now we understand the VM, we can write a basic disassembler. To do this we just read through the bytecode as the VM would, and keep track of all instructions.

with open('bytecode.txt', 'r') as f:
    bytecode = [int(x) for x in f.read().split(', ')]

disassembly = []

while len(bytecode):
    opcode = bytecode.pop(0)
    if opcode == 0:
        disassembly.append('add\n')
    elif opcode == 1:
        disassembly.append('mul\n')
    elif opcode == 2:
        operand = bytecode.pop(0)
        disassembly.append(f'push {operand}\n')
    elif opcode == 3:
        operand = bytecode.pop(0)
        disassembly.append(f'push_flag_char {operand}\n')
    elif opcode == 4:
        operand = bytecode.pop(0)
        disassembly.append(f'cmp {operand}\n')
    else:
        raise Exception(f'Unknown opcode {opcode}')

with open('disassembly.txt', 'w') as f:
    f.writelines(disassembly)

This produces disassembly which looks like:

push 0
push_flag_char 2
push 47
mul
add
push_flag_char 12
push 24
mul
add
push_flag_char 16
push 67
mul
add
push_flag_char 18
push 89
mul
add
push_flag_char 22
push 59
mul
; 2000+ more lines

This is good progress. Analysing the disassembly indicates that the VM creates a complicated expression involving adding and multiplying several characters of the flag and other numbers, then uses the cmp instruction to compare it to a fixed value from the bytecode. This pattern then repeats for the remainder of the program, giving us a series of equations. For example the first equation is

40701 == (24 * flag[12]) + (67 * flag[16]) + (89 * flag[18]) + (47 * flag[2]) + (59 * flag[22]) + (61 * flag[41]) + (19 * flag[51]) + (45 * flag[56])

Clearly we do not have enough information to solve this equation on its own, so we will need to solve them all together. We can use a math library like sympy, or a theorem prover like z3, to do so.

To build each equation we will extend our disassembler to keep track of the stack during each instruction, and represent the values we don’t know (the flag characters) as symbols.

from sympy import symbols, Eq, solve

with open('bytecode.txt', 'r') as f:
    bytecode = [int(x) for x in f.read().split(', ')]

disassembly = []
stack = []

# create 60 symbols, one for each flag character
flag_symbols = symbols('f0:60')

while len(bytecode):
    opcode = bytecode.pop(0)
    if opcode == 0:
        sym1 = stack.pop(0)
        sym2 = stack.pop(0)
        stack.insert(0, sym1 + sym2)
        disassembly.append('add\n')
    elif opcode == 1:
        sym1 = stack.pop(0)
        sym2 = stack.pop(0)
        stack.insert(0, sym1 * sym2)
        disassembly.append('mul\n')
    elif opcode == 2:
        operand = bytecode.pop(0)
        stack.insert(0, operand)
        disassembly.append(f'push {operand}\n')
    elif opcode == 3:
        operand = bytecode.pop(0)
        stack.insert(0, flag_symbols[operand])
        disassembly.append(f'push_flag_char {operand}\n')
    elif opcode == 4:
        operand = bytecode.pop(0)
        sym = stack.pop(0)
        stack.insert(0, Eq(operand, sym))
        disassembly.append(f'cmp {operand}\n')
    else:
        raise Exception(f'Unknown opcode {opcode}')

with open('disassembly.txt', 'w') as f:
    f.writelines(disassembly)

# solve the equations
sol = solve(stack)

# build the flag
flag = ''
for sym in flag_symbols:
    code = sol[sym]
    flag += chr(code)
print(flag)

At the end of the program, the stack contains the 60 equations created. We have 60 unknowns, so we can just solve them directly as simultaneous equations. This prints the flag

UMDCTF{c++_templates_are_the_reason_for_the_butlerian_jihad}

Epilogue

The source code for the challenge can be found here.