Template Virtualisation Obfuscation
Today I’ll be discussing an interesting challenge from the rev category of the 2024 University of Maryland Capture The Flag (UMDCTF). The challenge is named Typecheck and is accompanied by the cryptic description “My C++ code won’t type check. Can you fix that for me?”.
Two files are provided: main.cpp and templates.hpp.
The main file looks like:
#include "templates.hpp"
// flag is 60 chars
using flag_t = int_list_t <'f','l','a','g','g','o','e','s','h','e','r','e'>;
using prog_t = int_list_t <2, 0, 3, 2, 2, 47, 1, 0, 3, 12, 2, /* many more */ >;
int main() {
    vm_t<nil_t, prog_t, flag_t> b;
    b = (nil_t)b;
}
The variable names suggest that vm_t is a virtual machine, prog_t is the VM’s bytecode and flag_t is the flag to be checked. However, these are all types rather than runtime values, which is strange: the whole program is evaluated by the compiler. Given the challenge description, presumably the code only type checks when the flag characters pass every check.
templates.hpp contains templates for these types, as well as many more. The vm_t alias refers to a struct template vm, which consists of a primary template (declared but never defined) and several partial specialisations.
template <typename S, typename IT, typename In>
struct vm;

template <typename S, typename In>
struct vm<S, nil_t, In> {
    using type = S;
};

template <typename S, typename R, typename In>
struct vm<S, cons<V<0>, R>, In> {
    using type = typename vm<A_t<S>, R, In>::type;
};

template <typename S, typename R, typename In>
struct vm<S, cons<V<1>, R>, In> {
    using type = typename vm<M_t<S>, R, In>::type;
};

template <typename S, int PV, typename R, typename In>
struct vm<S, cons<V<2>, cons<V<PV>, R>>, In> {
    using type = typename vm<P_t<PV, S>, R, In>::type;
};

template <typename S, int N, typename R, typename In>
struct vm<S, cons<V<3>, cons<V<N>, R>>, In> {
    using type = typename vm<cons<V<g<N, In>::value>, S>, R, In>::type;
};

template <typename S, int PV, typename R, typename In>
struct vm<S, cons<V<4>, cons<V<PV>, R>>, In> {
    using type = typename vm<T_t<PV, S>, R, In>::type;
};

template <typename S, typename IT, typename In>
using vm_t = typename vm<S, IT, In>::type;
These look like recursive templates which refer to each other. There are multiple specialised versions of the vm template, and the compiler will pick the most specialised one that matches. To tell which one that is, we need to look at the definitions of cons and V.
template<class T, class U>
struct cons {
    using car = T;
    using cdr = U;
};

template <int v> struct V { static const constexpr int value = v; };
So cons is a struct with two type members, car and cdr, and V is a struct just containing an int value. These templates are essentially building blocks for lists: V wraps a single integer as a type, and cons chains elements together, so a whole list of integers can be represented as one nested type.
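As a concrete illustration, here is a minimal, self-contained sketch of how a list of integers is spelled out with these building blocks. The cons and V definitions mirror the ones quoted above, nil_t is assumed to be an empty marker struct that terminates a list (matching how it is used in main.cpp), and demo_list is just an illustrative name:

#include <type_traits>

struct nil_t {};   // assumed: empty marker type that terminates a list
template <class T, class U>
struct cons { using car = T; using cdr = U; };
template <int v> struct V { static constexpr int value = v; };

// The list (2, 0, 3) written as a single nested type.
using demo_list = cons<V<2>, cons<V<0>, cons<V<3>, nil_t>>>;

static_assert(demo_list::car::value == 2, "head of the list");
static_assert(demo_list::cdr::car::value == 0, "second element");
static_assert(std::is_same<demo_list::cdr::cdr::cdr, nil_t>::value, "the list ends in nil_t");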
Now, looking at one of the specialised vm templates:
template <typename S, typename R, typename In>
struct vm<S, cons<V<0>, R>, In> {
    using type = typename vm<A_t<S>, R, In>::type;
};
We can tell that this specialisation matches when the first template argument is some unconstrained type S, the second is a cons whose first element is V<0> (the wrapped int 0) and whose second is the unconstrained type R, and the last template argument is another unconstrained type In.
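To see that selection in isolation, here is a stripped-down sketch under the same assumption of an empty nil_t marker (picks_opcode_0 is purely an illustrative name): a primary template acts as the fallback, and the partial specialisation is chosen only when the head of the second argument is V<0>:

struct nil_t {};
template <class T, class U> struct cons { using car = T; using cdr = U; };
template <int v> struct V { static constexpr int value = v; };

// Primary template: the generic fallback.
template <typename S, typename IT>
struct picks_opcode_0 { static constexpr bool value = false; };

// Partial specialisation: matches only when IT is cons<V<0>, R> for some R.
template <typename S, typename R>
struct picks_opcode_0<S, cons<V<0>, R>> { static constexpr bool value = true; };

static_assert( picks_opcode_0<nil_t, cons<V<0>, nil_t>>::value, "a 0 at the front selects the specialisation");
static_assert(!picks_opcode_0<nil_t, cons<V<1>, nil_t>>::value, "anything else falls back to the primary template");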
Recalling that the second parameter to vm_t in main.cpp was the bytecode, we can deduce that this template is matched when the bytecode starts with 0, followed by anything else. This suggests that 0 is our opcode, but what is the operation?
Looking at the using-declaration, we can see this template forwards us on to another vm instantiation, but changes some of the parameters. Specifically, it removes the 0 from the start of the bytecode parameter and replaces S with A_t<S>.
The declaration of A_t is:
template <typename S>
struct A;

template <int a, int b, typename rest>
struct A<cons<V<a>, cons<V<b>, rest>>> {
    using type = cons<V<a + b>, rest>;
};

template <typename S>
using A_t = typename A<S>::type;
This tells us that it takes the first two elements of S and replaces them with their sum. We can now deduce that this is the add operation, and S is the VM’s stack. Note that the first element of S is actually the top of the stack.
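We can sanity-check that reading with a small self-contained sketch, copying the definitions from the excerpts above and again assuming an empty nil_t marker struct: applying A_t to a stack holding 2 on top of 3 leaves 5 on top:

struct nil_t {};
template <class T, class U> struct cons { using car = T; using cdr = U; };
template <int v> struct V { static constexpr int value = v; };

template <typename S> struct A;
template <int a, int b, typename rest>
struct A<cons<V<a>, cons<V<b>, rest>>> { using type = cons<V<a + b>, rest>; };
template <typename S> using A_t = typename A<S>::type;

// Stack with 2 on top and 3 underneath.
using demo_stack = cons<V<2>, cons<V<3>, nil_t>>;

// add: the two top elements are replaced by their sum.
static_assert(A_t<demo_stack>::car::value == 5, "2 + 3 is left on top of the stack");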
We can do the same analysis for the rest of the specialised vm templates. The only slightly different one is opcode 3, which recursively walks the provided flag until it reaches the nth character and pushes that character onto the stack. We get the following instruction set:
| Opcode | Instruction | Description |
|---|---|---|
| 0 | add | Adds the top two elements of the stack |
| 1 | mul | Multiplies the top two elements of the stack |
| 2 | push | Pushes the next value from the bytecode onto the stack |
| 3 | push_flag_char | With the next value from the bytecode as n, pushes the nth character of the flag onto the stack |
| 4 | cmp | Tests whether the top of the stack is equal to the next value from the bytecode |
So a very simple architecture with 5 opcodes.
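To see the whole machine in motion, here is a cut-down, self-contained re-creation of the interpreter covering just the push, mul and add opcodes. It drops the flag parameter In (opcode 3 is omitted), and since M_t and P_t are not shown in the excerpts above, plausible definitions analogous to A_t are assumed for them; all names here are only illustrative. The example program pushes 6 and 7, multiplies them, pushes 8 and adds, so the final stack should hold 50 on top:

struct nil_t {};
template <class T, class U> struct cons { using car = T; using cdr = U; };
template <int v> struct V { static constexpr int value = v; };

// add: replace the top two stack elements with their sum (as in the real A_t).
template <typename S> struct A;
template <int a, int b, typename rest>
struct A<cons<V<a>, cons<V<b>, rest>>> { using type = cons<V<a + b>, rest>; };
template <typename S> using A_t = typename A<S>::type;

// mul: assumed to mirror A_t, replacing the top two elements with their product.
template <typename S> struct M;
template <int a, int b, typename rest>
struct M<cons<V<a>, cons<V<b>, rest>>> { using type = cons<V<a * b>, rest>; };
template <typename S> using M_t = typename M<S>::type;

// push: assumed definition of P_t, prepending a value to the stack.
template <int v, typename S> using P_t = cons<V<v>, S>;

// The interpreter, minus the flag-related parts.
template <typename S, typename IT>
struct vm;
template <typename S>                      // no bytecode left: the stack is the result
struct vm<S, nil_t> { using type = S; };
template <typename S, typename R>          // opcode 0: add
struct vm<S, cons<V<0>, R>> { using type = typename vm<A_t<S>, R>::type; };
template <typename S, typename R>          // opcode 1: mul
struct vm<S, cons<V<1>, R>> { using type = typename vm<M_t<S>, R>::type; };
template <typename S, int PV, typename R>  // opcode 2: push
struct vm<S, cons<V<2>, cons<V<PV>, R>>> { using type = typename vm<P_t<PV, S>, R>::type; };
template <typename S, typename IT>
using vm_t = typename vm<S, IT>::type;

// Bytecode for: push 6, push 7, mul, push 8, add  ->  6 * 7 + 8 = 50
using prog = cons<V<2>, cons<V<6>,
             cons<V<2>, cons<V<7>,
             cons<V<1>,
             cons<V<2>, cons<V<8>,
             cons<V<0>, nil_t>>>>>>>>;

static_assert(vm_t<nil_t, prog>::car::value == 50, "the program leaves 50 on top of the stack");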
Disassembler
Now that we understand the VM, we can write a basic disassembler. To do this, we just read through the bytecode as the VM would and keep track of each instruction.
# bytecode.txt holds the comma-separated values copied out of prog_t in main.cpp
with open('bytecode.txt', 'r') as f:
    bytecode = [int(x) for x in f.read().split(', ')]

disassembly = []
while len(bytecode):
    opcode = bytecode.pop(0)
    if opcode == 0:
        disassembly.append('add\n')
    elif opcode == 1:
        disassembly.append('mul\n')
    elif opcode == 2:
        operand = bytecode.pop(0)
        disassembly.append(f'push {operand}\n')
    elif opcode == 3:
        operand = bytecode.pop(0)
        disassembly.append(f'push_flag_char {operand}\n')
    elif opcode == 4:
        operand = bytecode.pop(0)
        disassembly.append(f'cmp {operand}\n')
    else:
        raise Exception(f'Unknown opcode {opcode}')

with open('disassembly.txt', 'w') as f:
    f.writelines(disassembly)
This produces disassembly which looks like:
push 0
push_flag_char 2
push 47
mul
add
push_flag_char 12
push 24
mul
add
push_flag_char 16
push 67
mul
add
push_flag_char 18
push 89
mul
add
push_flag_char 22
push 59
mul
; 2000+ more lines
This is good progress. Analysing the disassembly indicates that the VM creates a complicated expression involving adding and multiplying several characters of the flag and other numbers, then uses the cmp instruction to compare it to a fixed value from the bytecode. This pattern then repeats for the remainder of the program, giving us a series of equations. For example, the first equation is
40701 == (24 * flag[12]) + (67 * flag[16]) + (89 * flag[18]) + (47 * flag[2]) + (59 * flag[22]) + (61 * flag[41]) + (19 * flag[51]) + (45 * flag[56])
Clearly we do not have enough information to solve this equation on its own, so we will need to solve all of the equations together. Each equation is a linear combination of flag characters, so we can hand the whole system to a math library like sympy, or a theorem prover like z3.
To build each equation we will extend our disassembler to keep track of the stack during each instruction, and represent the values we don’t know (the flag characters) as symbols.
from sympy import symbols, Eq, solve

with open('bytecode.txt', 'r') as f:
    bytecode = [int(x) for x in f.read().split(', ')]

disassembly = []
# the stack now holds sympy expressions instead of concrete values
stack = []
# create 60 symbols, one for each flag character
flag_symbols = symbols('f0:60')

while len(bytecode):
    opcode = bytecode.pop(0)
    if opcode == 0:
        # add: pop the top two expressions and push their sum
        sym1 = stack.pop(0)
        sym2 = stack.pop(0)
        stack.insert(0, sym1 + sym2)
        disassembly.append('add\n')
    elif opcode == 1:
        # mul: pop the top two expressions and push their product
        sym1 = stack.pop(0)
        sym2 = stack.pop(0)
        stack.insert(0, sym1 * sym2)
        disassembly.append('mul\n')
    elif opcode == 2:
        operand = bytecode.pop(0)
        stack.insert(0, operand)
        disassembly.append(f'push {operand}\n')
    elif opcode == 3:
        operand = bytecode.pop(0)
        stack.insert(0, flag_symbols[operand])
        disassembly.append(f'push_flag_char {operand}\n')
    elif opcode == 4:
        # cmp: turn the expression on top of the stack into an equation
        operand = bytecode.pop(0)
        sym = stack.pop(0)
        stack.insert(0, Eq(operand, sym))
        disassembly.append(f'cmp {operand}\n')
    else:
        raise Exception(f'Unknown opcode {opcode}')

with open('disassembly.txt', 'w') as f:
    f.writelines(disassembly)

# solve the equations left on the stack
sol = solve(stack)

# build the flag from the solved character codes
flag = ''
for sym in flag_symbols:
    code = sol[sym]
    flag += chr(code)
print(flag)
At the end of the program, the stack contains the 60 equations we built. We have 60 unknowns, so we can just solve them directly as simultaneous equations. This prints the flag:
UMDCTF{c++_templates_are_the_reason_for_the_butlerian_jihad}
Epilogue
The source code for the challenge can be found here.