Thursday, December 11, 2014

picoCTF 2014: Baleful (re200) Part 2

Welcome to the second part of the Baleful writeup. In the first part, we started reverse engineering Baleful and figuring out how it worked. We eventually deduced that it was a VM that ran an embedded byte code program. By stepping through the VM in GDB, we were able to learn how the registers and stack worked in the VM. Now that we have this knowledge, it's easy to decipher the rest of the instructions. Let's get to it!

We know that the stack pointer is located in [ebp-0x38], so other instructions using that would also be doing something with the stack. Opcodes 0x1e and 0x1f both do, meaning it'd be good to take a look at them. First let's see 0x1e:

 .text:08049B92 loc_8049B92:              ; CODE XREF: sub_804898B+C0 j  
 .text:08049B92                     ; DATA XREF: .rodata:vm_instrs o  
 .text:08049B92         mov   eax, [ebp+ipos] ; jumptable 08048A4B case 30  
 .text:08049B95         add   eax, 1  
 .text:08049B98         movzx  eax, byte_804C0C0[eax]  
 .text:08049B9F         movsx  eax, al  
 .text:08049BA2         mov   [ebp+var_C], eax  
 .text:08049BA5         cmp   [ebp+var_C], 0  
 .text:08049BA9         jz   short loc_8049BC1  
 .text:08049BAB         mov   eax, [ebp+ipos]  
 .text:08049BAE         add   eax, 2  
 .text:08049BB1         add   eax, offset byte_804C0C0  
 .text:08049BB6         mov   eax, [eax]  
 .text:08049BB8         mov   [ebp+var_24], eax  
 .text:08049BBB         add   [ebp+ipos], 6  
 .text:08049BBF         jmp   short loc_8049BDF  
 .text:08049BC1 ; ---------------------------------------------------------------------------  
 .text:08049BC1  
 .text:08049BC1 loc_8049BC1:              ; CODE XREF: sub_804898B+121E j  
 .text:08049BC1         mov   eax, [ebp+ipos]  
 .text:08049BC4         add   eax, 2  
 .text:08049BC7         movzx  eax, byte_804C0C0[eax]  
 .text:08049BCE         movsx  eax, al  
 .text:08049BD1         mov   eax, [ebp+eax*4+regs]  
 .text:08049BD8         mov   [ebp+var_24], eax  
 .text:08049BDB         add   [ebp+ipos], 3  
 .text:08049BDF  
 .text:08049BDF loc_8049BDF:              ; CODE XREF: sub_804898B+1234 j  
 .text:08049BDF         mov   eax, [ebp+stack]  
 .text:08049BE2         sub   eax, 4  
 .text:08049BE5         mov   [ebp+stack], eax  
 .text:08049BE8         mov   eax, [ebp+stack]  
 .text:08049BEB         lea   edx, byte_804C0C0[eax]  
 .text:08049BF1         mov   eax, [ebp+var_24]  
 .text:08049BF4         mov   [edx], eax  
 .text:08049BF6         jmp   short loc_8049C67  

This handler reads the second byte of the instruction and takes one of two paths depending on its value. If it's 0, it jumps to 0x08049bc1; otherwise, it falls through to 0x08049bab. Both paths then meet at 0x08049bdf, where the stack pointer is decremented by 4 and a value gets written onto the stack. This is quite clearly a PUSH instruction. But why two different cases? Well, case 0 loads the value from a register specified in the instruction, while the nonzero case reads a 4-byte constant embedded in the instruction. This PUSH instruction can use either a register or a constant as a data source.
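To make the two paths concrete, here is a minimal Python sketch of the PUSH handler as I read it from the listing above. The dict-based `vm` state and the key names `mem`, `regs`, `ip` and `sp` are my own stand-ins (for byte_804C0C0, the register array, ipos and the stack pointer), not anything taken from Baleful itself.

```python
import struct

def vm_push(vm):
    """Model of opcode 0x1e; `vm` is a hypothetical dict holding the VM state."""
    mem, ip = vm["mem"], vm["ip"]
    mode = mem[ip + 1]
    if mode != 0:
        # Nonzero mode: a 32-bit little-endian constant follows (6-byte instruction).
        value = struct.unpack_from("<I", mem, ip + 2)[0]
        vm["ip"] = ip + 6
    else:
        # Mode 0: the third byte is a register number (3-byte instruction).
        value = vm["regs"][mem[ip + 2]]
        vm["ip"] = ip + 3
    vm["sp"] -= 4                                  # the stack grows downwards
    struct.pack_into("<I", mem, vm["sp"], value)   # store the value in the new slot
```

Note that the stack lives in the same memory area as the bytecode, which is why both are indexed through `mem` here.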

Now let's look at 0x1f, which is much simpler:

 .text:08049BF8 loc_8049BF8:              ; CODE XREF: sub_804898B+C0 j  
 .text:08049BF8                     ; DATA XREF: .rodata:vm_instrs o  
 .text:08049BF8         mov   eax, [ebp+ipos] ; jumptable 08048A4B case 31  
 .text:08049BFB         add   eax, 1  
 .text:08049BFE         movzx  eax, byte_804C0C0[eax]  
 .text:08049C05         movsx  eax, al  
 .text:08049C08         mov   [ebp+var_24], eax  
 .text:08049C0B         mov   eax, [ebp+stack]  
 .text:08049C0E         add   eax, offset byte_804C0C0  
 .text:08049C13         mov   edx, [eax]  
 .text:08049C15         mov   eax, [ebp+var_24]  
 .text:08049C18         mov   [ebp+eax*4+regs], edx  
 .text:08049C1F         mov   eax, [ebp+stack]  
 .text:08049C22         add   eax, 4  
 .text:08049C25         mov   [ebp+stack], eax  
 .text:08049C28         add   [ebp+ipos], 2  
 .text:08049C2C         jmp   short loc_8049C67  

The second byte of the instruction is loaded into EAX and stored temporarily in [ebp-0x24]. Then we read a 4-byte value from the current position of the stack into EDX. Next, we use the value stored in [ebp-0x24] as a register number. This register number determines which register we write EDX into. Finally, the stack pointer is incremented by 4 bytes and ipos advances past the 2-byte instruction. This is clearly a POP instruction. We specify a destination register, and this instruction puts the current stack value inside it before incrementing the stack pointer.
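The same steps as a short Python sketch. As before, the `vm` dict and its keys are hypothetical names I'm using for illustration only.

```python
import struct

def vm_pop(vm):
    """Model of opcode 0x1f; `vm` is a hypothetical dict holding the VM state."""
    mem, ip = vm["mem"], vm["ip"]
    dest = mem[ip + 1]                                             # destination register
    vm["regs"][dest] = struct.unpack_from("<I", mem, vm["sp"])[0]  # read top of stack
    vm["sp"] += 4                                                  # release the slot
    vm["ip"] = ip + 2                                              # opcode + register byte
```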

What instruction should we look at next? It might be a good idea to take a second look at MOV. Originally, it looked far too intimidating to statically analyze, so we just stepped through it in GDB. Now we know more about the VM, and can probably figure the rest out.

 .text:08049A02 mov_8049A02:              ; CODE XREF: sub_804898B+C0 j  
 .text:08049A02                     ; DATA XREF: .rodata:vm_instrs o  
 .text:08049A02         mov   eax, [ebp+ipos] ; jumptable 08048A4B case 24  
 .text:08049A05         add   eax, 1  
 .text:08049A08         movzx  eax, byte_804C0C0[eax]  
 .text:08049A0F         movsx  eax, al  
 .text:08049A12         mov   [ebp+var_C], eax  
 .text:08049A15         mov   eax, [ebp+var_C]  
 .text:08049A18         test  eax, eax  
 .text:08049A1A         jz   short loc_8049A23  
 .text:08049A1C         cmp   eax, 1  
 .text:08049A1F         jz   short loc_8049A57  
 .text:08049A21         jmp   short loc_8049A81  
 .text:08049A23 ; ---------------------------------------------------------------------------  
 .text:08049A23  
 .text:08049A23 loc_8049A23:              ; CODE XREF: sub_804898B+108F j  
 .text:08049A23         mov   eax, [ebp+ipos]  
 .text:08049A26         add   eax, 2  
 .text:08049A29         movzx  eax, byte_804C0C0[eax]  
 .text:08049A30         movsx  eax, al  
 .text:08049A33         mov   edx, [ebp+ipos]  
 .text:08049A36         add   edx, 3  
 .text:08049A39         movzx  edx, byte_804C0C0[edx]  
 .text:08049A40         movsx  edx, dl  
 .text:08049A43         mov   edx, [ebp+edx*4+regs]  
 .text:08049A4A         mov   [ebp+eax*4+regs], edx  
 .text:08049A51         add   [ebp+ipos], 4  
 .text:08049A55         jmp   short loc_8049A81  
 .text:08049A57 ; ---------------------------------------------------------------------------  
 .text:08049A57  
 .text:08049A57 loc_8049A57:              ; CODE XREF: sub_804898B+1094 j  
 .text:08049A57         mov   eax, [ebp+ipos]  
 .text:08049A5A         add   eax, 2  
 .text:08049A5D         movzx  eax, byte_804C0C0[eax]  
 .text:08049A64         movsx  eax, al  
 .text:08049A67         mov   edx, [ebp+ipos]  
 .text:08049A6A         add   edx, 3  
 .text:08049A6D         add   edx, offset byte_804C0C0  
 .text:08049A73         mov   edx, [edx]  
 .text:08049A75         mov   [ebp+eax*4+regs], edx  
 .text:08049A7C         add   [ebp+ipos], 7  
 .text:08049A80         nop  
 .text:08049A81  
 .text:08049A81 loc_8049A81:              ; CODE XREF: sub_804898B+1096 j  
 .text:08049A81                     ; sub_804898B+10CA j  
 .text:08049A81         jmp   loc_8049C67  

Just like before, it's taking the second byte and using it to determine which case to enter. Let's see what each case does. Case 0 reads the third and fourth bytes of the instruction, then uses them as register numbers. It takes the value of the register specified by the fourth byte and puts it in the register specified by the third byte. Looks like case 0 is a register-register MOV. What about case 1? It also uses the third byte as a destination register, but it instead puts a 4-byte constant (included in the instruction) into that register. Similarly to PUSH, MOV has two cases: one for a register source and one for a constant source. The destination is always a register.
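Here is how I'd model MOV in Python, again with a hypothetical `vm` dict standing in for the real VM state. The instruction lengths (4 and 7 bytes) come straight from the `add [ebp+ipos], ...` lines in the listing.

```python
import struct

def vm_mov(vm):
    """Model of opcode 0x18; `vm` is a hypothetical dict holding the VM state."""
    mem, ip = vm["mem"], vm["ip"]
    mode, dest = mem[ip + 1], mem[ip + 2]
    if mode == 0:
        # Register-register: the fourth byte names the source register.
        vm["regs"][dest] = vm["regs"][mem[ip + 3]]
        vm["ip"] = ip + 4
    elif mode == 1:
        # Register-constant: a 32-bit little-endian constant follows the destination.
        vm["regs"][dest] = struct.unpack_from("<I", mem, ip + 3)[0]
        vm["ip"] = ip + 7
    # Any other mode leaves the state untouched, as in the listing.
```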

We know that 0x0f is the CALL instruction. But there are also several other branching instructions we ought to take a look at.

 .text:080496D1 loc_80496D1:              ; CODE XREF: sub_804898B+C0 j  
 .text:080496D1                     ; DATA XREF: .rodata:vm_instrs o  
 .text:080496D1         mov   eax, [ebp+ipos] ; jumptable 08048A4B case 14  
 .text:080496D4         add   eax, 1  
 .text:080496D7         add   eax, offset byte_804C0C0  
 .text:080496DC         mov   eax, [eax]  
 .text:080496DE         mov   [ebp+var_10], eax  
 .text:080496E1         mov   eax, [ebp+var_10]  
 .text:080496E4         mov   [ebp+ipos], eax  
 .text:080496E7         jmp   loc_8049C67  

This reads a new instruction pointer from the opcode data and puts it in ipos. However, the stack is not involved, meaning this is just a normal, unconditional jump. What other kinds of jumps are there?

 .text:080496EC loc_80496EC:              ; CODE XREF: sub_804898B+C0 j  
 .text:080496EC                     ; DATA XREF: .rodata:vm_instrs o  
 .text:080496EC         mov   eax, [ebp+ipos] ; jumptable 08048A4B case 16  
 .text:080496EF         add   eax, 1  
 .text:080496F2         add   eax, offset byte_804C0C0  
 .text:080496F7         mov   eax, [eax]  
 .text:080496F9         mov   [ebp+var_10], eax  
 .text:080496FC         cmp   [ebp+var_28], 0  
 .text:08049700         jz   short loc_804970A  
 .text:08049702         mov   eax, [ebp+ipos]  
 .text:08049705         add   eax, 5  
 .text:08049708         jmp   short loc_804970D  
 .text:0804970A ; ---------------------------------------------------------------------------  
 .text:0804970A  
 .text:0804970A loc_804970A:              ; CODE XREF: sub_804898B+D75 j  
 .text:0804970A         mov   eax, [ebp+var_10]  
 .text:0804970D  
 .text:0804970D loc_804970D:              ; CODE XREF: sub_804898B+D7D j  
 .text:0804970D         mov   [ebp+ipos], eax  
 .text:08049710         jmp   loc_8049C67

This does almost the same thing as JMP, but only jumps to the new target if [ebp-0x28] is 0; otherwise, ipos just advances past the 5-byte instruction. Recall that on x86, jz is the same instruction as je (jump if equal): a preceding cmp subtracts its two operands, and jz takes the branch if the result was 0 (indicating equality). What's here is very similar, so maybe [ebp-0x28] holds the result of that subtraction. That would make it a condition register. If we look further down, many opcodes also check the condition register in similar ways, but with different types of jumps.
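A quick Python sketch of that conditional jump, assuming (as guessed above) that [ebp-0x28] really is a condition register. I'm calling it `cr` here, and the `vm` dict is a hypothetical stand-in for the VM state.

```python
import struct

def vm_jz(vm):
    """Model of opcode 0x10: branch when the condition register is zero."""
    target = struct.unpack_from("<I", vm["mem"], vm["ip"] + 1)[0]
    if vm["cr"] == 0:
        vm["ip"] = target      # taken: continue at the encoded offset
    else:
        vm["ip"] += 5          # not taken: skip the opcode and its 4-byte target
```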

If there are multiple jumps only branching if a comparison holds true, there must be some instructions that perform those comparisons. Both 0x16 and 0x17 modify the condition register, so they're good candidates. We'll take a look at 0x16 first:

 .text:080497E2 loc_80497E2:              ; CODE XREF: sub_804898B+C0 j  
 .text:080497E2                     ; DATA XREF: .rodata:vm_instrs o  
 .text:080497E2         mov   eax, [ebp+ipos] ; jumptable 08048A4B case 22  
 .text:080497E5         add   eax, 1  
 .text:080497E8         movzx  eax, byte_804C0C0[eax]  
 .text:080497EF         movsx  eax, al  
 .text:080497F2         mov   [ebp+var_C], eax  
 .text:080497F5         mov   eax, [ebp+var_C]  
 .text:080497F8         cmp   eax, 1  
 .text:080497FB         jz   short loc_804985B  
 .text:080497FD         cmp   eax, 1  
 .text:08049800         jg   short loc_804980B  
 .text:08049802         test  eax, eax  
 .text:08049804         jz   short loc_804981E  
 .text:08049806         jmp   loc_80498E0  
 .text:0804980B ; ---------------------------------------------------------------------------  
 .text:0804980B  
 .text:0804980B loc_804980B:              ; CODE XREF: sub_804898B+E75 j  
 .text:0804980B         cmp   eax, 2  
 .text:0804980E         jz   short loc_804988B  
 .text:08049810         cmp   eax, 4  
 .text:08049813         jz   loc_80498BB  
 .text:08049819         jmp   loc_80498E0  
 .text:0804981E ; ---------------------------------------------------------------------------  
 .text:0804981E  
 .text:0804981E loc_804981E:              ; CODE XREF: sub_804898B+E79 j  
 .text:0804981E         mov   eax, [ebp+ipos]  
 .text:08049821         add   eax, 2  
 .text:08049824         movzx  eax, byte_804C0C0[eax]  
 .text:0804982B         movsx  eax, al  
 .text:0804982E         mov   eax, [ebp+eax*4+regs]  
 .text:08049835         mov   [ebp+var_20], eax  
 .text:08049838         mov   eax, [ebp+ipos]  
 .text:0804983B         add   eax, 3  
 .text:0804983E         movzx  eax, byte_804C0C0[eax]  
 .text:08049845         movsx  eax, al  
 .text:08049848         mov   eax, [ebp+eax*4+regs]  
 .text:0804984F         mov   [ebp+var_1C], eax  
 .text:08049852         add   [ebp+ipos], 4  
 .text:08049856         jmp   loc_80498E0  
 .text:0804985B ; ---------------------------------------------------------------------------  
 .text:0804985B  
 .text:0804985B loc_804985B:              ; CODE XREF: sub_804898B+E70 j  
 .text:0804985B         mov   eax, [ebp+ipos]  
 .text:0804985E         add   eax, 2  
 .text:08049861         movzx  eax, byte_804C0C0[eax]  
 .text:08049868         movsx  eax, al  
 .text:0804986B         mov   eax, [ebp+eax*4+regs]  
 .text:08049872         mov   [ebp+var_20], eax  
 .text:08049875         mov   eax, [ebp+ipos]  
 .text:08049878         add   eax, 3  
 .text:0804987B         add   eax, offset byte_804C0C0  
 .text:08049880         mov   eax, [eax]  
 .text:08049882         mov   [ebp+var_1C], eax  
 .text:08049885         add   [ebp+ipos], 7  
 .text:08049889         jmp   short loc_80498E0  
 .text:0804988B ; ---------------------------------------------------------------------------  
 .text:0804988B  
 .text:0804988B loc_804988B:              ; CODE XREF: sub_804898B+E83 j  
 .text:0804988B         mov   eax, [ebp+ipos]  
 .text:0804988E         add   eax, 2  
 .text:08049891         add   eax, offset byte_804C0C0  
 .text:08049896         mov   eax, [eax]  
 .text:08049898         mov   [ebp+var_20], eax  
 .text:0804989B         mov   eax, [ebp+ipos]  
 .text:0804989E         add   eax, 6  
 .text:080498A1         movzx  eax, byte_804C0C0[eax]  
 .text:080498A8         movsx  eax, al  
 .text:080498AB         mov   eax, [ebp+eax*4+regs]  
 .text:080498B2         mov   [ebp+var_1C], eax  
 .text:080498B5         add   [ebp+ipos], 7  
 .text:080498B9         jmp   short loc_80498E0  
 .text:080498BB ; ---------------------------------------------------------------------------  
 .text:080498BB  
 .text:080498BB loc_80498BB:              ; CODE XREF: sub_804898B+E88 j  
 .text:080498BB         mov   eax, [ebp+ipos]  
 .text:080498BE         add   eax, 2  
 .text:080498C1         add   eax, offset byte_804C0C0  
 .text:080498C6         mov   eax, [eax]  
 .text:080498C8         mov   [ebp+var_20], eax  
 .text:080498CB         mov   eax, [ebp+ipos]  
 .text:080498CE         add   eax, 6  
 .text:080498D1         add   eax, offset byte_804C0C0  
 .text:080498D6         mov   eax, [eax]  
 .text:080498D8         mov   [ebp+var_1C], eax  
 .text:080498DB         add   [ebp+ipos], 0Ah  
 .text:080498DF         nop  
 .text:080498E0  
 .text:080498E0 loc_80498E0:              ; CODE XREF: sub_804898B+E7B j  
 .text:080498E0                     ; sub_804898B+E8E j ...  
 .text:080498E0         mov   eax, [ebp+var_1C]  
 .text:080498E3         mov   edx, [ebp+var_20]  
 .text:080498E6         and   eax, edx  
 .text:080498E8         mov   [ebp+cr], eax  
 .text:080498EB         jmp   loc_8049C67  

It looks like an incredibly complex instruction at first. There are multiple code paths depending on the second byte, and more of them than in anything we've seen so far. However, all code paths ultimately lead to 0x080498e0. That shared tail does a bitwise AND of the two operands and puts the result in CR. This is exactly what the x86 TEST instruction does, so we're probably looking at the same thing. As for all the cases, they look daunting but just require careful investigation. The cases just specify the format of the two source operands:
  • 0 - Register-register
  • 1 - Register-constant
  • 2 - Constant-register
  • 4 - Constant-constant
Knowing how that big set of cases works makes practically every remaining function easy to reverse engineer. 0x17 is pretty much the same, but it just performs a subtraction at the end rather than AND, which is what CMP does. Almost all of the arithmetic instructions also work like this. IDIV, the division instruction, has a bit of weirdness, though. It needs two destination registers, one for the quotient and one for the remainder. This initially tripped me up a little.
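Since this operand format shows up over and over, it's worth capturing once. Below is a rough Python sketch of how the two source operands of a TEST/CMP-style instruction could be decoded. The function name and return shape are my own choices; the byte offsets and instruction lengths (4, 7, 7 and 10 bytes) are the ones visible in the 0x16 listing.

```python
import struct

# Format byte -> (is the first operand a constant?, is the second one a constant?)
FORMATS = {0: (False, False), 1: (False, True), 2: (True, False), 4: (True, True)}

def decode_operands(code, pos):
    """Decode both source operands of the instruction starting at `pos`.

    Returns (list of operand strings, total instruction length in bytes).
    """
    cursor = pos + 2                      # skip the opcode and the format byte
    operands = []
    for is_const in FORMATS[code[pos + 1]]:
        if is_const:
            value = struct.unpack_from("<I", code, cursor)[0]
            operands.append(f"{value:#x}")
            cursor += 4                   # constants take 4 bytes
        else:
            operands.append(f"r{code[cursor]}")
            cursor += 1                   # register numbers take 1 byte
    return operands, cursor - pos
```

For example, the bytes `17 00 01 03` decode to the operands `r1` and `r3` with a length of 4, which is a register-register CMP.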

There are, however, two other weird instructions, 0x1b and 0x1c. They don't immediately look like anything in the x86 instruction set, so we'll have to investigate them more closely.

 .text:08049AEC loc_8049AEC:              ; CODE XREF: sub_804898B+C0 j  
 .text:08049AEC                     ; DATA XREF: .rodata:vm_instrs o  
 .text:08049AEC         mov   eax, [ebp+ipos] ; jumptable 08048A4B case 27  
 .text:08049AEF         add   eax, 1  
 .text:08049AF2         movzx  eax, byte_804C0C0[eax]  
 .text:08049AF9         movsx  eax, al  
 .text:08049AFC         mov   [ebp+var_24], eax  
 .text:08049AFF         mov   eax, [ebp+ipos]  
 .text:08049B02         add   eax, 2  
 .text:08049B05         movzx  eax, byte_804C0C0[eax]  
 .text:08049B0C         movsx  eax, al  
 .text:08049B0F         mov   [ebp+var_20], eax  
 .text:08049B12         add   [ebp+ipos], 3  
 .text:08049B16         mov   eax, [ebp+var_20]  
 .text:08049B19         mov   eax, [ebp+eax*4+regs]  
 .text:08049B20         add   eax, offset byte_804C0C0  
 .text:08049B25         mov   edx, [eax]  
 .text:08049B27         mov   eax, [ebp+var_24]  
 .text:08049B2A         mov   [ebp+eax*4+regs], edx  
 .text:08049B31         mov   eax, [ebp+var_24]  
 .text:08049B34         mov   eax, [ebp+eax*4+regs]  
 .text:08049B3B         mov   [ebp+cr], eax  
 .text:08049B3E         jmp   loc_8049C67  
 .text:08049B43 ; ---------------------------------------------------------------------------  
 .text:08049B43  
 .text:08049B43 loc_8049B43:              ; CODE XREF: sub_804898B+C0 j  
 .text:08049B43                     ; DATA XREF: .rodata:vm_instrs o  
 .text:08049B43         mov   eax, [ebp+ipos] ; jumptable 08048A4B case 28  
 .text:08049B46         add   eax, 1  
 .text:08049B49         movzx  eax, byte_804C0C0[eax]  
 .text:08049B50         movsx  eax, al  
 .text:08049B53         mov   [ebp+var_24], eax  
 .text:08049B56         mov   eax, [ebp+ipos]  
 .text:08049B59         add   eax, 2  
 .text:08049B5C         movzx  eax, byte_804C0C0[eax]  
 .text:08049B63         movsx  eax, al  
 .text:08049B66         mov   [ebp+var_20], eax  
 .text:08049B69         add   [ebp+ipos], 3  
 .text:08049B6D         mov   eax, [ebp+var_24]  
 .text:08049B70         mov   eax, [ebp+eax*4+regs]  
 .text:08049B77         add   eax, offset byte_804C0C0  
 .text:08049B7C         mov   edx, [ebp+var_20]  
 .text:08049B7F         mov   edx, [ebp+edx*4+regs]  
 .text:08049B86         mov   [eax], edx  
 .text:08049B88         mov   eax, [eax]  
 .text:08049B8A         mov   [ebp+cr], eax  
 .text:08049B8D         jmp   loc_8049C67  

The two are similar, so let's approach 0x1b first. The second and third bytes are both used for register accesses. It reads the data of the register specified by the third byte. This data is then used as an offset into the bytecode area, and the 4 bytes at that offset are read. Finally, it writes the data into the register specified by the second byte. It might not be obvious at first, but this is just a memory read instruction. It reads memory from the address specified in the source register, and puts that data in the destination register. 0x1c does the same thing in reverse: it stores the register given by the third byte at the address held in the register given by the second byte. I decided to call them mem2reg and reg2mem, respectively. They're really just RISC-like load and store instructions.
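In Python terms, the pair could be sketched like this. The `vm` dict is again a hypothetical model of the VM state; the update of `cr` mirrors the last few instructions of each handler in the listing.

```python
import struct

def mem2reg(vm, dest, src):
    """Opcode 0x1b: load the 32-bit word at offset regs[src] into regs[dest]."""
    vm["regs"][dest] = struct.unpack_from("<I", vm["mem"], vm["regs"][src])[0]
    vm["cr"] = vm["regs"][dest]     # the handler also copies the value into CR

def reg2mem(vm, dest, src):
    """Opcode 0x1c: store regs[src] as a 32-bit word at offset regs[dest]."""
    struct.pack_into("<I", vm["mem"], vm["regs"][dest], vm["regs"][src])
    vm["cr"] = vm["regs"][src]      # CR receives the value that was just written
```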

We could continue and reverse engineer all of the instructions, but it would get really repetitive. The process is basically the same for all of the VM instructions. If we manage to reverse engineer all 33 instructions, we get this list:
  • 0x00: nop
  • 0x01: ret
  • 0x02: add (3 arguments; destination and two source)
  • 0x03: sub (3 arguments; destination and two source)
  • 0x04: imul (3 arguments; destination and two source)
  • 0x05: idiv (4 arguments; quotient/remainder destinations and two source)
  • 0x06: xor (3 arguments; destination and two source)
  • 0x07: neg (2 arguments; destination and source)
  • 0x08: not (2 arguments; destination and source)
  • 0x09: and (3 arguments; destination and two source)
  • 0x0a: or (3 arguments; destination and two source)
  • 0x0b: logicnot (2 arguments; destination and source)
  • 0x0c: shl (3 arguments; destination and two source)
  • 0x0d: sar (3 arguments; destination and two source)
  • 0x0e: jmp (1 argument; offset to go to)
  • 0x0f: call (1 argument; offset to go to)
  • 0x10: jz (1 argument; offset to go to)
  • 0x11: js (1 argument; offset to go to)
  • 0x12: jle (1 argument; offset to go to)
  • 0x13: jg (1 argument; offset to go to)
  • 0x14: jns (1 argument; offset to go to)
  • 0x15: jnz (1 argument; offset to go to)
  • 0x16: test (2 arguments; two operands)
  • 0x17: cmp (2 arguments; two operands)
  • 0x18: mov (2 arguments; destination and source)
  • 0x19: inc (1 argument; is both destination and source)
  • 0x1a: dec (1 argument; is both destination and source)
  • 0x1b: mem2reg (2 arguments; destination and source registers)
  • 0x1c: reg2mem (2 arguments; destination and source registers)
  • 0x1d: hlt
  • 0x1e: push (1 argument; source)
  • 0x1f: pop (1 argument; destination register)
  • 0x20: io (1 argument; function code)
With all these instructions figured out, we can write a short Python script that disassembles the Baleful bytecode. I won't bother documenting that here; writing the disassembler is a fairly easy and boring task. Something more interesting, though, is how we get access to the bytecode in the first place.
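Just to give a flavor of it, here is a stripped-down sketch of what such a disassembler might look like. This is not my actual script: it only handles nop, ret, jmp, call and the two MOV forms, and the output format is simply chosen to match the listings in this post.

```python
import struct

BRANCHES = {0x0e: "jmp", 0x0f: "call"}

def disassemble(code, base):
    """Return (address, text) pairs for the handful of opcodes handled here."""
    out, pos = [], 0
    while pos < len(code):
        op = code[pos]
        if op in (0x00, 0x01):
            text, size = ("nop", "ret")[op], 1
        elif op in BRANCHES:
            target = struct.unpack_from("<I", code, pos + 1)[0]
            text, size = f"{BRANCHES[op]} {target:#x}", 5
        elif op == 0x18 and code[pos + 1] == 0:      # mov register, register
            text, size = f"mov r{code[pos + 2]}, r{code[pos + 3]}", 4
        elif op == 0x18 and code[pos + 1] == 1:      # mov register, constant
            const = struct.unpack_from("<I", code, pos + 3)[0]
            text, size = f"mov r{code[pos + 2]}, {const:#x}", 7
        else:
            raise ValueError(f"unhandled opcode {op:#x} at {base + pos:#x}")
        out.append((base + pos, text))
        pos += size
    return out
```

A full version would add the remaining opcodes in the same way, using the operand formats from the list above.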

Remember back in part 1, when I suspected that the bytecode was being encrypted or packed? I said this because the data looked random, and taking a new look at the VM with our recent discoveries in mind reveals that it's not valid bytecode at all. In fact, we can prove the hypothesis correct by comparing what I dumped from 0x0804db39 in memory to what's in the binary:

 (gdb) x/16xb 0x0804db39                        
  0x804db39:  0x18 0x01 0x00 0x6c 0x00 0x00 0x00 0x0f           
  0x804db41:  0x3f 0x10 0x00 0x00 0x18 0x01 0x00 0x65           
  (gdb)  

 .data:0804DB39         db 19h  
 .data:0804DB3A         db 43h ; C  
 .data:0804DB3B         db 0CFh ; -  
 .data:0804DB3C         db 18h  
 .data:0804DB3D         db  1  
 .data:0804DB3E         db 42h ; B  
 .data:0804DB3F         db 0CFh ; -  
 .data:0804DB40         db 7Bh ; {  
 .data:0804DB41         db 3Eh ; >  
 .data:0804DB42         db 52h ; R  
 .data:0804DB43         db 0CFh ; -  
 .data:0804DB44         db 74h ; t  
 .data:0804DB45         db 19h  
 .data:0804DB46         db 43h ; C  
 .data:0804DB47         db 0CFh ; -  
 .data:0804DB48         db 11h  

They're not even remotely similar. Something is definitely happening to the bytecode. It's likely that what's in memory at the time is valid, so let's just dump that to a file from GDB:

 (gdb) dump memory memdump.bin 0x0804c0c0 0x0804e000  

If we look at memdump.bin, it's much different. Rather than random garbage, we get something that appears to be proper bytecode. If we try disassembling it, it results in something sane, which is also a good sign. Now for all our reverse engineering efforts, we're rewarded with...another binary to reverse engineer. Oh well, c'est la vie (not really).

The ASM code is interesting. It actually has many elements of RISC architectures, like the larger number of registers and dedicated memory access instructions. The ASM generated is quite big, though not as big as Baleful. Let's start at the beginning of the bytecode.

 0x1000: mov r0, 0x103a  
 0x1007: mov r1, 0x1edb  
 0x100e: mov r3, r0  
 0x1012: mov r5, 0x174cf42  
 0x1019: mem2reg r4, r3  
 0x101c: xor r4, r4, r5  
 0x1021: reg2mem r3, r4  
 0x1024: add r3, r3, 0x4  
 0x102c: cmp r1, r3  
 0x1030: jns 0x1019  
 0x1035: jmp 0x103a  

This function, right at the beginning, is interesting. Starting at address 0x103a, it reads 4 bytes from memory, XORs them against a constant bitmask, and writes them back to the same spot. It doesn't stop until it passes 0x1edb. The purpose of this code is easily apparent: it's decrypting the rest of the bytecode! This is solid proof that the decryption hypothesis was correct. After the decryption, it branches to 0x103a, which itself just goes to 0x1bc0.
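The loop is simple enough to replay in Python. This is a sketch under the assumption that bytecode addresses are plain offsets into the memory area starting at 0x0804c0c0, which is consistent with the GDB address used earlier (0x0804db39 - 0x0804c0c0 = 0x1a79). The function name and the `mem` buffer are my own.

```python
import struct

KEY = 0x174CF42                 # constant loaded into r5
START, END = 0x103A, 0x1EDB     # values loaded into r0 and r1

def decrypt(mem):
    """XOR every 32-bit little-endian word from START up to END with KEY, in place."""
    addr = START
    while addr <= END:          # jns loops back while END - addr is non-negative
        word = struct.unpack_from("<I", mem, addr)[0]
        struct.pack_into("<I", mem, addr, word ^ KEY)
        addr += 4
```

Running this over the 16 bytes IDA shows at 0x0804db39 (placed at offset 0x1a79 in an otherwise empty buffer) produces exactly the 16 bytes GDB printed from the same address at runtime.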

(Before going there, though, 0x103f has a bunch of I/O instructions, each one for a specific I/O function in the VM. It's helpful, but not strictly necessary, to give the ones we recognize names. In general, it's a good idea to break up blocks of code separated by RET instructions into separate functions.)

0x1bc0 appears to be the main function of the bytecode program. It starts out with a call to 0x1a6d, which is just this code sequence:

 0x1a6d: mov r0, 0x50  
 0x1a74: call 0x103f  
 0x1a79: mov r0, 0x6c  
 0x1a80: call 0x103f  
 0x1a85: mov r0, 0x65  
 0x1a8c: call 0x103f  
 0x1a91: mov r0, 0x61  
 0x1a98: call 0x103f  
 0x1a9d: mov r0, 0x73  
 0x1aa4: call 0x103f  
 0x1aa9: mov r0, 0x65  
 0x1ab0: call 0x103f  
 0x1ab5: mov r0, 0x20  
 0x1abc: call 0x103f  
 0x1ac1: mov r0, 0x65  
 0x1ac8: call 0x103f  
 0x1acd: mov r0, 0x6e  
 0x1ad4: call 0x103f  
 0x1ad9: mov r0, 0x74  
 0x1ae0: call 0x103f  
 0x1ae5: mov r0, 0x65  
 0x1aec: call 0x103f  
 0x1af1: mov r0, 0x72  
 0x1af8: call 0x103f  
 0x1afd: mov r0, 0x20  
 0x1b04: call 0x103f  
 0x1b09: mov r0, 0x79  
 0x1b10: call 0x103f  
 0x1b15: mov r0, 0x6f  
 0x1b1c: call 0x103f  
 0x1b21: mov r0, 0x75  
 0x1b28: call 0x103f  
 0x1b2d: mov r0, 0x72  
 0x1b34: call 0x103f  
 0x1b39: mov r0, 0x20  
 0x1b40: call 0x103f  
 0x1b45: mov r0, 0x70  
 0x1b4c: call 0x103f  
 0x1b51: mov r0, 0x61  
 0x1b58: call 0x103f  
 0x1b5d: mov r0, 0x73  
 0x1b64: call 0x103f  
 0x1b69: mov r0, 0x73  
 0x1b70: call 0x103f  
 0x1b75: mov r0, 0x77  
 0x1b7c: call 0x103f  
 0x1b81: mov r0, 0x6f  
 0x1b88: call 0x103f  
 0x1b8d: mov r0, 0x72  
 0x1b94: call 0x103f  
 0x1b99: mov r0, 0x64  
 0x1ba0: call 0x103f  
 0x1ba5: mov r0, 0x3a  
 0x1bac: call 0x103f  
 0x1bb1: mov r0, 0x20  
 0x1bb8: call 0x103f  
 0x1bbd: ret  
 0x1bbe: nop  
 0x1bbf: nop  

It's a series of instructions all doing the same thing: loading a byte into r0 and calling 0x103f. 0x103f contains the call to print_char using the I/O instruction, so what's getting loaded into r0 is then printed. If we treat all of these bytes as ASCII characters and try to reconstruct the message, it turns out to be "Please enter your password: ", meaning 0x1a6d prints the password prompt.
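Reconstructing the message by hand is tedious, so here is a tiny Python snippet that does it. The list is just the r0 constants copied from the listing above, in order.

```python
# Constants loaded into r0 before each call to 0x103f, in listing order.
R0_VALUES = [
    0x50, 0x6c, 0x65, 0x61, 0x73, 0x65, 0x20,
    0x65, 0x6e, 0x74, 0x65, 0x72, 0x20,
    0x79, 0x6f, 0x75, 0x72, 0x20,
    0x70, 0x61, 0x73, 0x73, 0x77, 0x6f, 0x72, 0x64, 0x3a, 0x20,
]

message = bytes(R0_VALUES).decode("ascii")
print(repr(message))  # 'Please enter your password: '
```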

Once control returns to main, main loads 0x04 into r0, 0x1e into r1, and calls 0x1080. What does 0x1080 do?

 0x1080: push r30  
 0x1083: push r29  
 0x1086: mov r30, r1  
 0x108a: imul r0, r30, r0  
 0x108f: add r0, r0, 0x8  
 0x1097: mov r29, r0  
 0x109b: io 17  
 0x109d: test r0, r0  
 0x10a1: jz 0x10c2  
 0x10a6: mov r4, r29  
 0x10aa: sar r4, r4, 0x3  
 0x10b2: reg2mem r0, r4  
 0x10b5: add r0, r0, 0x8  
 0x10bd: pop r29  
 0x10bf: pop r30  
 0x10c1: ret  
 0x10c2: nop  

It multiplies r1 (which is 0x1e) by r0 (which is 0x04), adds 8 to it, and stores it in r0. That value becomes an argument to I/O function 0x11. The function then ensures that the I/O function didn't return 0 and writes the block size, shifted right by 3, to the beginning of the returned memory. Finally, it increments the address by 8 and returns it. Although I didn't know for sure what 0x1080 was doing, it reminded me a lot of malloc. It makes a request to the VM which returns a memory address, and then writes some extra data to the beginning of the buffer (which the user does not see), like heap block headers. I assumed it was something like malloc(block_size, num_blocks), which turned out to be correct. You can also confirm this by looking at Baleful, but aren't we tired of that by this point?
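Here's a rough Python model of that wrapper. Everything about the allocator itself is an assumption: I don't model I/O function 0x11 at all, and instead pretend it hands out memory from a simple bump pointer (`heap_top`). The failure path at 0x10c2 is left out as well.

```python
import struct

def vm_alloc(mem, heap_top, elem_size, count):
    """Model of 0x1080. Returns (user_address, new_heap_top).

    `heap_top` and the bump allocation are hypothetical stand-ins for
    whatever I/O function 0x11 does inside the host VM.
    """
    total = count * elem_size + 8                     # imul, then add 8 for the header
    block = heap_top                                  # pretend io 17 returned this address
    struct.pack_into("<I", mem, block, total >> 3)    # sar by 3, written at the block start
    return block + 8, heap_top + total                # the caller sees the address past the header
```

For main's request (0x1e entries of 4 bytes each), `total` is 128 and the header value is 16.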

So main is allocating an array of 0x1e 4-byte entries. We don't know what it's for at this point. The returned buffer is saved in r11. main then loads 0 into r9 and jumps to 0x1d66.

 0x1d5e: add r9, r9, 0x1  
 0x1d66: mov r30, r9  
 0x1d6a: mov r0, 0x1e  
 0x1d71: cmp r30, r0  
 0x1d75: js 0x1bfb  
 0x1d7a: call 0x104b
 0x1d7f: mov r1, r0
 0x1d83: mov r30, r1
 0x1d87: mov r0, 0xa
 0x1d8e: cmp r30, r0
 0x1d92: jnz 0x1d9c
 0x1d97: jmp 0x1ebe  

We actually start out at the second instruction of an existing function, which is interesting. Anyway, it first compares r9 against 0x1e. If r9 is still less than 0x1e, it jumps to 0x1bfb, which we'll look at in a second. Once r9 has reached 0x1e, it makes a call to 0x104b, which calls input_char using the I/O instruction. Then it checks the value of the character, and if it's anything besides 0xa (newline, indicating end of input), it prints "Sorry, wrong password!" and terminates. What this appears to be is a password length check. 0x1e is likely the password length, and if we enter more than 0x1e characters (not counting newline), the password is automatically wrong.

That's only if we fail the check, though. If the check succeeds, the function jumps to 0x1bfb, which is shown below.

 0x1bfb: mov r30, r9  
 0x1bff: imul r30, r30, 0x4  
 0x1c07: mov r0, r11  
 0x1c0b: add r30, r30, r0  
 0x1c10: mov r10, r30  
 0x1c14: call 0x104b  
 0x1c19: mov r1, r0  
 0x1c1d: reg2mem r10, r1  
 0x1c20: mem2reg r1, r10  
 0x1c23: mov r30, r1  
 0x1c27: mov r0, 0xa  
 0x1c2e: cmp r30, r0  
 0x1c32: jz 0x1c3c  
 0x1c37: jmp 0x1d5e  

Recall that r9 contains 0 on the first pass, and r11 has the 0x1e*4-byte array we allocated. It uses r9 as an index into the array and puts the address of that entry in r10. Then it makes a call to 0x104b, which we already know reads a character from stdin. This character is returned in r0, and gets written to the array spot that we put in r10. So the buffer we allocated is being used to store the password that the user entered. After storing it in the buffer, it once again checks if the character is a newline. In this case, receiving a newline means that the password is shorter than 0x1e characters, further supporting the idea that 0x1e is the password length. Finally, it goes back to 0x1d5e, which increments r9 by one and repeats the loop.
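Put together, the read loop behaves roughly like the Python sketch below. The `read_char` callable is a hypothetical stand-in for the call to 0x104b, and I'm assuming that both the early-newline branch (0x1c3c) and the branch taken when the character after the password isn't a newline (0x1d9c) end in failure, which I model by returning None.

```python
MAX_LEN = 0x1E
NEWLINE = 0x0A

def read_password(read_char):
    """Return the list of characters read, or None if the length check fails."""
    buf = []
    for _ in range(MAX_LEN):          # r9 counts up from 0 while r9 < 0x1e
        ch = read_char()
        buf.append(ch)                # stored before the newline test, as at 0x1c1d
        if ch == NEWLINE:
            return None               # newline arrived too early (jz 0x1c3c)
    # All 0x1e entries are filled; the next character must be the newline.
    return buf if read_char() == NEWLINE else None
```

Each stored character occupies a 4-byte entry in the real buffer; a Python list of ints hides that detail.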

We've now identified the loop which reads the password from the user. Now what happens after reading it? Assuming the length is correct, control proceeds to 0x1ebe:

 0x1ebe: mov r0, r11  
 0x1ec2: call 0x12a9  

This simply puts the password buffer in r0 and calls 0x12a9. Since we already have the password, the only thing left to do is check it. That's probably what 0x12a9 is for.

 0x12a9: push r9  
 0x12ac: push r10  
 0x12af: mov r10, r0               # r10 = Password buffer  
 0x12b3: mov r1, 0x1e  
 0x12ba: mov r0, 0x4  
 0x12c1: call 0x1080  
 0x12c6: mov r9, r0                # r9 = Secondary password buffer?  
 0x12ca: mov r1, 0x4  
 0x12d1: mov r0, 0x4  
 0x12d8: call 0x1080  
 0x12dd: mov r5, r0                # r5 = r0 = Unknown 16-byte buffer  

0x12a9 starts out by allocating two new buffers. Both have an unknown purpose, though one of them is the same length as the password buffer, so it's probably related. Afterwards, we just have a bunch of code which fills the buffers up. It's a huge amount of code, so I'm not listing it all here. I'll just put some C code that shows the buffer contents:

 u32 buf1[30] = {0x8d, 0x6f, 0x00, 0x24, 0x98, 0x7c, 0x10, 0x10, 0x9c, 0x60, 0x07, 0x10, 0x8b, 0x63, 0x10, 0x10, 0x9c, 0x60, 0x07, 0x10, 0x85, 0x61, 0x11, 0x3c, 0xa2, 0x61, 0x0b, 0x10, 0x90, 0x77};  
 u32 buf2[4] = {0xfd, 0x0e, 0x63, 0x4f};  

After the buffers are filled up, we finally get to the meat of the function, which actually checks the password. Both buffers are involved in an interesting way.

 true_check:  
 0x17b5: mov r3, 0x0  
 0x17bc: jmp 0x17c6  
 0x17c1: jmp 0x1878  
 0x17c6: mov r1, 0x0  
 0x17cd: mov r2, 0x0  
 0x17d4: jmp 0x1860
  
 actual_check:  
 0x17d9: mov r30, r2  
 0x17dd: imul r30, r30, 0x4  
 0x17e5: mov r0, r10               # r0 = Password buffer  
 0x17e9: add r30, r30, r0  
 0x17ee: mov r3, r30  
 0x17f2: mem2reg r4, r3               # r4 = password[index]  
 0x17f5: idiv r0, r3, r2, 4           # r3 = index % 4  
 0x17fd: nop  
 0x17fe: mov r30, r3  
 0x1802: imul r30, r30, 0x4  
 0x180a: mov r0, r5                    # r0 = 16-byte buffer  
 0x180e: add r30, r30, r0  
 0x1813: mov r3, r30  
 0x1817: mem2reg r3, r3               # r3 = buf2[index % 4]  
 0x181a: xor r4, r4, r3               # r4 = password[index] ^ buf2[index % 4]  
 0x181f: mov r30, r2  
 0x1823: imul r30, r30, 0x4  
 0x182b: mov r0, r9                    # r0 = Other buffer  
 0x182f: add r30, r30, r0  
 0x1834: mov r3, r30  
 0x1838: mem2reg r3, r3               # r3 = buf1[index]  
 0x183b: mov r30, r4  
 0x183f: mov r0, r3  
 0x1843: cmp r30, r0                    # password[index] ^ buf2[index % 4] == buf1[index]  
 0x1847: jnz 0x1851  
 0x184c: jmp 0x1858  
 0x1851: mov r1, 0x1  
 0x1858: add r2, r2, 0x1
  
 0x1860: mov r3, r1  
 0x1864: mov r30, r2  
 0x1868: mov r0, 0x1e  
 0x186f: cmp r30, r0  
 0x1873: js 0x17d9
  
 end_of_check:  
 0x1878: test r3, r3  
 0x187c: jnz 0x1952 (check_failed)  

r1, r2, and r3 are set to 0 before jumping to 0x1860. It seems r2 is being used to keep track of the current index, since once it reaches 0x1e, the check ends. r3 seems to be a status indicator as to whether the check succeeded; r1 is copied into it at 0x1860 on every pass. If it's anything but 0, it prints the "Sorry, wrong password!" message, but if it is 0, it prints "Congratulations!" Until r2 hits 0x1e, it's looping through 0x17d9, which has to be what's actually checking the contents of the password.

It uses the index to grab the current character of the password and put it in r4. Then it does something weird: idiv r0, r3, r2, 4. idiv is one of the more complicated instructions done by the VM. It has two different destination registers, one for the quotient of division and one for the remainder. r0, the quotient, gets trashed immediately afterwards, so the program only cares about the remainder in r3. r3 would then equal index % 4.

Why is it computing the index modulo 4? Immediately after this division, it uses index % 4 as an index into that 16-byte buffer. Remember that the buffer contained 4 entries, so this remainder division is making sure we don't go past that buffer. In essence, it rotates through that 16-byte buffer depending on our index in the password. What's read from that buffer is stored in r3.
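Here's a tiny Python sketch of that behaviour. vm_idiv is a made-up name for illustration, and it only models non-negative operands (Python's divmod rounds differently from x86 idiv when negatives are involved).

```python
def vm_idiv(dividend, divisor):
    """Return (quotient, remainder), mirroring idiv's two destination registers."""
    quotient, remainder = divmod(dividend, divisor)
    return quotient, remainder


MASK_ENTRIES = 4  # the 16-byte buffer holds four 4-byte entries

if __name__ == "__main__":
    for index in range(8):
        r0, r3 = vm_idiv(index, MASK_ENTRIES)  # r0 is thrown away by the bytecode
        print(f"index {index} -> mask slot {r3}")
```

The remainder just cycles 0, 1, 2, 3, 0, 1, 2, 3, so every fourth password character ends up paired with the same entry of the small buffer.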

Finally, it XORs the character we entered with the value read from the 16-byte buffer. It then compares it against the corresponding entry in the other buffer, the one that's the same size as the password buffer. The loop continues either way, but if they're not equal, it puts 1 in r1, which gets copied to r3 at 0x1860. This, as we found above, causes the check to ultimately fail and print the "Sorry, wrong password!" message. All the checks have to succeed for the "Congratulations!" message to print.
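Put together, the loop behaves roughly like the Python below. This is my own reconstruction from the disassembly, not code from the challenge; check_password is a hypothetical name, and the demo uses toy data rather than the real buffers.

```python
def check_password(password, expected, mask):
    """Return 0 if every XORed character matches, 1 otherwise (like r3)."""
    failed = 0                                        # r1
    for index in range(len(expected)):                # r2 counts up to the length
        value = password[index] ^ mask[index % len(mask)]   # r4
        if value != expected[index]:
            failed = 1                                # remember the mismatch, keep looping
    return failed                                     # copied to r3 and tested at 0x1878


if __name__ == "__main__":
    # Toy data only, to show the mechanics.
    toy_mask = [0x01, 0x02, 0x03, 0x04]
    toy_secret = [ord(c) for c in "hello!"]
    toy_expected = [c ^ toy_mask[i % 4] for i, c in enumerate(toy_secret)]
    print(check_password(toy_secret, toy_expected, toy_mask))   # 0 means accepted
    print(check_password([ord(c) for c in "jello!"], toy_expected, toy_mask))
```

Note that a mismatch never leaves the loop early, so the check always runs over all the characters no matter where the first wrong one is.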

We now have everything we need to reconstruct the password. buf1 contains the password character we need, XORed with the correct mask for that character's index. XORs can be easily inverted by XORing again with the same mask, so we just need to do that for every character in buf1 and we'll get back the original password. Once this is done, we end up getting the flag packers_and_vms_and_xors_oh_my!
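For completeness, here's a short Python script that does the inversion. The values are the buffer contents from above; the script itself is just my sketch and not part of the challenge.

```python
# buf1: the 30 expected values. The entry at index 15 has to be 0x10
# for the array to decode to the flag quoted in the text.
buf1 = [
    0x8d, 0x6f, 0x00, 0x24, 0x98, 0x7c, 0x10, 0x10,
    0x9c, 0x60, 0x07, 0x10, 0x8b, 0x63, 0x10, 0x10,
    0x9c, 0x60, 0x07, 0x10, 0x85, 0x61, 0x11, 0x3c,
    0xa2, 0x61, 0x0b, 0x10, 0x90, 0x77,
]
# buf2: the 4-entry XOR mask from the 16-byte buffer.
buf2 = [0xfd, 0x0e, 0x63, 0x4f]

# XOR is its own inverse, so applying the same mask again undoes it.
password = "".join(chr(value ^ buf2[i % len(buf2)]) for i, value in enumerate(buf1))

if __name__ == "__main__":
    print(password)  # prints packers_and_vms_and_xors_oh_my
```

The result is exactly 0x1e characters long, which matches the length the read loop expects.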

This probably concludes the picoCTF 2014 writeups I'll be doing on this blog. Thanks to the picoCTF team for an amazing set of challenges, especially Baleful, which was probably my favorite of them all. You guys are amazing, and I can't wait to play picoCTF again next year!

Friday, December 5, 2014

picoCTF 2014: Baleful (re200) Part 1

Baleful is the last of the five 200 point master challenges, and the final challenge in picoCTF. It gives us very little information to start off with, simply giving us a "twisted" binary and telling us to get it to accept a password. Since we're just given a binary, there's definitely a reverse engineering element, and like most reversing challenges, the password is probably the flag. Let's jump in!

What happens if we execute Baleful? As expected, there's a password prompt which we have to get past:

 pico59150@shell:~$ ./baleful                                         
 Please enter your password: test                                       
 Sorry, wrong password!                                            
 pico59150@shell:~$  

The only obvious course of action is disassembling Baleful. Before we try to disassemble the binary, it's a good idea to get some basic information about it. Let's try seeing what sections it has:

 pico59150@shell:~$ readelf -S baleful  
 There are no sections in this file.  

Well, that's certainly odd. An ELF file with no sections, yet we can still run it. That seems pretty suspicious. If we view it in a hex editor, there are a few odd things. There appears to be another ELF header after the normal one, and the string "UPX" constantly appears. While there are a few other recognizable strings, there aren't very many. One string, however, is quite revealing:

 Info: This file is packed with the UPX executable packer http://upx.sf.net $  
 $Id: UPX 3.91 Copyright (C) 1996-2013 the UPX Team. All Rights Reserved.  

So it appears this file is packed with UPX, a common packer for executables. What executable packers do is take a program and compress it, while still allowing it to run normally. The program contains some stub code that decompresses the rest of the executable at startup. Packers are often used by malware, but UPX itself is only meant to decrease the file size. It provides little obfuscation benefit here, since UPX can unpack its own output with the -d option. Let's get UPX and do that:

 pico59150@shell:~$ ./upx -d baleful  
             Ultimate Packer for eXecutables  
              Copyright (C) 1996 - 2013  
 UPX 3.91w    Markus Oberhumer, Laszlo Molnar & John Reiser  Sep 30th 2013  
     File size     Ratio   Format   Name  
   --------------------  ------  -----------  -----------  
   148104 <-   6752  4.56% netbsd/elf386 baleful  
 Unpacked 1 file.  
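If you'd rather not eyeball a hex editor, the same hints can be pulled out with a few lines of Python. This is only a quick heuristic I'm sketching here; upx_hints is a made-up helper, and it looks for nothing beyond the things noticed above (extra ELF headers, "UPX" strings, and the info message).

```python
ELF_MAGIC = b"\x7fELF"
UPX_INFO = b"This file is packed with the UPX executable packer"


def upx_hints(data):
    """Count the telltale signs of UPX packing in a blob of bytes."""
    return {
        "elf_headers": data.count(ELF_MAGIC),   # more than one is suspicious
        "upx_strings": data.count(b"UPX"),      # the marker that keeps showing up
        "has_info_string": UPX_INFO in data,    # the revealing message
    }


if __name__ == "__main__":
    # Synthetic bytes for demonstration; for the real thing, pass in
    # open("baleful", "rb").read() instead.
    sample = ELF_MAGIC + b"\x00" * 16 + b"UPX!" + ELF_MAGIC + b"$Info: " + UPX_INFO
    print(upx_hints(sample))
```

None of this proves a file is packed, but a hit on all three is a good reason to try upx -d.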

Now we have Baleful in a form that'll be much easier to reverse engineer. Load it into your preferred disassembler (I use IDA) and take a look. An obvious first step would be trying to find the messages that the program prints, but they're not anywhere in the executable. Where could they be, then? A better starting point might be learning how I/O is done in the first place. Looking at the PLT (procedure linkage table), there are entries for printf(), fputc(), and fgetc(). Quite a few things reference them.

 .text:0804867C sub_804867C   proc near        ; CODE XREF: sub_804898B+12C9 p  
 .text:0804867C                     ; DATA XREF: .data:off_804C060 o  
 .text:0804867C  
 .text:0804867C arg_0      = dword ptr 8  
 .text:0804867C  
 .text:0804867C         push  ebp  
 .text:0804867D         mov   ebp, esp  
 .text:0804867F         sub   esp, 18h  
 .text:08048682         mov   edx, ds:stderr  
 .text:08048688         mov   eax, [ebp+arg_0]  
 .text:0804868B         mov   eax, [eax]  
 .text:0804868D         mov   [esp+4], edx  ; stream  
 .text:08048691         mov   [esp], eax   ; c  
 .text:08048694         call  _fputc  
 .text:08048699         mov   eax, ds:stderr  
 .text:0804869E         mov   [esp], eax   ; stream  
 .text:080486A1         call  _fflush  
 .text:080486A6         mov   eax, [ebp+arg_0]  
 .text:080486A9         mov   eax, [eax]  
 .text:080486AB         leave  
 .text:080486AC         retn  
 .text:080486AC sub_804867C   endp  

This function takes a single argument, a pointer to a character, and prints that character to stderr. It then calls fflush to make sure it's actually printed. Let's call this print_char in case we encounter it later. There's an analogous function for character input, which we'll call stdin_getc:

 .text:080486FB sub_80486FB   proc near        ; DATA XREF: .data:0804C070 o  
 .text:080486FB  
 .text:080486FB arg_0      = dword ptr 8  
 .text:080486FB  
 .text:080486FB         push  ebp  
 .text:080486FC         mov   ebp, esp  
 .text:080486FE         sub   esp, 18h  
 .text:08048701         mov   eax, [ebp+arg_0]  
 .text:08048704         mov   [esp], eax  
 .text:08048707         call  sub_80485F4  
 .text:0804870C         mov   eax, ds:stdin  
 .text:08048711         mov   [esp], eax   ; stream  
 .text:08048714         call  _fgetc  
 .text:08048719         leave  
 .text:0804871A         retn  
 .text:0804871A sub_80486FB   endp  

0x080485F4 is a small function that checks if we've reached EOF in stdin, and raises a signal if we have. We can also find some more I/O functions which don't appear to be used. Here's our final list of all I/O functions:
  • 0x0804867C (print_char) - Prints a single character to stderr
  • 0x080486AD (print_dec) - Prints decimal numbers as strings
  • 0x080486D4 (print_hex) - Prints hexadecimal numbers as strings
  • 0x080487A9 (print_float) - Prints floating-point numbers as strings
  • 0x080486FB (stdin_getc) - Reads a single character from stdin and returns it
  • 0x0804871B (input_dec) - Reads a decimal number from stdin and returns it
  • 0x0804874E (input_hex) - Reads a hexadecimal number from stdin and returns it
  • 0x080487D8 (input_float) - Reads a floating-point number from stdin and returns it
All of these functions deal with basic text I/O. Interestingly enough, they're also all referenced by a table of functions at 0x0804C060. I call it io_ops since all the known functions in it are centered around that purpose:

 .data:0804C060 io_ops     dd offset print_char  ; DATA XREF: sub_804898B+12B9 r  
 .data:0804C064         dd offset print_dec  
 .data:0804C068         dd offset print_hex  
 .data:0804C06C         dd offset print_float  
 .data:0804C070         dd offset stdin_getc  
 .data:0804C074         dd offset input_dec  
 .data:0804C078         dd offset input_hex  
 .data:0804C07C         dd offset input_float  
 .data:0804C080         dd offset sub_8048619  
 .data:0804C084         dd offset sub_8048813  
 .data:0804C088         dd offset sub_8048834  
 .data:0804C08C         dd offset sub_804887B  
 .data:0804C090         dd offset sub_80488B6  
 .data:0804C094         dd offset sub_80488F1  
 .data:0804C098         dd offset sub_804892C  
 .data:0804C09C         dd offset sub_8048660  
 .data:0804C0A0         dd offset sub_804866A  
 .data:0804C0A4         dd offset sub_8048967  
 .data:0804C0A8         align 20h  

Is print_char used by Baleful to print the messages? That's not incredibly efficient, but would help obfuscate the program. We can find out by placing a GDB breakpoint on print_char and seeing what happens:

 pico59150@shell:~$ gdb baleful                                        
 GNU gdb (Ubuntu 7.7-0ubuntu3.1) 7.7                                     
 Copyright (C) 2014 Free Software Foundation, Inc.                              
 License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>                
 This is free software: you are free to change and redistribute it.                      
 There is NO WARRANTY, to the extent permitted by law. Type "show copying"                  
 and "show warranty" for details.                                       
 This GDB was configured as "x86_64-linux-gnu".                                
 Type "show configuration" for configuration details.                             
 For bug reporting instructions, please see:                                 
 <http://www.gnu.org/software/gdb/bugs/>.                                   
 Find the GDB manual and other documentation resources online at:                       
 <http://www.gnu.org/software/gdb/documentation/>.                              
 For help, type "help".                                            
 Type "apropos word" to search for commands related to "word"...                       
 Reading symbols from baleful...(no debugging symbols found)...done.                     
 (gdb) b *0x0804867C                                             
 Breakpoint 1 at 0x804867c                                          
 (gdb) run                                                  
 Starting program: /home_users/pico59150/baleful                               
 Breakpoint 1, 0x0804867c in ?? ()                                      
 (gdb) cont                                                  
 Continuing.                                                 
 P                                                      
 Breakpoint 1, 0x0804867c in ?? ()                                      
 (gdb) cont                                                  
 Continuing.                                                 
 l                                                      
 Breakpoint 1, 0x0804867c in ?? ()                                      
 (gdb) cont                                                  
 Continuing.                                                 
 e                                                      
 Breakpoint 1, 0x0804867c in ?? ()                                      
 (gdb) cont                                                  
 Continuing.                                                 
 a                                                      
 Breakpoint 1, 0x0804867c in ?? ()                                      
 (gdb) cont                                                  
 Continuing.                                                 
 s                                                      
 Breakpoint 1, 0x0804867c in ?? ()                                      
 (gdb) cont                                                  
 Continuing.                                                 
 e                                                      
 Breakpoint 1, 0x0804867c in ?? ()                                      
 (gdb)    

Looks like that hypothesis is correct. Each time we execute print_char, the password prompt ("Please enter your password") gets printed out one character at a time. Whatever function is calling print_char is probably involved with printing out the message. Let's see where we were called from by viewing the return address on the stack:

  (gdb) info registers                         
  eax   0xffffd5c4  -10812                    
  ecx   0xf7fc988c  -134440820                   
  edx   0x804867c  134514300                   
  ebx   0xffffd690  -10608                    
  esp   0xffffd5ac  0xffffd5ac                   
  ebp   0xffffd678  0xffffd678                   
  esi   0x0  0                       
  edi   0xffffd70c  -10484                    
  eip   0x804867c  0x804867c                   
  eflags   0x212 [ AF IF ]                     
  cs    0x23  35                       
  ss    0x2b  43                       
  ds    0x2b  43                       
  es    0x2b  43                       
  fs    0x0  0                       
  gs    0x63  99                       
  (gdb) x 0xffffd5ac                         
  0xffffd5ac:  0x08049c56                       
  (gdb)  

The return address is 0x08049c56. Let's view the code in the vicinity of that:

 .text:08049C2E loc_8049C2E:              ; CODE XREF: sub_804898B+C0 j  
 .text:08049C2E                     ; DATA XREF: .rodata:off_8049DD4 o  
 .text:08049C2E         mov   eax, [ebp+var_34] ; jumptable 08048A4B case 32  
 .text:08049C31         add   eax, 1  
 .text:08049C34         movzx  eax, byte_804C0C0[eax]  
 .text:08049C3B         movsx  eax, al  
 .text:08049C3E         mov   [ebp+var_24], eax  
 .text:08049C41         mov   eax, [ebp+var_24]  
 .text:08049C44         mov   edx, io_ops[eax*4]  
 .text:08049C4B         lea   eax, [ebp+var_B4]  
 .text:08049C51         mov   [esp], eax  
 .text:08049C54         call  edx ; print_char  
 .text:08049C56         mov   [ebp+var_B4], eax  
 .text:08049C5C         add   [ebp+var_34], 2  
 .text:08049C60         jmp   short loc_8049C67  
 .text:08049C62 ; ---------------------------------------------------------------------------  
 .text:08049C62  
 .text:08049C62 loc_8049C62:              ; CODE XREF: sub_804898B+B3 j  
 .text:08049C62                     ; sub_804898B+C0 j  
 .text:08049C62                     ; DATA XREF: ...  
 .text:08049C62         add   [ebp+var_34], 1 ; jumptable 08048A4B default case  
 .text:08049C66         nop  
 .text:08049C67  
 .text:08049C67 loc_8049C67:              ; CODE XREF: sub_804898B+9D j  
 .text:08049C67                     ; sub_804898B+C6 j ...  
 .text:08049C67         mov   eax, [ebp+var_34]  
 .text:08049C6A         add   eax, offset byte_804C0C0  
 .text:08049C6F         movzx  eax, byte ptr [eax]  
 .text:08049C72         cmp   al, 1Dh  
 .text:08049C74         jnz   loc_8048A2D  
 .text:08049C7A         mov   eax, [ebp+var_B4]  
 .text:08049C80  
 .text:08049C80 locret_8049C80:             ; CODE XREF: sub_804898B+E4 j  
 .text:08049C80         leave  
 .text:08049C81         retn  
 .text:08049C81 sub_804898B   endp  

The call edx instruction at 0x08049C54 is where the actual call took place. Let's look back a bit to see where we came from. We can see that this is case 32 in some unknown jumptable. The first thing it does is read a 4-byte value from [ebp-0x34]. This value is used as an offset into some memory area at 0x804C0C0, and the code reads the byte at 0x804C0C0+offset+1. What we can deduce from this is that there's some data structure located at that offset, and this case takes its second byte. That byte is used as an index into io_ops, from which a function is read and then called (the call edx line). The argument to the function is a pointer to [ebp-0xb4] (note the lea), and the return value is put there afterwards.

Once the I/O function has been completed, it increments the offset in [ebp-0x34] by 2 and jumps to 0x08049c67. 0x08049c67 reads a byte at the new offset and then compares it to 0x1d. If it is 0x1d, it just returns from whatever function we're in, but otherwise, it jumps to 0x08048a2d. It's not exactly clear what the function is doing at this point, so let's see what happens at 0x08048a2d:

 .text:08048A2D loc_8048A2D:              ; CODE XREF: sub_804898B+12E9 j  
 .text:08048A2D         mov   eax, [ebp+var_34]  
 .text:08048A30         add   eax, offset byte_804C0C0  
 .text:08048A35         movzx  eax, byte ptr [eax]  
 .text:08048A38         movsx  eax, al  
 .text:08048A3B         cmp   eax, 20h    ; switch 33 cases  
 .text:08048A3E         ja   loc_8049C62   ; jumptable 08048A4B default case  
 .text:08048A44         mov   eax, ds:off_8049DD4[eax*4]  
 .text:08048A4B         jmp   eax       ; switch jump  

Looks like 0x08048a2d is the jumptable dispatcher. It once again uses [ebp-0x34] as an offset into 0x0804c0c0, a pattern that's starting to emerge. It takes the first byte at that offset and uses it as an index into the jumptable. Recall that 0x8049c2e is a jumptable case, so it gets jumped to directly from here. It looked at the second byte at the offset, and used that as a parameter. So the data pointed to by [ebp-0x34] always starts with a jumptable index, and then contains some case-specific data afterwards.

What is at 0x0804c0c0 anyway? As it turns out, there's absolutely nothing but zeroes for the first 0x1000 bytes. Then there are some bytes which appear normal, though their purpose isn't yet known. But as we get to 0x0804D0F0, the data starts to lose any noticeable patterns and appears to be fairly random. It looks like there's some sort of encryption or packing going on. We'll get back to that much later.

Now, it still wasn't completely clear what I was dealing with, but I began to have a hunch that this was a bytecode VM. The theory makes sense: it has an offset into some data area, it uses the first byte at that offset to choose one of many cases, each case can read additional data from that offset, and it always increments the offset after it finishes. Recall that 0x08049c2e, the one which called all the I/O functions, used the second byte at the offset only. Then it incremented the offset by 2 when it finished, and went back to the main dispatcher. If the VM theory is correct, Baleful is advancing an instruction pointer and dispatching the next one.

The VM theory was actually quite plausible, so I decided to run with it. If it was true, that meant that everything in the 0x0804c0c0 area was a bytecode program that actually did everything. The I/O meta-function at 0x08049c2e would just be an instruction called by the bytecode program to communicate with the outside world. As obfuscation mechanisms go, it's a fairly good one. The new goal should be understanding enough of the VM to write a disassembler and reverse engineer the bytecode program.
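To show what I mean by the VM theory, here's a minimal Python sketch of a fetch-and-dispatch loop covering just the pieces seen so far: the NOP-like default case, the I/O opcode 0x20, and the 0x1d check that ends execution. Everything here (run, the OP_ constants, passing a plain value instead of a pointer to [ebp-0xb4]) is my own simplification, and the real interpreter obviously has many more cases.

```python
OP_NOP = 0x00    # case 0: only advances the offset
OP_HALT = 0x1D   # the byte the loop compares against before returning
OP_IO = 0x20     # case 32: call io_ops[operand]


def run(bytecode, io_ops, ipos=0):
    """Run until a 0x1d byte is fetched; return the final offset."""
    value = 0                                   # stands in for [ebp-0xb4]
    while bytecode[ipos] != OP_HALT:
        opcode = bytecode[ipos]                 # first byte selects the case
        if opcode == OP_IO:
            operand = bytecode[ipos + 1]        # second byte indexes io_ops
            value = io_ops[operand](value)      # return value is stored back
            ipos += 2
        else:
            ipos += 1                           # NOP and the default case
    return ipos


if __name__ == "__main__":
    printed = []
    # Hypothetical two-entry table: slot 0 "prints", slot 1 "reads" an 'A'.
    io_ops = [lambda v: printed.append(chr(v)) or v, lambda v: ord("A")]
    program = bytes([OP_NOP, OP_IO, 0x01, OP_IO, 0x00, OP_HALT])
    end = run(program, io_ops)
    print(printed, end)
```

The toy program reads a character through slot 1, echoes it through slot 0, and stops when the offset lands on the 0x1d byte, which is the same advance-then-dispatch rhythm seen in the disassembly.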

 .rodata:08049DD4 vm_instrs    dd offset loc_8048A4D  ; DATA XREF: sub_804898B+B9 r  
 .rodata:08049DD4         dd offset loc_8048A56  ; jump table for switch statement  
 .rodata:08049DD4         dd offset loc_8048A8F  
 .rodata:08049DD4         dd offset loc_8048BC4  
 .rodata:08049DD4         dd offset loc_8048CF9  
 .rodata:08049DD4         dd offset loc_8048E2F  
 .rodata:08049DD4         dd offset loc_8048F91  
 .rodata:08049DD4         dd offset loc_80495F5  
 .rodata:08049DD4         dd offset loc_8049649  
 .rodata:08049DD4         dd offset loc_80490C6  
 .rodata:08049DD4         dd offset loc_80491FB  
 .rodata:08049DD4         dd offset loc_804959E  
 .rodata:08049DD4         dd offset loc_8049330  
 .rodata:08049DD4         dd offset loc_8049467  
 .rodata:08049DD4         dd offset loc_80496D1  
 .rodata:08049DD4         dd offset loc_804969D  
 .rodata:08049DD4         dd offset loc_80496EC  
 .rodata:08049DD4         dd offset loc_8049715  
 .rodata:08049DD4         dd offset loc_804973E  
 .rodata:08049DD4         dd offset loc_8049767  
 .rodata:08049DD4         dd offset loc_8049790  
 .rodata:08049DD4         dd offset loc_80497B9  
 .rodata:08049DD4         dd offset loc_80497E2  
 .rodata:08049DD4         dd offset loc_80498F0  
 .rodata:08049DD4         dd offset loc_8049A02  
 .rodata:08049DD4         dd offset loc_8049A86  
 .rodata:08049DD4         dd offset loc_8049AB9  
 .rodata:08049DD4         dd offset loc_8049AEC  
 .rodata:08049DD4         dd offset loc_8049B43  
 .rodata:08049DD4         dd offset loc_8049C62  
 .rodata:08049DD4         dd offset loc_8049B92  
 .rodata:08049DD4         dd offset loc_8049BF8  
 .rodata:08049DD4         dd offset io_8049C2E  

There's a huge, intimidating jumptable staring us in the face, and of the 33 instructions there, we have only a single one. Let's try and see which ones are easy enough to identify right away.

 .text:08048A4D loc_8048A4D:              ; DATA XREF: .rodata:vm_instrs o  
 .text:08048A4D         add   [ebp+ipos], 1 ; jumptable 08048A4B case 0  
 .text:08048A51         jmp   loc_8049C67  

A case that does absolutely nothing but increment the instruction pointer (I now call it ipos). I'm willing to bet this is the equivalent of NOP on basically every CPU architecture. This is probably some sort of assembly language bytecode, then. Two instructions down, 31 to go. What else can we identify?

Well, for me, pretty much nothing at all. 0x08049C62, which implements opcode 0x1d, is fairly easy to identify as the VM termination instruction, but that's not much help. Every other function just seemed incomprehensible from a static analysis perspective, using a bunch of local variables that I didn't know the meaning of. So I decided to go back to GDB, tracing the execution path of the program after the I/O dispatcher (opcode 0x20).

Let's restart the program and set two breakpoints, one on the main VM loop and one inside the I/O dispatcher. We want to start debugging after the first I/O call, though, so we need to set up the breakpoint then:

 pico59150@shell:~$ gdb baleful                                        
 GNU gdb (Ubuntu 7.7-0ubuntu3.1) 7.7                                     
 Copyright (C) 2014 Free Software Foundation, Inc.                              
 License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>                
 This is free software: you are free to change and redistribute it.                      
 There is NO WARRANTY, to the extent permitted by law. Type "show copying"                  
 and "show warranty" for details.                                       
 This GDB was configured as "x86_64-linux-gnu".                                
 Type "show configuration" for configuration details.                             
 For bug reporting instructions, please see:                                 
 <http://www.gnu.org/software/gdb/bugs/>.                                   
 Find the GDB manual and other documentation resources online at:                       
 <http://www.gnu.org/software/gdb/documentation/>.                              
 For help, type "help".                                            
 Type "apropos word" to search for commands related to "word"...                       
 Reading symbols from baleful...(no debugging symbols found)...done.                     
 (gdb) b *0x08049C54                                             
 Breakpoint 1 at 0x8049c54                                          
 (gdb) run                                                  
 Starting program: /home_users/pico59150/baleful                               
 Breakpoint 1, 0x08049c54 in ?? ()                                      
 (gdb) b *0x08048A2D                                             
 Breakpoint 2 at 0x8048a2d                                          
 (gdb)    

We want to see what instructions are being executed, so let's view the opcode every time we hit the main dispatcher:

 (gdb) cont                                                  
 Continuing.                                                 
 P                                                      
 Breakpoint 2, 0x08048a2d in ?? ()                                      
 (gdb) si                                                   
 0x08048a30 in ?? ()                                             
 (gdb) info registers                                             
 eax      0x1041  4161                                         
 ecx      0xf7fc988c    -134440820                                  
 edx      0x0   0                                          
 ebx      0xffffd690    -10608                                    
 esp      0xffffd5b0    0xffffd5b0                                  
 ebp      0xffffd678    0xffffd678                                  
 esi      0x0   0                                          
 edi      0xffffd70c    -10484                                    
 eip      0x8048a30    0x8048a30                                  
 eflags     0x297  [ CF PF AF SF IF ]                                  
 cs       0x23   35                                          
 ss       0x2b   43                                          
 ds       0x2b   43                                          
 es       0x2b   43                                          
 fs       0x0   0                                          
 gs       0x63   99                                          
 (gdb) si                                                   
 0x08048a35 in ?? ()                                             
 (gdb) si                                                   
 0x08048a38 in ?? ()                                             
 (gdb) si                                                   
 0x08048a3b in ?? ()                                             
 (gdb) info registers                                             
 eax      0x1   1                                          
 ecx      0xf7fc988c    -134440820                                  
 edx      0x0   0                                          
 ebx      0xffffd690    -10608                                    
 esp      0xffffd5b0    0xffffd5b0                                  
 ebp      0xffffd678    0xffffd678                                  
 esi      0x0   0                                          
 edi      0xffffd70c    -10484                                    
 eip      0x8048a3b    0x8048a3b                                  
 eflags     0x202  [ IF ]                                        
 cs       0x23   35                                          
 ss       0x2b   43                                          
 ds       0x2b   43                                          
 es       0x2b   43                                          
 fs       0x0   0                                          
 gs       0x63   99                                          
 (gdb)    

Offset is 0x1041, opcode is 0x1. Let's keep doing this for a while.

 (gdb) cont                                                  
 Continuing.                                                 
 Breakpoint 2, 0x08048a2d in ?? ()                                         
 (gdb) si                                                   
 0x08048a30 in ?? ()                                             
 (gdb) info registers                                             
 eax      0x1a79  6777                                         
 ecx      0xf7fc988c    -134440820                                  
 edx      0x0   0                                          
 ebx      0xffffd690    -10608                                    
 esp      0xffffd5b0    0xffffd5b0                                  
 ebp      0xffffd678    0xffffd678                                  
 esi      0x0   0                                          
 edi      0xffffd70c    -10484                                    
 eip      0x8048a30    0x8048a30                                  
 eflags     0x293  [ CF AF SF IF ]                                   
 cs       0x23   35                                          
 ss       0x2b   43                                          
 ds       0x2b   43                                          
 es       0x2b   43                                          
 fs       0x0   0                                          
 gs       0x63   99                                          
 (gdb) si                                                   
 0x08048a35 in ?? ()                                             
 (gdb) si                                                   
 0x08048a38 in ?? ()                                             
 (gdb) si                                                   
 0x08048a3b in ?? ()                                             
 (gdb) info registers                                             
 eax      0x18   24                                          
 ecx      0xf7fc988c    -134440820                                  
 edx      0x0   0                                          
 ebx      0xffffd690    -10608                                    
 esp      0xffffd5b0    0xffffd5b0                                  
 ebp      0xffffd678    0xffffd678                                  
 esi      0x0   0                                          
 edi      0xffffd70c    -10484                                    
 eip      0x8048a3b    0x8048a3b                                  
 eflags     0x206  [ PF IF ]                                      
 cs       0x23   35                                          
 ss       0x2b   43                                          
 ds       0x2b   43                                          
 es       0x2b   43                                          
 fs       0x0   0                                          
 gs       0x63   99                                          
 (gdb) cont                                                  
 Continuing.                                                 
 Breakpoint 2, 0x08048a2d in ?? ()                                      
 (gdb) si                                                   
 0x08048a30 in ?? ()                                             
 (gdb) info registers                                             
 eax      0x1a80  6784                                         
 ecx      0xf7fc988c    -134440820                                  
 edx      0x6c   108                                         
 ebx      0xffffd690    -10608                                    
 esp      0xffffd5b0    0xffffd5b0                                  
 ebp      0xffffd678    0xffffd678                                  
 esi      0x0   0                                          
 edi      0xffffd70c    -10484                                    
 eip      0x8048a30    0x8048a30                                  
 eflags     0x283  [ CF SF IF ]                                     
 cs       0x23   35                                          
 ss       0x2b   43                                          
 ds       0x2b   43                                          
 es       0x2b   43                                          
 fs       0x0   0                                          
 gs       0x63   99                                          
 (gdb) si                                                   
 0x08048a35 in ?? ()                                             
 (gdb) si                                                   
 0x08048a38 in ?? ()                                             
 (gdb) si                                                   
 0x08048a3b in ?? ()                                             
 (gdb) info registers                                             
 eax      0xf   15                                          
 ecx      0xf7fc988c    -134440820                                  
 edx      0x6c   108                                         
 ebx      0xffffd690    -10608                                    
 esp      0xffffd5b0    0xffffd5b0                                  
 ebp      0xffffd678    0xffffd678                                  
 esi      0x0   0                                          
 edi      0xffffd70c    -10484                                    
 eip      0x8048a3b    0x8048a3b                                  
 eflags     0x202  [ IF ]                                        
 cs       0x23   35                                          
 ss       0x2b   43                                          
 ds       0x2b   43                                          
 es       0x2b   43                                          
 fs       0x0   0                                          
 gs       0x63   99                                          
 (gdb) cont                                                  
 Continuing.                                                 
 Breakpoint 2, 0x08048a2d in ?? ()                                      
 (gdb) si                                                   
 0x08048a30 in ?? ()                                             
 (gdb) info registers                                             
 eax      0x103f  4159                                         
 ecx      0xf7fc988c    -134440820                                  
 edx      0x1a85  6789                                         
 ebx      0xffffd690    -10608                                    
 esp      0xffffd5b0    0xffffd5b0                                  
 ebp      0xffffd678    0xffffd678                                  
 esi      0x0   0                                          
 edi      0xffffd70c    -10484                                    
 eip      0x8048a30    0x8048a30                                  
 eflags     0x216  [ PF AF IF ]                                     
 cs       0x23   35                                          
 ss       0x2b   43                                          
 ds       0x2b   43                                          
 es       0x2b   43                                          
 fs       0x0   0                                          
 gs       0x63   99                                          
 (gdb) si                                                   
 0x08048a35 in ?? ()                                             
 (gdb) si                                                   
 0x08048a38 in ?? ()                                             
 (gdb) si                                                   
 0x08048a3b in ?? ()                                             
 (gdb) info registers                                             
 eax      0x20   32                                          
 ecx      0xf7fc988c    -134440820                                  
 edx      0x1a85  6789                                         
 ebx      0xffffd690    -10608                                    
 esp      0xffffd5b0    0xffffd5b0                                  
 ebp      0xffffd678    0xffffd678                                  
 esi      0x0   0                                          
 edi      0xffffd70c    -10484                                    
 eip      0x8048a3b    0x8048a3b                                  
 eflags     0x206  [ PF IF ]                                      
 cs       0x23   35                                          
 ss       0x2b   43                                          
 ds       0x2b   43                                          
 es       0x2b   43                                          
 fs       0x0   0                                          
 gs       0x63   99                                          
 (gdb)    

Let's summarize what we see here. It's pretty important, since what I found while doing this was the big breakthrough that allowed me to quickly figure out the rest of the VM. After leaving the I/O instruction, we're at offset 0x1041. We can infer from this that the I/O instruction was at 0x103f. The opcode at 0x1041 is 0x1. After running opcode 0x1, we go back to the main VM loop, but the offset has suddenly jumped a long way, to 0x1a79. The most likely explanation is that opcode 0x1 modified ipos. Opcode 0x18 is then run, which seems to carry several bytes of operand data, as the ipos after that is 0x1a80, seven bytes later. The opcode at 0x1a80 is 0xf. Afterwards, we can observe that ipos has changed back to 0x103f, about to run the I/O instruction again. 0xf is the only candidate for what modified ipos.
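The reasoning above can be checked mechanically. Here is a minimal Python sketch that replays the (ipos, opcode) pairs read out of GDB and labels each step as either a fall-through to the next instruction or a jump. The first entry is the inferred position of the I/O instruction, and the 16-byte cutoff used to tell the two apart is purely my own assumption, not something taken from the binary.

```python
# (ipos, opcode) pairs observed at the top of the VM loop in the GDB session.
TRACE = [
    (0x103F, 0x20),  # I/O instruction (position inferred from the next ipos)
    (0x1041, 0x01),
    (0x1A79, 0x18),
    (0x1A80, 0x0F),
    (0x103F, 0x20),  # back at the I/O instruction
]

MAX_INSN_LEN = 16  # assumption: no single VM instruction is longer than this


def classify(trace, max_len=MAX_INSN_LEN):
    """Label each step as a fall-through or a jump, based on the ipos delta."""
    steps = []
    for (pos, op), (next_pos, _) in zip(trace, trace[1:]):
        delta = next_pos - pos
        kind = "fallthrough" if 0 < delta <= max_len else "jump"
        steps.append((op, kind, delta))
    return steps


for op, kind, delta in classify(TRACE):
    print(f"opcode {op:#04x}: {kind:11s} delta={delta:+#x}")
```

Under that assumption, only opcodes 0x1 and 0xf move ipos somewhere other than the next instruction, which is what singles them out as control-flow instructions.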

If we allow the program to continue and run the I/O instruction, now the "l" character gets printed. This is the second character in the password prompt. So the I/O instruction runs, then opcode 0x1 runs and ipos completely changes as a result. At the new ipos, we run opcode 0x18 and 0xf. As a result of executing opcode 0xf, ipos goes back to what it was before and runs the I/O instruction again. But we have new data being passed to it! Which instruction did that? Assuming that each opcode does one specific thing, 0x18 is the only possibility.

The way this is arranged made me think of returning from a function, getting some new data, and then going back to that same function. That would make 0x1 RET and 0xf CALL. And since 0x18 gets the new character to be printed, it's probably similar to MOV. As it turns out, knowing just these three goes a long way towards figuring out all 33 opcodes. RET and CALL deal with the stack, and MOV deals with registers, so they'll shed a lot of light on the VM.

Let's look at both 0x1 and 0xf first, which we believe are RET and CALL, respectively:

 .text:08048A56 ret_8048A56:              ; CODE XREF: sub_804898B+C0 j  
 .text:08048A56                     ; DATA XREF: .rodata:vm_instrs o  
 .text:08048A56         mov   eax, [ebp+var_38] ; jumptable 08048A4B case 1  
 .text:08048A59         add   eax, offset byte_804C0C0  
 .text:08048A5E         mov   eax, [eax]  
 .text:08048A60         mov   [ebp+var_14], eax  
 .text:08048A63         cmp   [ebp+var_14], 0  
 .text:08048A67         jnz   short loc_8048A74  
 .text:08048A69         mov   eax, [ebp+var_B4]  
 .text:08048A6F         jmp   locret_8049C80  
 .text:08048A74 ; ---------------------------------------------------------------------------  
 .text:08048A74  
 .text:08048A74 loc_8048A74:              ; CODE XREF: sub_804898B+DC j  
 .text:08048A74         mov   eax, [ebp+var_38]  
 .text:08048A77         add   eax, 4  
 .text:08048A7A         mov   [ebp+var_38], eax  
 .text:08048A7D         mov   eax, [ebp+var_14]  
 .text:08048A80         mov   [ebp+ipos], eax  
 .text:08048A83         mov   [ebp+var_28], 0  
 .text:08048A8A         jmp   loc_8049C67  

 .text:0804969D call_804969D:              ; CODE XREF: sub_804898B+C0 j  
 .text:0804969D                     ; DATA XREF: .rodata:vm_instrs o  
 .text:0804969D         mov   eax, [ebp+ipos] ; jumptable 08048A4B case 15  
 .text:080496A0         add   eax, 1  
 .text:080496A3         add   eax, offset byte_804C0C0  
 .text:080496A8         mov   eax, [eax]  
 .text:080496AA         mov   [ebp+var_10], eax  
 .text:080496AD         mov   eax, [ebp+var_38]  
 .text:080496B0         sub   eax, 4  
 .text:080496B3         mov   [ebp+var_38], eax  
 .text:080496B6         mov   eax, [ebp+var_38]  
 .text:080496B9         add   eax, offset byte_804C0C0  
 .text:080496BE         mov   edx, [ebp+ipos]  
 .text:080496C1         add   edx, 5  
 .text:080496C4         mov   [eax], edx  
 .text:080496C6         mov   eax, [ebp+var_10]  
 .text:080496C9         mov   [ebp+ipos], eax  
 .text:080496CC         jmp   loc_8049C67  

We can see that both of these handlers use [ebp-0x38]. RET uses it as an offset into the 0x0804c0c0 area and reads a 4-byte value from that offset. If that value is zero, the VM function itself returns, with [ebp-0xb4] as its result. Otherwise, RET increments the offset in [ebp-0x38] by 4 bytes and stores the value it read in ipos. CALL does the opposite of this. It first reads a 4-byte value from the bytecode instruction itself, which is presumably the offset we want to jump to. It then decrements [ebp-0x38] by 4 bytes and stores the current value of ipos + 5 at that new offset. Since CALL is one opcode byte plus a 4-byte target, ipos + 5 points to the instruction after the CALL. Then it puts the previously read target into ipos. These handlers make it obvious that 0x1 and 0xf are indeed RET and CALL, and furthermore, that [ebp-0x38] is a stack pointer offset. Very useful to know.
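To make the stack behaviour concrete, here is a small Python model of the two handlers as I read them from the listings above. The class and method names are made up for illustration, the memory size and initial stack pointer are arbitrary assumptions, and the write to [ebp+var_28] at the end of RET is not modeled because we don't know what it does yet.

```python
import struct


class TinyVM:
    """Hypothetical model of the CALL/RET handlers only."""

    def __init__(self, size, sp, ipos):
        self.mem = bytearray(size)  # stands in for the byte_804C0C0 area
        self.sp = sp                # [ebp-0x38], an offset into mem
        self.ipos = ipos            # instruction offset into mem
        self.halted = False

    def _read32(self, off):
        return struct.unpack_from("<I", self.mem, off)[0]

    def _write32(self, off, value):
        struct.pack_into("<I", self.mem, off, value)

    def op_call(self):
        target = self._read32(self.ipos + 1)   # 4-byte operand after the opcode
        self.sp -= 4
        self._write32(self.sp, self.ipos + 5)  # push the return offset
        self.ipos = target

    def op_ret(self):
        ret = self._read32(self.sp)
        if ret == 0:                # a zero return offset leaves the VM
            self.halted = True
            return
        self.sp += 4
        self.ipos = ret


# Replay the CALL seen at 0x1a80 (bytes 0f 3f 10 00 00) and the RET that follows.
vm = TinyVM(size=0x2000, sp=0x2000, ipos=0x1A80)   # size and sp are assumptions
vm.mem[0x1A80:0x1A85] = bytes([0x0F, 0x3F, 0x10, 0x00, 0x00])
vm.op_call()
print(hex(vm.ipos), hex(vm._read32(vm.sp)))
vm.ipos = 0x1041   # pretend the I/O instruction ran and we reached the RET
vm.op_ret()
print(hex(vm.ipos))
```

In this model the CALL at 0x1a80 pushes 0x1a85, and the RET at 0x1041 pops it again. That lines up with the 0x1a85 sitting in edx in the register dump taken right after the CALL.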

Now we want to look at MOV to see how it works, but the function looks very complex at first glance. Let's again use GDB to trace through it, at the point where the "l" character gets loaded. Opcode 0x18, which is MOV, has its case at 0x08049A02. Let's see which part of the case we end up branching to.

 .text:08049A02 loc_8049A02:              ; CODE XREF: sub_804898B+C0 j  
 .text:08049A02                     ; DATA XREF: .rodata:vm_instrs o  
 .text:08049A02         mov   eax, [ebp+ipos] ; jumptable 08048A4B case 24  
 .text:08049A05         add   eax, 1  
 .text:08049A08         movzx  eax, byte_804C0C0[eax]  
 .text:08049A0F         movsx  eax, al  
 .text:08049A12         mov   [ebp+var_C], eax  
 .text:08049A15         mov   eax, [ebp+var_C]  
 .text:08049A18         test  eax, eax  
 .text:08049A1A         jz   short loc_8049A23  
 .text:08049A1C         cmp   eax, 1  
 .text:08049A1F         jz   short loc_8049A57  
 .text:08049A21         jmp   short loc_8049A81  

There are several cases we could end up in. Let's just set breakpoints on the two explicit branch targets and see which one we hit.

 (gdb) break *0x8049A23                                            
 Breakpoint 3 at 0x8049a23                                          
 (gdb) break *0x8049A57                                            
 Breakpoint 4 at 0x8049a57                                          
 (gdb) cont                                                  
 Continuing.                                                 
 Breakpoint 4, 0x08049a57 in ?? ()                                      
 (gdb)  

Alright, we end up getting to 0x08049a57. We should first dump the data ipos points to:

 (gdb) x $ebp-0x34                                              
 0xffffd644:   0x00001a79                                          
 (gdb) x/16xb 0x0804db39                                           
 0x804db39:   0x18  0x01  0x00  0x6c  0x00  0x00  0x00  0x0f                 
 0x804db41:   0x3f  0x10  0x00  0x00  0x18  0x01  0x00  0x65                 
 (gdb)  

The first seven bytes of the dump are the actual MOV instruction. Here we see 0x18, the opcode, 0x01, which is used to decide the case, a 0x00 byte, and the 4-byte value 0x6c (little endian). 0x6c is simply the ASCII code for the letter "l", so this is almost certainly loading the next character to print. Now that we know the data, how will the handler play out?
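As a sanity check on that reading of the dump, here is a rough Python sketch that decodes the 16 dumped bytes using only the two encodings worked out so far: the immediate form of MOV (7 bytes) and CALL (5 bytes). The mnemonics and the r0-style register names are my own labels, and anything the sketch doesn't recognize simply stops the decoding.

```python
import struct

BASE = 0x0804C0C0   # start of the bytecode area (byte_804C0C0)
IPOS = 0x1A79       # value read from [ebp-0x34]

# The 16 bytes dumped from BASE + IPOS in GDB.
DUMP = bytes([0x18, 0x01, 0x00, 0x6C, 0x00, 0x00, 0x00, 0x0F,
              0x3F, 0x10, 0x00, 0x00, 0x18, 0x01, 0x00, 0x65])


def decode(buf, off):
    """Decode one instruction at buf[off]; return (text, length) or None."""
    op = buf[off]
    if op == 0x18 and off + 7 <= len(buf) and buf[off + 1] == 0x01:
        # signed register index byte followed by a little-endian 32-bit value
        reg, imm = struct.unpack_from("<bI", buf, off + 2)
        return f"MOV r{reg}, {imm:#x}", 7
    if op == 0x0F and off + 5 <= len(buf):
        (target,) = struct.unpack_from("<I", buf, off + 1)
        return f"CALL {target:#x}", 5
    return None   # unknown opcode or truncated instruction


listing = []
off = 0
while off < len(DUMP):
    decoded = decode(DUMP, off)
    if decoded is None:
        break
    text, length = decoded
    listing.append((IPOS + off, text))
    off += length

print(hex(BASE + IPOS))
for pos, text in listing:
    print(f"{pos:#06x}: {text}")
```

The address passed to x/16xb is just BASE + IPOS. The decoder recovers a MOV of 0x6c followed by a CALL to 0x103f, then stops at the last four bytes, which look like the start of another MOV that the dump cuts off.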

 .text:08049A57 loc_8049A57:              ; CODE XREF: sub_804898B+1094 j  
 .text:08049A57         mov   eax, [ebp+ipos]  
 .text:08049A5A         add   eax, 2  
 .text:08049A5D         movzx  eax, byte_804C0C0[eax]  
 .text:08049A64         movsx  eax, al  
 .text:08049A67         mov   edx, [ebp+ipos]  
 .text:08049A6A         add   edx, 3  
 .text:08049A6D         add   edx, offset byte_804C0C0  
 .text:08049A73         mov   edx, [edx]  
 .text:08049A75         mov   [ebp+eax*4+var_B4], edx  
 .text:08049A7C         add   [ebp+ipos], 7  
 .text:08049A80         nop  

It loads the third byte of our instruction, which will be 0x00. Then it loads the 4-byte value after that, which is 0x6c. Finally, it stores that value at [ebp-0xb4+eax*4], which is just [ebp-0xb4] in this case. Recall that [ebp-0xb4] contained the argument given to print_char. However, it turns out that ebp-0xb4 is not just the address of one 4-byte value, but the start of an entire array of them. Depending on the third byte, the MOV instruction can write to any index of this array. MOV usually loads values into registers, so it would appear that ebp-0xb4 is a register area for the VM bytecode. And if ebp-0xb4 is a register area, this means that the I/O opcode always passes the value of the first register as an argument to whichever I/O function is called.
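Here is a short Python sketch of that case of the MOV handler, treating ebp-0xb4 as the base of a register array. The function names are hypothetical, the ebp value is the one from the GDB session, and a dict stands in for the register file because we don't yet know how many registers there are.

```python
import struct

EBP = 0xFFFFD678          # ebp value from the GDB register dumps
REGS_BASE = EBP - 0xB4    # start of the VM register area


def reg_address(index):
    """Host address of VM register `index` (4 bytes per register)."""
    return REGS_BASE + 4 * index


def exec_mov_imm(code, ipos, regs):
    """Run the case-1 form of opcode 0x18 and return the next ipos."""
    # byte 2: signed register index; bytes 3..6: little-endian immediate
    index, imm = struct.unpack_from("<bI", code, ipos + 2)
    regs[index] = imm
    return ipos + 7


code = bytes([0x18, 0x01, 0x00, 0x6C, 0x00, 0x00, 0x00])
regs = {}
next_ipos = exec_mov_imm(code, 0, regs)
print(hex(reg_address(0)), regs, chr(regs[0]), next_ipos)
```

After the instruction runs, register 0 holds 0x6c, the character "l" that the I/O instruction prints next.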

Now we've figured out a lot more about how data is accessed inside the VM. Knowing how both the registers and the stack are implemented makes it very easy to figure out the rest of the instructions. Since this article is already quite long, we'll pick this back up in part 2, where we'll figure out the rest of the instructions, write a disassembler for the bytecode, and then reverse engineer the bytecode to get the flag.