Thursday, March 21, 2013

Binary Instrumentation for Exploit Analysis Purposes (part 2)


This is the second part of the article about binary instrumentation for exploit analysis purposes, and this time we will discuss a real PDF exploit: a stack-based buffer overflow in CoolType.dll (CVE-2010-2883). You can retrieve it from the Metasploit module exploit/windows/fileformat/adobe_cooltype_sing.

In order to bypass DEP, this exploit makes use of Heap Spraying to run its ROP shellcode. Our goal is to get as close as possible to the point where the vulnerability is triggered, so one clever thing to do is to use a Pintool to detect the ROP itself.

To do that, we can simply check whether the instruction executed after a RET is located right after a CALL, but be aware that this test alone can lead to false positives. A better test would be to verify whether this check triggers three times in a row, but this gives rise to some problems with Pin that we will discuss later.
Another method to detect ROP is to monitor the ESP register and look for the "0c0c0c0c" value, but inspecting the register with Pin is very slow and will degrade the performance of your Pintool, so we won't implement this one.
Finally, one last check is to log the "POP ESP" instruction, which is a common ROP gadget employed right before the ROP shellcode itself.
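
The "three times in a row" idea can be sketched as a small counter that does not depend on Pin at all. This is only an illustration under assumptions: the names "RopStreak" and "OnReturn" and the default threshold of 3 are hypothetical, and the caller is assumed to already know, for each executed RET, whether it landed right after a CALL.

```cpp
#include <cassert>

// Hypothetical helper: counts consecutive RETs whose target is not
// preceded by a CALL and flags a likely ROP chain at a threshold.
struct RopStreak {
    unsigned threshold;   // suspicious RETs in a row required to raise the flag
    unsigned streak;      // current run of suspicious RETs

    explicit RopStreak(unsigned t = 3) : threshold(t), streak(0) { assert(t > 0); }

    // Call once per executed RET. Returns true while the current run of
    // suspicious returns is at least `threshold` long.
    bool OnReturn(bool landedAfterCall) {
        streak = landedAfterCall ? 0 : streak + 1;
        return streak >= threshold;
    }
};
```

A single legitimate return resets the counter, so isolated false positives never reach the threshold, while a ROP chain (one suspicious RET per gadget) reaches it after three gadgets.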

Detecting the ROP with a Pintool

Here is the function to detect the ROP:

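#define LAST_EXECUTED 1000

UINT32 LastExecutedPos = 0;
UINT32 LastExecutedBuf[LAST_EXECUTED];   // circular buffer of the last executed EIPs
UINT32 PreviousOpcode;
char TempString[12];

// Read the two bytes located __dist bytes before the current EIP
#define     PREV_OPCODE(__dist) (((UINT16*)(AddrEip - (__dist)))[0])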

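typedef struct _OPC_CHECK
{
 UINT8 Delta;     // distance in bytes from the return address back to the opcode
 UINT16 Opcode;   // first two bytes of the CALL encoding, read as a little-endian word
} OPC_CHECK;

OPC_CHECK OpcCheck[] =
{
 {6, 0x15ff}, {2, 0x12ff}, {2, 0x11ff}, {2, 0x13ff}, {2, 0x17ff}, {2, 0x16ff}, {2, 0x10ff},
 {3, 0x55ff}, {3, 0x50ff}, {3, 0x51ff}, {3, 0x52ff}, {3, 0x53ff}, {4, 0x54ff}, {3, 0x55ff},
 {3, 0x56ff}, {3, 0x57ff}, {3, 0x59ff}, {6, 0x95ff}, {6, 0x97ff}, {6, 0x76ff}, {6, 0x96ff},
 {6, 0x94ff}, {6, 0x93ff}, {6, 0x92ff}, {6, 0x91ff}, {6, 0x90ff}, {7, 0x14ff}, {7, 0x94ff},
 {3, 0x14ff}, {4, 0x54ff}, {2, 0xd0ff}, {2, 0xd1ff}, {2, 0xd2ff}, {2, 0xd3ff}, {2, 0xd4ff},
 {2, 0xd5ff}, {2, 0xd6ff}, {2, 0xd7ff}, {0, 0}   // {0, 0} terminates the list
};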

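// Format Value as 8 uppercase hex digits into String (needs at least 9 bytes)
char* QuickDwordToString(char *String, UINT32 Value)
{
 int i;
 UINT32 TempVal = Value;
 UINT8 TempByte;

 for(i = 0; i < 8; i++)
 {
  TempByte = (TempVal & 0xF) + 0x30;   // '0'..'9'
  if(TempByte > 0x39) TempByte += 7;   // 'A'..'F'
  String[7-i] = TempByte;
  TempVal >>= 4;
 }
 String[8] = '\0';

 return String;
}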

VOID DetectPopEsp(ADDRINT AddrEip, UINT32 Opcode)
{
 UINT32 i, k;

 // The previous instruction was a RET: is the address we landed on preceded by a CALL?
 if(PreviousOpcode == 557 &&   // int for RET
  AddrEip < 0x70000000 &&      // ignore high addresses (system DLLs in this setup)
  ((UINT8*)(AddrEip-5))[0] != 0xE8)   // no direct CALL right before
 {
  k = 0;
  while(OpcCheck[k].Delta != 0)
  {
   if( PREV_OPCODE(OpcCheck[k].Delta) == OpcCheck[k].Opcode)
    break;   // found a known indirect CALL encoding
   k++;
  }

  // terminator reached: no known CALL encoding precedes AddrEip
  if(OpcCheck[k].Delta == 0)
   fprintf(OutTrace, "%s RETurned here, but not after call\n", QuickDwordToString(TempString, AddrEip));
 }

 if(Opcode == 486)   // int for POP
 {
  if(((UINT8*)AddrEip)[0] == 0x5C)   // 0x5C is the encoding of POP ESP
  {
   fprintf(OutTrace, "%s  POP ESP DETECTED!!\n", QuickDwordToString(TempString, AddrEip));
   fprintf(OutTrace,"Dumping list of previously executed EIPs \n");
   // dump last executed buffer on file, oldest entries first
   for(i = LastExecutedPos; i < LAST_EXECUTED; i++)
    fprintf(OutTrace, "%s\n", QuickDwordToString(TempString, LastExecutedBuf[i]));
   for(i = 0; i < LastExecutedPos; i++)
    fprintf(OutTrace, "%s\n", QuickDwordToString(TempString, LastExecutedBuf[i]));
   fprintf(OutTrace, "%s\n", QuickDwordToString(TempString, AddrEip));
  }
 }

 LastExecutedBuf[LastExecutedPos] = AddrEip;
 LastExecutedPos++;
 if(LastExecutedPos >= LAST_EXECUTED)
 {
  // circular logging
  LastExecutedPos = 0;
 }

 PreviousOpcode = Opcode;
}

Include it in the source code of the basic Pintool provided in the first part of the article and use the following line:


in the "Instruction()" function to call the "DetectEip()" function before every instruction is executed.

Also, add these lines:

UINT32 Opcode;

va_list VaList;
va_start( VaList, AddrEip);

Opcode = va_arg(VaList, UINT32);

va_end(VaList);

DetectPopEsp(AddrEip, Opcode);

in the "DetectEip()" function (where specified by the comments).
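
For readers less familiar with variadic arguments, here is a minimal standalone sketch of the same pattern, outside of Pin. The function name "ReadOpcodeArg" is hypothetical and only stands in for "DetectEip()"; the assumption is that the caller passes the opcode as a 32-bit integer right after the address.

```cpp
#include <cstdarg>
#include <cstdint>

// Stand-in for the analysis routine: the address is a named parameter,
// the opcode travels as an extra variadic argument.
std::uint32_t ReadOpcodeArg(std::uintptr_t addrEip, ...)
{
    va_list vaList;
    va_start(vaList, addrEip);                             // start after the last named parameter
    std::uint32_t opcode = va_arg(vaList, std::uint32_t);  // the caller must really pass a 32-bit integer here
    va_end(vaList);                                        // every va_start needs a matching va_end
    return opcode;
}
```

Calling "ReadOpcodeArg(0x4A82A714, 557u)" returns 557. Note that va_arg cannot check what was actually passed, so the type given to it has to match the argument supplied at the call site.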

Now a brief description of what the code does. Basically, this Pintool looks for two opcodes: the one corresponding to RET (Pin code 557) and the one corresponding to POP (Pin code 486).

If the previous instruction was a RET, the Pintool looks at the bytes preceding the address it returned to and checks whether they encode a CALL, looking for the E8 opcode or the ones provided in the "OpcCheck[].Opcode" array (the list may not be complete, but while testing it was reasonably accurate). If they don't, it notifies the user with the message: "*Address* RETurned here, but not after call".
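
As an alternative to a hand-collected table, the indirect CALL encodings can be recognised by decoding the ModRM byte: in 32-bit code, "FF /2" (opcode FF with the ModRM reg field equal to 2) is a near indirect CALL. The following is a sketch under assumptions, not the method used by the Pintool above: the function names are hypothetical, and it ignores prefixes, far calls and 16-bit addressing. Like the table, it only matches bytes backwards, so it cannot prove that those bytes were really executed as a CALL.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Length in bytes of an "FF /2" (CALL r/m32) instruction in 32-bit mode,
// or 0 if the ModRM byte does not select CALL.
static std::size_t IndirectCallLength(std::uint8_t modrm, std::uint8_t sib)
{
    const std::uint8_t mod = modrm >> 6;
    const std::uint8_t reg = (modrm >> 3) & 7;
    const std::uint8_t rm  = modrm & 7;

    if (reg != 2) return 0;                         // not CALL r/m32
    if (mod == 3) return 2;                         // CALL reg
    std::size_t len = 2;                            // opcode + ModRM
    if (rm == 4) len += 1;                          // a SIB byte follows
    if (mod == 1) len += 1;                         // disp8
    else if (mod == 2) len += 4;                    // disp32
    else if (rm == 5) len += 4;                     // mod 0: [disp32]
    else if (rm == 4 && (sib & 7) == 5) len += 4;   // mod 0: SIB without base, disp32
    return len;
}

// True if the bytes that end exactly at buf[retOff] decode as a near CALL.
bool ReturnsAfterCall(const std::uint8_t* buf, std::size_t retOff)
{
    assert(buf != nullptr);
    if (retOff >= 5 && buf[retOff - 5] == 0xE8) return true;   // CALL rel32

    static const std::size_t kLens[] = {2, 3, 4, 6, 7};        // possible FF /2 lengths
    for (std::size_t len : kLens) {
        if (retOff < len) continue;
        const std::uint8_t* p = buf + retOff - len;
        if (p[0] != 0xFF) continue;
        const std::uint8_t sib = (len >= 3) ? p[2] : 0;
        if (IndirectCallLength(p[1], sib) == len) return true;
    }
    return false;
}
```

For example, the bytes FF D0 (CALL EAX) or FF 15 followed by a 32-bit address (CALL [addr]) placed right before the return address are accepted, while the gadget bytes preceding the POP ESP at 4A82A714 are not.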

If a POP is encountered, it checks whether it is a "POP ESP" and, if so, it notifies the user by printing "*Address* POP ESP DETECTED!!" and dumps the last executed instructions to file.
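
The "last executed instructions" log is a circular buffer: new addresses overwrite the oldest ones, and the dump has to start from the oldest entry. A minimal standalone sketch of that idea is shown below. The class name "EipHistory" and its methods are hypothetical, and unlike the Pintool code it uses a std::vector and remembers whether the buffer has wrapped, so that unused slots are not dumped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fixed-size history of the most recently executed addresses.
class EipHistory {
public:
    explicit EipHistory(std::size_t capacity)
        : buf_(capacity, 0), pos_(0), filled_(false) { assert(capacity > 0); }

    // Record one address, overwriting the oldest entry once the buffer is full.
    void Push(std::uint32_t eip) {
        buf_[pos_] = eip;
        if (++pos_ == buf_.size()) { pos_ = 0; filled_ = true; }
    }

    // Return the stored addresses ordered from oldest to newest.
    std::vector<std::uint32_t> Snapshot() const {
        std::vector<std::uint32_t> out;
        if (filled_) out.assign(buf_.begin() + pos_, buf_.end());   // oldest part first
        out.insert(out.end(), buf_.begin(), buf_.begin() + pos_);   // then the newest part
        return out;
    }

private:
    std::vector<std::uint32_t> buf_;
    std::size_t pos_;   // next slot to write
    bool filled_;       // true once the buffer has wrapped at least once
};
```

With a capacity of 3, pushing 1, 2, 3 and 4 leaves 2, 3, 4 in the snapshot, which mirrors the two "for" loops used for the dump in "DetectPopEsp()".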

That's it. You are finally ready to compile the Pintool and run it within Adobe Acrobat Reader to analyse the PDF exploit.

Analyzing the output

Here is an excerpt from the output produced by the Pintool:

Exception handler address: 7C91EAEC 
Starting Pintool
Loading module C:\Programmi\Adobe\Reader 9.0\Reader\AcroRd32.exe 
Main exe Base: 00400000  End: 00453FFF
Loading module C:\WINDOWS\system32\kernel32.dll 
Module Base: 7C800000 
Module end: 7C8FEFFF 
Loading module C:\WINDOWS\system32\ntdll.dll 
Module Base: 7C910000 
Module end: 7C9C5FFF 
Starting thread 0
0D6D8192 RETurned here, but not after call
02D43FA5 RETurned here, but not after call
22326DB0 RETurned here, but not after call
5B18174F RETurned here, but not after call
08171CF0 RETurned here, but not after call
08171D47 RETurned here, but not after call
06066EED RETurned here, but not after call
0633DE6B RETurned here, but not after call
4A82A714 RETurned here, but not after call
Dumping list of previously executed EIPs 

From the log above we can see all the modules being loaded and threads being created. Then, we notice some false positives: these are legitimate RETs, which don't return to an instruction after a CALL.
Finally, we get to the part where both checks trigger: the code returns to an instruction not located after a CALL and a "POP ESP" instruction is executed.

In particular, the last logged EIPs correspond to the following ROP gadgets:

 4A80CB38   81C5 94070000    ADD EBP,794
 4A80CB3E   C9               LEAVE
 4A80CB3F   C3               RETN

 4A82A714   5C               POP ESP
(4A82A715   C3               RETN)

So we have located where the exploit occurs (i.e. the address "0808B308"): not bad!

Note that the last instruction reported here (the RETN between parentheses) is not logged by the Pintool because a crash happened right after its execution... but...


As I said before, this exploit makes use of Heap Spraying. In particular, we can see it by debugging Adobe Acrobat Reader while Pin is not instrumenting it and setting a breakpoint on address "0808B308". Now, if we open the PDF exploit and leave the debugger running, we can inspect the memory when the code hits the breakpoint:

This is exactly what we were expecting: you can see the ROP shellcode at "0c0c0c0c" and the Heap Spraying all around. On the other hand, if we debug Adobe Acrobat Reader while Pin is instrumenting it, we obtain:

So... no ROP, nor Heap Spraying... but the blocks of memory are still allocated. Who has allocated them?
To get the answer we need to look inside the code window:

... It's Pin itself!
Pin allocates a lot of memory to perform binary instrumentation, and it also occupies the addresses usually employed by the Heap Spraying. This means that when the ROP shellcode is executed, it is not located where it is supposed to be, and this results in Adobe Acrobat Reader crashing.

Another problem I ran into is that, even when I modified the Pintool to force the exploit to work with the shellcode placed at an address other than 0x0C0C0C0C, the exploit still crashed.
This time I could see it run the whole ROP shellcode, which allocates a block of executable memory, copies itself to it and then jumps to it.

However, this executable shellcode (not ROP) tried to decrypt (and therefore overwrite) itself, causing a memory access violation and making the instrumented shellcode crash.

I haven't investigated the problem yet, but it seems that the instrumented shellcode is placed in a read-only area, so the self-decryption fails when writing the decrypted bytes back to the shellcode memory.
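
To make the failure mode concrete, here is what a self-decrypting stub does, written as plain C++. This is purely illustrative: the post does not say which encoder the shellcode uses, so the single-byte XOR scheme, the key and the name "DecodeInPlace" are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical single-byte XOR decoder. It reads every byte of the payload
// and writes the result back to the same address, so the memory holding
// the payload must be writable.
void DecodeInPlace(std::uint8_t* payload, std::size_t size, std::uint8_t key)
{
    assert(payload != nullptr || size == 0);
    for (std::size_t i = 0; i < size; ++i)
        payload[i] ^= key;   // this store faults if the page is read-only
}
```

On a writable buffer this simply restores the original bytes. If the same loop runs over memory mapped without write permission, the first store raises an access violation, which matches the crash described above.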


  1. This is really cool. What are your thoughts about also detecting effective ROP without using ret, but calls/jmps instead? Like in

  2. Thank you for the feedback! :) That's a good point, in the case of JOP the detection technique would be very different... maybe I will write a dedicated entry about it ;)

  3. How did you get all the opcodes for the call instruction in the OpCheck array?

  4. I encountered them in the binaries I tested, but you can obtain the whole set of opcodes from Intel's manuals :)

  5. Hi, interesting article, thanks. Did you still find that memory layout was disturbed when you enable -separate_memory 1 in pin?

    I'm not sure how well that's going to work in a pathological case like a heap spray, but would be interested to know.