Flag: Tornado! Hurricane!

Blogs >> RolfRolles's Blog

Created: Friday, November 28 2008 01:34.12 CST Modified: Friday, November 28 2008 03:06.55 CST
Printer Friendly ...
Switch as Binary Search, Part 1
Author: RolfRolles # Views: 8172

We have seen how a compiler-generated binary search partitions the range of integers into successively smaller intervals and ultimately converging upon either a single value or one contiguous range of values.  We will now develop a tool to deal with this construct in a friendly fashion for the working reverse engineer.  

The tool has the following high-level specification:

* Given the first address of a compiled binary switch construct:
* * Determine the register used and whether the comparisons are signed (trivial)
* * Determine all terminal labels, at which cases reside, and all values that lead to these cases.
  
A discussion of the problem and its solution follows.

Upon entering the switch, the register value is entirely unconstrained:  the range of values that it might have are contained in the interval [0, MAX_INT] (or [-(MAX_INT+1)/2, MAX_INT/2] for signed integers).  The following example uses an unsigned integer.

Suppose that the first comparison is of the form

@loc0:
cmp eax, val1
jbe @loc1
@loc2:

If the jump is taken, the value must be below or equal to 'val1'.  In other words, it lies in the range [0, val1].  If the jump is not taken, the value must be above 'val1', so it lies in the range [val1+1, MAX_INT].  Therefore, each comparison in the tree partitions the range of possible values into two disjoint ranges:  one for those values where the jump is taken, and one for those where it is not.

At the second level of the tree, there will be two comparisons:  one for the case where the jump for the first was not taken, and one for the case where it was.  Consider the second set of such instructions:

@loc1:
cmp eax, val2
jbe @loc3
@loc4:


@loc2:
cmp eax, val3
jbe @loc5
@loc6:


Each of these cases further constrains the input:

* The range of values leading to loc3 is [0,         val2]
* The range of values leading to loc4 is [val2+1,    val1]
* The range of values leading to loc5 is [val1+1,    val3]
* The range of values leading to loc6 is [val3+1, MAX_INT]

This table has a very regular structure, and it should not be too hard to imagine what it would look like for three levels into the tree, or four, or...  The following image summarizes the process.  The unique path to a given vertex specifies the constraints required for input to reach it.



As we walk the comparisons in the tree in this fashion, the binary tree will eventually stop and give way to the code for the cases in the switch.  At this point, our constraints will be terminal cases of either single values (most commonly) or of simple ranges.  For each terminal address, we maintain a dictionary associating it with the corresponding ranges that lead there.

The following pseudocode codifies the discussion above.

// The first range returned is that for which the jump is taken;
// the second is for non-jump-taking values
partition(ea, low, high, compared)
{
  if(comparison at ea is ">")
    return [compared+1, high], [low, compared]

  if(comparison at ea is ">=")
    return [compared, high], [low, compared+1]

  if(comparison at ea is "<")
    return [low, compared-1], [compared, high]

  if(comparison at ea is "<=")
    return [low, compared], [compared+1, high]

  if(comparison at ea is "==")
    return [compared, compared], [low, high]

  if(comparison at ea is "!=")
    return [low, high], [compared, compared]
}


The following recursive algorithm solves the general problem.

analyze_bswitch(ea_t ea, int low, int high, int compared)
{
  if(ea is "cmp reg, constant")
    compared = instruction's immediate value
    ea = address of next instruction
  
  if(ea is a conditional jump)
    [low1,high2], [low2,high2] = partition(ea, low, high, compared)
    analyze_bswitch(jump taken ea, low1, high1, compared)
    analyze_bswitch(jump not taken ea, low2, high2, compared)
  
  // Instruction is a leaf in the binary tree
  else
    associate(ea, low, high)
}


IDAPython-1.0 compatible source code is available here.

The resulting disassembly is properly annotated with case labels, as IDA does normally:

AUTO:0046348E   jbe     loc_463D16      ; case 0A0h
AUTO:00463494   cmp     eax, 0A2h
AUTO:00463499   jb      loc_462FA0      ; case 0837F81BAh, 0837F81D8h, 0A1h, 091h
AUTO:0046349F   ja      loc_463D21      ; case 0837F90BAh, 0A3h
AUTO:004634A5 loc_4634A5:               ; CODE XREF: sub_462120:loc_462EDAj
AUTO:004634A5                           ; sub_462120+DDDj ...
AUTO:004634A5   push    ebx             ; case 0837F8106h, 0837F8124h, 0A2h, 031h
AUTO:004634A6   call    sub_45C630
AUTO:004634AB   jmp     loc_462217




Add New Comment
Comment:









There are 31,314 total registered users.


Recently Created Topics
[help] Unpacking VMP...
Mar/12
Reverse Engineering ...
Jul/06
hi!
Jul/01
let 'IDAPython' impo...
Sep/24
set 'IDAPython' as t...
Sep/24
GuessType return une...
Sep/20
About retrieving the...
Sep/07
How to find specific...
Aug/15
How to get data depe...
Jul/07
Identify RVA data in...
May/06


Recent Forum Posts
Finding the procedur...
rolEYder
Question about debbu...
rolEYder
Identify RVA data in...
sohlow
let 'IDAPython' impo...
sohlow
How to find specific...
hackgreti
Problem with ollydbg
sh3dow
How can I write olly...
sh3dow
New LoadMAP plugin v...
mefisto...
Intel pin in loaded ...
djnemo
OOP_RE tool available?
Bl4ckm4n


Recent Blog Entries
halsten
Mar/14
Breaking IonCUBE VM

oleavr
Oct/24
Anatomy of a code tracer

hasherezade
Sep/24
IAT Patcher - new tool for ...

oleavr
Aug/27
CryptoShark: code tracer ba...

oleavr
Jun/25
Build a debugger in 5 minutes

More ...


Recent Blog Comments
nieo on:
Mar/22
IAT Patcher - new tool for ...

djnemo on:
Nov/17
Kernel debugger vs user mod...

acel on:
Nov/14
Kernel debugger vs user mod...

pedram on:
Dec/21
frida.github.io: scriptable...

capadleman on:
Jun/19
Using NtCreateThreadEx for ...

More ...


Imagery
SoySauce Blueprint
Jun 6, 2008

[+] expand

View Gallery (11) / Submit