📚 OpenRCE is preserved as a read-only archive. Launched at RECon Montreal in 2005. Registration and posting are disabled.








Path Optimization for Partial Deobfuscation

Topic created on: February 3, 2008 19:46 CST by daniellewis.

So I was contemplating the task of deobfuscating code and realized, as many of you probably have, that it's essentially an optimization problem.

If one minimizes the code path, the algorithm becomes clearer, faster and smaller.  This is still rather vague to me, so please excuse my verbosity.

While lots of compilers try to optimize, the real solution, for all intents and purposes, is to optimize the binary form.

If one were to trace through the graph as the program is running, generating an output instruction for each new instruction that runs (regardless of where it came from in memory), and then minimize its nodes by appending each node to its predecessor wherever it is only entered from one point, that would be a helpful algorithm.
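A minimal sketch of that trace-and-merge step, in Python. Everything here is an assumption made for illustration: the trace is a list of (address, text) pairs in execution order, `build_blocks` is a hypothetical name, and a node is only appended to its predecessor when that predecessor also has a single observed successor, so that merged blocks stay straight-line.

```python
from collections import defaultdict

def build_blocks(trace):
    """trace: non-empty list of (address, text) pairs in execution order."""
    text = {}                      # address -> text, in first-executed order
    succs = defaultdict(set)       # observed control-flow edges
    preds = defaultdict(set)
    prev = None
    for addr, insn in trace:
        text.setdefault(addr, insn)   # emit each instruction only once
        if prev is not None:
            succs[prev].add(addr)
            preds[addr].add(prev)
        prev = addr

    entry = trace[0][0]

    def is_head(addr):
        # A node starts a new block unless it is entered from exactly one
        # point and that point leads nowhere else.
        if addr == entry or len(preds[addr]) != 1:
            return True
        (p,) = preds[addr]
        return len(succs[p]) != 1

    blocks = []
    for addr in text:
        if not is_head(addr):
            continue
        block, cur = [addr], addr
        while len(succs[cur]) == 1:
            (nxt,) = succs[cur]
            if is_head(nxt):
                break
            block.append(nxt)         # append the single-entry node
            cur = nxt
        blocks.append(block)
    return blocks, text

trace = [
    (0x401000, "xor eax, eax"), (0x401002, "jmp 0x402000"),
    (0x402000, "inc eax"), (0x402001, "cmp eax, 2"), (0x402004, "jne 0x402000"),
    (0x402000, "inc eax"), (0x402001, "cmp eax, 2"), (0x402004, "jne 0x402000"),
    (0x402006, "ret"),
]
blocks, text = build_blocks(trace)
for block in blocks:
    print([text[a] for a in block])
# ['xor eax, eax', 'jmp 0x402000']
# ['inc eax', 'cmp eax, 2', 'jne 0x402000']
# ['ret']
```

Only edges that were actually executed are known, so a branch that never fired during the trace looks like straight-line code. The sketch also keeps the `jmp` that glued the first two pieces together; dropping it would belong to the next step.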

The next step would be to remove unnecessary instructions.  Trace the graph backwards.  For any change in memory or code flow, accept each instruction that eventually affects that change in memory or code flow.  You do this backwards so that:

mov al, 3
and cl, al
jmp $+cl

will accept the "mov al, 3" because it affects cl, which affects the jump.  To do this, you can store a bit for each register and another for each flag, marking it as affecting a later change.
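The backward pass with one bit per register can be sketched roughly as follows, in Python. The `Insn` record, the `keep_needed` name, and the hand-written read/write masks are all hypothetical; a real tool would derive the masks from a disassembler. The extra "mov bl, 7" is a junk instruction added here only to show something being dropped, and each register name is treated as an independent location (no al/ax/eax aliasing).

```python
from dataclasses import dataclass

# One bit per tracked location (assumed set, just enough for the example).
AL, BL, CL, FLAGS = 1, 2, 4, 8

@dataclass(frozen=True)
class Insn:
    text: str
    reads: int = 0        # bitmask of locations read
    writes: int = 0       # bitmask of locations fully overwritten
    effect: bool = False  # changes memory or code flow

def keep_needed(insns, live_out=0):
    """Walk backwards, keeping instructions that feed a later effect."""
    live = live_out       # bits that something later still depends on
    kept = []
    for insn in reversed(insns):
        if insn.effect or (insn.writes & live):
            kept.append(insn)
            # Its outputs are now satisfied; its inputs become needed.
            live = (live & ~insn.writes) | insn.reads
    kept.reverse()
    return kept

code = [
    Insn("mov bl, 7", writes=BL),                          # junk (added here)
    Insn("mov al, 3", writes=AL),
    Insn("and cl, al", reads=CL | AL, writes=CL | FLAGS),
    Insn("jmp $+cl", reads=CL, effect=True),
]
for insn in keep_needed(code):
    print(insn.text)
# mov al, 3
# and cl, al
# jmp $+cl
```

The `live_out` parameter is where knowledge about code after the sequence would go: passing `live_out=BL` keeps "mov bl, 7" as well.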

This would optimize out purely unnecessary instructions, without considering possible alternatives.  It would still be a major breakthrough, as you could take ANY program and generate the simplified code flow without junk instructions.

Thoughts?

  jms     February 3, 2008 22:45.45 CST
Sounds sweet, but implementing it may be a task. :)

  daniellewis     February 5, 2008 20:01.26 CST
Yeah, I'm already working on writing the smallest, fastest JavaScript interpreter in the world.  I'd chalk that up to also being a significant task.

I think this is a greater priority in terms of its usefulness, but I'm already halfway in and my brain is full of solutions for it that I need to put down.

If nobody else figures it out before I'm done, this will probably be my next project.

I gave it some thought though, and you'd need some user input sometimes; like what do you do with "jmp eax" when eax isn't clearly defined?  

IDA's answer: ask the user.  Probably a good bet.

Regards,
Dan

  daniellewis     February 7, 2008 18:25.59 CST
To follow up with myself, in my own little thread all by myself...

I've since written a table-driven flat disassembler and assembler almost equivalent to ndisasm in JavaScript.  Probably would have been better to just retool Bochs, but I don't know how.

  PSUJobu     February 8, 2008 06:20.47 CST
At this rate you'll have an IDA Pro replacement by next week!  ;-)  Just kidding, of course, but your project sounds interesting...

  daniellewis     February 11, 2008 23:25.01 CST
The problem with IDA Pro's analyzer is that it fixates on the original linked layout as the "one you want".  I'm more interested in the state of the instructions at the moment they get executed, and the state of the data at the moment it gets touched.

It's "just a matter of" logging EIP at every instruction, and sometimes pointer locations, and then mapping them to a cleaner layout.

  NicoDE     February 19, 2008 14:59.22 CST
> daniellewis:
> The next step would be to remove unnecessary instructions.
> [...]
> mov al, 3
> and cl, al
> jmp $+cl

"mov al, 3" is necessary if AL is used by (at least one) code sequence that the JMP might execute.
That's the point where it starts getting complicated.
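A small Python sketch of why this complicates things: whether a register is still needed after the jump depends on the union of what every possible target reads before overwriting it. The function names and the (reads, writes) tuples are hypothetical, and each target is modelled as a single straight-line block with nothing live at its end, which a real analysis would have to iterate over the whole graph.

```python
def live_in(block):
    """Registers a block reads before it overwrites them.

    block: list of (reads, writes) sets, one pair per instruction.
    """
    live = set()
    for reads, writes in reversed(block):
        live = (live - writes) | reads
    return live

def live_after_jump(targets, all_regs):
    """Registers that must be preserved across the jump.

    targets: list of candidate target blocks, or None when the
    destination cannot be resolved (assume everything is needed).
    """
    if targets is None:
        return set(all_regs)
    out = set()
    for block in targets:
        out |= live_in(block)
    return out

REGS = {"al", "bl", "cl", "dl"}
target_a = [({"dl", "al"}, {"dl"})]                 # add dl, al
target_b = [(set(), {"al"}), ({"al"}, {"cl"})]      # mov al, 0 ; mov cl, al

print(sorted(live_after_jump([target_a, target_b], REGS)))  # ['al', 'dl']
print(sorted(live_after_jump([target_b], REGS)))            # []
print(sorted(live_after_jump(None, REGS)))                  # ['al', 'bl', 'cl', 'dl']
```

With an unresolved computed jump the conservative answer is "everything is live", which is exactly the case where very little can be removed.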

Because input data is often hard to predict, it might end up with only some unnecessary instructions removed: those that are not of interest for the code flow.

Another point is code/behavior that is not implemented in the target itself (e.g. SEH).


Just my 3 cents :)
