📚 OpenRCE is preserved as a read-only archive. Launched at RECon Montreal in 2005. Registration and posting are disabled.









 Forums >>  Brainstorms - General  >>  Halvar's Wishlist

Topic created on: August 2, 2006 17:59 CDT by drew.

At BlackHat this morning Halvar talked about a handful of reverse engineering ideas on his wishlist:

#0 have the tool tell you what a function/code block may return
#1 more data structure analysis, diagram how structures relate to each other
#2 reconstruct class info
#3 group functions into modules.  perhaps write an IDA plugin to enable manual grouping
#4 recover template info.
#5 generate input that will reach a specific code location
#6 automate analysis of translation-and-emulation protection schemes
#7 reduce code blocks to normal form
- use to defeat most polymorphic engines
#8 add the concept of order to callgraphs, so you can see what order functions are called in
#9 get info on the order functions must be called in.
- i.e. connect() before send() before close()
#10 semantic-based signatures/descriptions of libraries
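
As an illustration of #9, a required call order can be written down as a table of prerequisites and checked against an observed call trace. This is only a sketch: `REQUIRED_BEFORE` and `check_call_order` are hypothetical names, the table encodes just the connect()/send()/close() example, and a real trace would have to come from a debugger or emulator.

```python
# Hypothetical sketch: check an observed call trace against a required order.
# Each API maps to the set of APIs that must already have been called.
REQUIRED_BEFORE = {
    "send": {"connect"},
    "close": {"connect"},
}


def check_call_order(trace):
    """Return a list of (index, call, missing_prerequisites) violations."""
    seen = set()
    violations = []
    for index, call in enumerate(trace):
        missing = REQUIRED_BEFORE.get(call, set()) - seen
        if missing:
            violations.append((index, call, sorted(missing)))
        seen.add(call)
    return violations
```

Recovering the table itself from a binary is the hard part of the wishlist item; the check above only shows what the recovered information could be used for.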

I might take a crack at a few of them.  I'll probably start off with creating a framework in IDA for #3 (grouping functions into modules).

Anyone else interested in working on any of his ideas?

  aeppert     August 2, 2006 18:56.55 CDT
Absolutely.  I already have a start on a few.

Aaron

  drew     August 2, 2006 19:21.37 CDT
Awesome.  Which ones are you working on, or worked on?  I know of a few commercial and private tools that provide coverage of some of the ideas.  HBGary's tools come to mind.

I also left one idea off...

#Other: integration and communication between RE tools.

This is a big one.  Pedram's had grand visions of this for a while and PaiMei is a big step there.  Halvar mentioned that he's developing a database standard for RE tools to integrate together.  Maybe we can convince him to post some of his details and thoughts on the standard.

  aeppert     August 2, 2006 19:24.57 CDT
#10 - sadly nothing for release as I am still debating how to best utilize it.

Plus, for the integration and communication between tools, I have some ideas in mind and some code working in this direction.  Same goes for this, debating a SABRE-esque situation here.

(Incidentally, will you be over at the Hard Rock tonight?)

  igorsk     August 2, 2006 19:49.00 CDT
I've been sort of working on #2... Some of it will be in part 2 of my article when/if I finish it -_- (the working title is "RTTI and classes recovery")

  luis   August 2, 2006 21:02.43 CDT
I've also been working on #2.
By using runtime analysis you can feed results back into the static analysis.

You find vtables and set a breakpoint on all methods. Once a method breakpoint is hit, an xref is added in IDA.
If a vtable contains pure_virtual (sp?) functions it is obviously a parent class.
If 2 vtables point to the same method at the same vtable offset they are related.
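
The two vtable heuristics above can be sketched offline, assuming the vtables have already been extracted as lists of method addresses. Everything named here is an assumption for illustration: `PURE_VIRTUAL` stands in for the address of the compiler's pure-virtual stub, and the vtable names and addresses in the usage note are made up.

```python
from collections import defaultdict

# Assumed address of the pure-virtual stub; it differs per binary.
PURE_VIRTUAL = 0x00401000


def related_vtables(vtables):
    """Return pairs of vtable names that share a method at the same slot."""
    slot_owners = defaultdict(set)
    for name, methods in vtables.items():
        for slot, address in enumerate(methods):
            # Skip the stub: every abstract class shares it, so it says
            # nothing about which classes are related to each other.
            if address != PURE_VIRTUAL:
                slot_owners[(slot, address)].add(name)
    pairs = set()
    for owners in slot_owners.values():
        ordered = sorted(owners)
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                pairs.add((first, second))
    return sorted(pairs)


def abstract_candidates(vtables):
    """Return vtables that contain a pure-virtual slot (likely parents)."""
    return sorted(n for n, m in vtables.items() if PURE_VIRTUAL in m)
```

For example, with `{"vt_A": [PURE_VIRTUAL, 0x401100], "vt_B": [0x401200, 0x401100]}` the two vtables are reported as related and `vt_A` as the parent candidate.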

In ollydbg when a register points to a string, ollydbg dereferences the string and displays it. Type information like strings would be useful for objects.
One option would be to set breakpoints at the exit of constructors, keeping a table of memory addresses for all new objects. Similarly the entry of destructors would have breakpoints in order to remove objects from this memory table.
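
A minimal sketch of that bookkeeping, assuming the breakpoint handlers emit a stream of events. The event names `ctor_exit` and `dtor_entry` and the tuple layout are hypothetical; a real implementation would read the object address from a register or the stack inside the handler.

```python
def track_objects(events):
    """Replay (kind, address, class_name) events into a live-object table."""
    live = {}
    for kind, address, class_name in events:
        if kind == "ctor_exit":
            # Constructor finished: the object at this address is now live.
            live[address] = class_name
        elif kind == "dtor_entry":
            # Destructor starting: forget the object (ignore unknown ones).
            live.pop(address, None)
    return live
```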

A second option would be to check whether the first dword of an object points to a vtable and use that as the type.
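
This second option could look roughly like the following, assuming a 32-bit little-endian target where the vtable pointer sits at offset 0 of the object. The `known_vtables` mapping (vtable address to class name) is assumed to have been built beforehand, for example from the static vtable scan; the function name is made up.

```python
import struct


def guess_type(object_bytes, known_vtables):
    """Read the first little-endian dword and look it up as a vtable address."""
    if len(object_bytes) < 4:
        return None
    (first_dword,) = struct.unpack_from("<I", object_bytes, 0)
    # Returns None when the dword is not a known vtable address.
    return known_vtables.get(first_dword)
```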

I haven't implemented any of these ideas yet, as they come from my re_ideas.txt file.

If I have time I'll expand what I have on my blog.

  pedram     August 2, 2006 21:57.50 CDT
> This is a big one.  Pedram's had grand visions of this for a while and PaiMei is a big step there.  Halvar mentioned that he's developing a database standard for RE tools to integrate together.  Maybe we can convince him to post some of his details and thoughts on the standard.

I have a lot of development still planned for PaiMei and a few excited contacts who wish to contribute. Hopefully it can continue taking the necessary steps towards this goal. Part of the solution requires increased peer review and awareness. For example, today sitting in Sherri's SideWinder talk, I immediately thought of how perfect PaiMei is for building the tool they demo-ed. All the required functionality is there:

- debugger with breakpoint capabilities
- static disassembly with basic block enumeration
- graph manipulation (to locate the paths between start/end as well as the "rejection" nodes)
- code coverage tracking, for the fitting routine
- real-time graphing
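
To make the graph-manipulation item concrete, here is a rough sketch over a plain adjacency map rather than PaiMei's own graph classes. It assumes "rejection" nodes means blocks reachable from the start that can no longer reach the end; that reading, and the function names, are assumptions rather than anything taken from the SideWinder talk.

```python
from collections import deque


def reachable(graph, start):
    """Breadth-first search: set of nodes reachable from start (inclusive)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for successor in graph.get(node, ()):
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen


def split_nodes(graph, start, end):
    """Return (on_path, rejection) node sets for paths from start to end."""
    reversed_graph = {}
    for node, successors in graph.items():
        for successor in successors:
            reversed_graph.setdefault(successor, []).append(node)
    forward = reachable(graph, start)        # everything start can reach
    backward = reachable(reversed_graph, end)  # everything that can reach end
    return forward & backward, forward - backward
```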

I'm going to sit with Sherri tomorrow and will be curious to hear her thoughts.

Regarding the schema, Ero has given me a rundown on it and from what I understand it will indeed be released soon. I plan on migrating PIDA over to the SQL schema and making data access on demand vs. at startup to solve the massive memory footprint issue. The interface to PIDA as far as scripting is concerned will not change.
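
The on-demand idea can be illustrated with SQLite. The schema has not been published, so the `functions` table, its columns, and the `LazyFunctions` wrapper below are entirely made up; the point is only that rows are fetched when a script first asks for them instead of being loaded at startup.

```python
import sqlite3


class LazyFunctions:
    """Fetch function rows from the database only when they are requested."""

    def __init__(self, connection):
        self.connection = connection
        self.cache = {}

    def __getitem__(self, address):
        if address not in self.cache:
            row = self.connection.execute(
                "SELECT name, num_blocks FROM functions WHERE address = ?",
                (address,),
            ).fetchone()
            if row is None:
                raise KeyError(address)
            self.cache[address] = {"name": row[0], "num_blocks": row[1]}
        return self.cache[address]


def build_demo_database():
    """Create an in-memory database with a made-up two-row functions table."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE functions "
        "(address INTEGER PRIMARY KEY, name TEXT, num_blocks INTEGER)"
    )
    connection.executemany(
        "INSERT INTO functions VALUES (?, ?, ?)",
        [(0x401000, "sub_401000", 3), (0x401050, "sub_401050", 7)],
    )
    return connection
```

Because lookups still go through `funcs[address]`, scripts written against a dictionary-style interface would not need to change.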

  tagetora   August 3, 2006 03:08.19 CDT
I have been dealing with #3 using a naming convention, things like ModuleName_FuncName. It's not a nice way to do it (oh, yes, it's horrible) so I'm looking for new methods.

I used the naming convention because I usually find code where the modules weren't mixed up, I mean:

[ Code for Module A ] : range 0x00400200 - 0x00401A00
[ Code for Module B ] : range 0x00401A70 - 0x00405000
...

so all the crypto funcs are in the same "block", and the same for sockets/comm, GUI, config handling, etc... and with the naming, it's easy to browse with IDA and see where a module seems to begin or end.
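
When modules really do sit in contiguous address ranges like that, the prefixing can be automated. A small sketch, using the two ranges from the listing above (treated as end-exclusive) and hypothetical helper names; in IDA the final rename would go through the scripting API rather than returning a string.

```python
import bisect

# Module layout taken from the two example ranges; end addresses exclusive.
MODULE_RANGES = [
    (0x00400200, 0x00401A00, "ModuleA"),
    (0x00401A70, 0x00405000, "ModuleB"),
]
STARTS = [start for start, _end, _name in MODULE_RANGES]


def module_for(address):
    """Return the module whose range contains address, or None."""
    index = bisect.bisect_right(STARTS, address) - 1
    if index < 0:
        return None
    _start, end, name = MODULE_RANGES[index]
    return name if address < end else None


def prefixed_name(address, function_name):
    """Apply the ModuleName_FuncName convention when a module is known."""
    module = module_for(address)
    return f"{module}_{function_name}" if module else function_name
```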

Well, if my help is appreciated, just "take it" ;D

  sp     August 3, 2006 07:00.49 CDT
> #0 have the tool tell you what a function/code block may return

What does that mean? Was he talking about types of return values or sets of possible return values? I've got something like this planned.

> #2 reconstruct class info

He's been talking about that since at least 1999 but apparently the problem is non-trivial. ;)

> #3 group functions into modules.  perhaps write an IDA plugin to enable manual grouping

I wonder if IDA enforces a linear disassembly. If you can move around blocks of code this might be relatively easy to do. If not you can at least create a window which allows "virtual" grouping (you basically have a tree structure you can put functions in). This shouldn't be too hard.
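
The "virtual" grouping could be backed by something as simple as the tree below. This is a sketch of the data structure only, with made-up names; the window that displays it and the IDA integration are left out.

```python
class Group:
    """A folder-like node holding function addresses and child groups."""

    def __init__(self, name):
        self.name = name
        self.functions = []
        self.children = {}

    def add(self, path, address):
        """Place a function under the group named by path, creating groups."""
        node = self
        for part in path:
            if part not in node.children:
                node.children[part] = Group(part)
            node = node.children[part]
        node.functions.append(address)

    def walk(self, prefix=()):
        """Yield (group_path, address) pairs, depth first."""
        path = prefix + (self.name,)
        for address in self.functions:
            yield "/".join(path), address
        for name in sorted(self.children):
            yield from self.children[name].walk(path)
```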

I'd really like to see an easy-to-use code coverage plugin for IDA which produces nice HTML output or something. Maybe you're familiar with the BullsEye code coverage tool for C++. I'd like to see something like that for IDA. I'm aware of Ilfak's code coverage plugin but unfortunately it produces a whole lot of weird errors here and basically doesn't work. I wanted to shoot him an email soon to ask him about that.

  Piotr     August 3, 2006 09:07.46 CDT

I'm using emulation and a couple of other things :)
Not only for security research, also for some _evil_ purposes :)

  RolfRolles     August 3, 2006 14:07.30 CDT
> tagetora: I have been dealing with #3 using name convention, things like ModuleName_FuncName. It's not a nice way to do it (oh, yes, it's horrible) so I'm looking for new methods.

That's how I do it too, manually, and it's the same format that my modularization plugin uses to name the functions.  Given that I wrote this and gave it to Halvar last year, I'm puzzled as to why it's on his list.

  tthtlc     August 4, 2006 04:10.00 CDT
For #3, it would be helpful to produce some kind of a visualization tool:

http://kernelmapper.osdn.com/map.php?x=0&y=0&zoom=4

And several others are available - just go to Google Images and search "windows kernel".

The logic here is that if several functions are calling each other, then most likely they are implementing some kind of common functionality.

But if a function is called almost randomly from everywhere in the program or system, then most likely it is a common (shared) function.

Another idea is to identify all functions that call Win32 APIs.   Since Win32 APIs implement well-known, documented functionality, the purpose of each caller can immediately be guessed/deduced - to a certain degree.   And most likely THESE are the common functions - for example, take a look at mshtml.dll.   Then there will be another class of functions that do not call any Win32 API at all, but call the above common functions instead.   These are the functions that implement PART OF THE modules.

And as for identifying the HIGH LEVEL MODULES themselves - it is easy - just look at the MAIN LOOP at the start, or take a look at the GUI main loop handler - the highest-level handlers for mouse clicks usually each implement a particular FEATURE, which in our case corresponds to A MODULE OR FUNCTIONALITY.   For example, traversing Opera's GUI->Feeds->ReadFeeds--> we can immediately identify the START OF THE CHUNK that implements the ReadFeed() functionality.   And an IDA Pro plugin should name the function accordingly.

So the entire concept of function grouping is based on identifying the LINKS - either a MAIN link, a common-function link, or a Win32-API-related link (each of which can be identified with the public documentation of the Win32 API itself) etc.   And THE IMPORTANT IDEA IS THIS:  JUST PURELY BY DRAWING LINKS ALONE - who is calling whom - you can immediately see that groupings of functions exist - with library functions characteristically HAVING MANY INCOMING LINKS, but NO/ZERO OUTGOING LINKS.

E.g., if a function calls Win32 OpenFile(), and no other function, then I will name the function MyOpenFile().
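
That naming rule is easy to state in code. A sketch, assuming the callees of each function and a set of known API names are already available; `wrapper_name` is a hypothetical helper, not an existing plugin function.

```python
def wrapper_name(callees, known_apis):
    """Suggest "My<Api>" when a function calls exactly one known API."""
    unique = set(callees)
    if len(unique) == 1:
        (only,) = unique
        if only in known_apis:
            return "My" + only
    # Calls several things, or calls something that is not a known API.
    return None
```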

This leads to another idea - the number of incoming/outgoing links.   Normally, functions with MANY OUTGOING LINKS ARE THE MAIN FUNCTIONS - for example, GUI event handlers.   And those with MANY INCOMING LINKS, ESPECIALLY ONES CALLED SEEMINGLY AT RANDOM BY EVERYONE, are general functions.   But if a function is called frequently only from a specific set of functions, then it is a special library function.

Hope these sound logical and not insane?
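
The link-counting heuristic can be tried on any caller-to-callees map. In this sketch the `many` threshold and the three labels are arbitrary assumptions chosen for illustration; they would need tuning against real binaries.

```python
def degrees(call_graph):
    """Return {function: (incoming, outgoing)} from a caller -> callees map."""
    incoming = {}
    outgoing = {}
    for caller, callees in call_graph.items():
        unique = set(callees)  # count each caller/callee link once
        outgoing[caller] = len(unique)
        incoming.setdefault(caller, 0)
        for callee in unique:
            incoming[callee] = incoming.get(callee, 0) + 1
            outgoing.setdefault(callee, 0)
    return {name: (incoming[name], outgoing[name]) for name in incoming}


def classify(call_graph, many=3):
    """Label functions by link counts; the threshold is an assumption."""
    labels = {}
    for name, (n_in, n_out) in degrees(call_graph).items():
        if n_in >= many and n_out == 0:
            labels[name] = "library-like"      # many callers, calls nothing
        elif n_out >= many:
            labels[name] = "dispatcher-like"   # e.g. a GUI event handler
        else:
            labels[name] = "unclassified"
    return labels
```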

  Clandestiny   August 8, 2006 09:37.11 CDT
> pedram: > This is a big one.  Pedram's had grand visions of this for a while and PaiMei is a big step there.  Halvar mentioned that he's developing a database standard for RE tools to integrate together.  Maybe we can convince him to post some of his details and thoughts on the standard.
>
> I have a lot of development still planned for PaiMei and a few excited contacts who wish to contribute. Hopefully it can continue taking the necessary steps towards this goal. Part of the solution requires increased peer review and awareness. For example, today sitting in Sherri's SideWinder talk, I immediately thought of how perfect PaiMei is for building the tool they demo-ed. All the required functionality is there:
>
> - debugger with breakpoint capabilities
> - static disassembly with basic block enumeration
> - graph manipulation (to locate the paths between start/end as well as the "rejection" nodes)
> - code coverage tracking, for the fitting routine
> - real-time graphing
>
> I'm going to sit with Sherri tomorrow and will be curious to hear her thoughts.
>

Hiya Pedram,

Sorry we didn't manage to catch up at Black Hat. I've d/l PaiMei and am very excited about the possibilities of porting our tool to your development platform. I'm in the process of installing the required tools.  What version of IDA do I need to generate the PIDA files? I think I'm using 4.8 right now. After I get everything set up, I'll be in touch via email :)

Sherri

  pedram     August 8, 2006 15:57.31 CDT
Regarding the version of IDA: I believe 4.9+ is required due to a call to is_retn_insn() (or something along those lines). In the next release I'll replace that call with an x86 equivalent so that people such as yourself, who have yet to update, don't get an error.

IDA 5 has an awesome graph interface though, you should upgrade ;-)
