📚
OpenRCE
is preserved as a read-only archive. Launched at RECon Montreal in 2005. Registration and posting are disabled.
About
Articles
Book Store
Distributed RCE
Downloads
Event Calendar
Forums
Live Discussion
Reference Library
RSS Feeds
Search
Users
What's New
Customize Theme
bluegrey
blackgreen
metal
simple
Flag:
Tornado!
Hurricane!
Login:
Password:
Remember Me
Register
Blogs
>>
tagetora
's Blog
Created: Tuesday, January 23 2007 13:20.35 CST
Modified: Tuesday, January 23 2007 13:51.26 CST
Printer Friendly ...
Function Search Engine (or whatever)
Author:
tagetora
# Views:
1599
Well, here is the first draft about how will work the "Function Search Engine". BTW, I'm looking for a name for this thing (something better that FSE or FUSEN). Okay, here we go!
Concept
--------------
The funtion matching will be using different levels of filters. I mean, there will be a group of first-level filters who will compound the core of the search engine. Also, the 2nd, 3rd... level filters will be used to "refine" the results. Every filter will be taking care of a function attribute, a set of attributes or will represent a relation between parts of the function (here is where we will face with some graph analysis and/or some other abstract representation of the function).
I build a first draft about what goes on every filter-level. It may be considered pre-alpha xD because some attributes will flow from one level to another, new attributes may appear and others may disapear, based in the tests done.
First level filters:
- Set of external references.
- Set of internal constant values.
- Stack size.
- Abstract graph representation
Second level filters:
- Number of parameters.
- Set of types (parameters and/or local variables).
Brief description of main filters
----------------------------------------------------
External References: The data used by the function (strings, constant values, data blocks such as a vector or a matrix, floating point numbers, etc.). The data a function uses says a lot about its functionality. Sometimes a simple Internet search save us a lot of time.
Constant values: Same thing as above but taking care of common used constants like 0xFFFFFFFF, 0x0, etc. that will include a lot of noise in the system. Also this references only constant values that are embeded in the function, I mean, they aren't outside the function's code.
Stack size: Just look at the function preamble (push-mov-sub). It will tell us about local variables size. I know that the same function can use static buffers or allocate them, I'm trying to get the best way to manage this.
Abstract graph: This is the "bindiff-like" graph representation of the function. It will be the hardest part of the implementation process, so we will focus on it once we have a good general desing. I've been talking with the
radare
developer about this, because he wants to add a graphical diff to his tool. Maybe we can get a base desing in a couple of weeks (I know, my time estimations are not quite accurate, but...)
Testing
--------------
Regarding the tests, at the begining we will be "good boys" and will use short and (of course) well known funcions, but that are very similar. Let's think in what makes different toupper() from tolower(), strcpy() and strncpy(). At the same time, trying that different implementations of these functions match each together.
For the final tests, it will be interesting to use the same function but compiled for different architectures (if the system finally supports it).
That's all folks!
-------------------------
As a final word, only talk about a concept I have been discussing with a friend. We were talking about the abstract graph representation and why not to use a "Divide and Conquer" desing, so we began to discuss about to use all this filters and techniques but with "basic function blocks" instead of taking the entire function. With this approach, we can ask ourselves: the definition of the basic blocks behaviour defines the global function behaviour? I think it's a nice question, but too hard to answer at this moment... or maybe not ;D
Using this "basic block" approach we face with new problems like how to define what's a "basic block", how this blocks change using different compiler flags, if we need to ignore certain types of blocks (like heap init or clean blocks), etc... Well, while I was writing this I see that Ero posted another interesting entry in his blog about "basic blocks" ;D
"I have promises to keep,
And miles to go before I sleep."
Add New Comment
Comment:
There are
31,328
total registered users.
Recently Created Topics
[help] Unpacking VMP...
Mar/12
Reverse Engineering ...
Jul/06
let 'IDAPython' impo...
Sep/24
set 'IDAPython' as t...
Sep/24
GuessType return une...
Sep/20
About retrieving the...
Sep/07
How to find specific...
Aug/15
How to get data depe...
Jul/07
Identify RVA data in...
May/06
Question about memor...
Dec/12
Recent Forum Posts
Finding the procedur...
rolEYder
Question about debbu...
rolEYder
Identify RVA data in...
sohlow
let 'IDAPython' impo...
sohlow
How to find specific...
hackgreti
Problem with ollydbg
sh3dow
How can I write olly...
sh3dow
New LoadMAP plugin v...
mefisto...
Intel pin in loaded ...
djnemo
OOP_RE tool available?
Bl4ckm4n
Recent Blog Entries
halsten
Mar/14
Breaking IonCUBE VM
oleavr
Oct/24
Anatomy of a code tracer
hasherezade
Sep/24
IAT Patcher - new tool for ...
oleavr
Aug/27
CryptoShark: code tracer ba...
oleavr
Jun/25
Build a debugger in 5 minutes
More ...
Recent Blog Comments
nieo
on:
Mar/22
IAT Patcher - new tool for ...
djnemo
on:
Nov/17
Kernel debugger vs user mod...
acel
on:
Nov/14
Kernel debugger vs user mod...
pedram
on:
Dec/21
frida.github.io: scriptable...
capadleman
on:
Jun/19
Using NtCreateThreadEx for ...
More ...
Imagery
SoySauce Blueprint
Jun 6, 2008
[+] expand
View Gallery
(11) /
Submit