📚 OpenRCE is preserved as a read-only archive. Launched at RECon Montreal in 2005. Registration and posting are disabled.









A database for intrinsic functions

Topic created on: June 27, 2005 23:48 CDT by hoglund.

This is just an idea, but it might be useful to put into the OpenRCE database a list of functions that can be intrinsic to a binary, for example strcmp, strtok, etc.  Often these can be identified using a simple byte match, but a database of these signatures might be useful.  It might help people who have utilities to 'dress' a binary, similar to FLIRT from Datarescue I guess.  If someone knows of an already compiled database of such a thing, please let me know :-)
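A minimal sketch of the simple byte match described above, assuming a tiny hand-made signature table. The names `SIGNATURES`, `matches_at` and `scan` are hypothetical, and the byte patterns are illustrative only, not taken from a verified library build. `None` marks a wildcard byte for operands that vary between builds.

```python
# Hypothetical signature table for illustration only.
# None is a wildcard (e.g. an immediate or relocated address that varies).
SIGNATURES = {
    "strlen_like": [0x8B, 0x4C, 0x24, 0x04, 0xF7, 0xC1, None, None, None, None],
    "memset_like": [0x8B, 0x54, 0x24, 0x0C, 0x8B, 0x4C, 0x24, 0x04],
}


def matches_at(code: bytes, offset: int, pattern) -> bool:
    """Return True if pattern matches code at offset, honoring wildcards."""
    if offset + len(pattern) > len(code):
        return False
    return all(p is None or code[offset + i] == p for i, p in enumerate(pattern))


def scan(code: bytes, signatures=SIGNATURES):
    """Return a sorted list of (offset, name) for every signature hit."""
    hits = []
    for name, pattern in signatures.items():
        # Slide the pattern over every possible start offset.
        for offset in range(len(code) - len(pattern) + 1):
            if matches_at(code, offset, pattern):
                hits.append((offset, name))
    return sorted(hits)


# Two padding bytes, a body matching "strlen_like", then a ret.
blob = bytes([0x90, 0x90, 0x8B, 0x4C, 0x24, 0x04, 0xF7, 0xC1,
              0x03, 0x00, 0x00, 0x00, 0xC3])
print(scan(blob))
```

A linear scan like this is fine for a handful of signatures; a shared database with many entries would probably want an index keyed on the leading bytes instead.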

  hoglund     June 27, 2005 23:50.26 CDT
And a way for me to get back to my post and edit all my fat-fingering would be nice too :-)

  drew     June 28, 2005 00:04.27 CDT
Open source, user-contributed FLIRT signatures perhaps?  Not a bad idea.
And an obligatory FLIRT link:
http://www.datarescue.com/idabase/flirt.htm

  dennis     June 28, 2005 00:10.41 CDT
I think J.C.Roberts' IDB2PAT plugin would be a great help.

  JCRoberts     June 28, 2005 02:09.44 CDT
Funny, just yesterday Pedram was asking me about the usefulness of IDB2PAT for building a library of malware function signatures to ease the analysis of variants.

A limitation/feature of IDA signatures is function length. When you've got lots of very short functions, the odds of a pattern collision are greatly increased (i.e. two functions with the same signature).

Ilfak made a (wise) design decision when creating the FLIRT/FLAIR features in IDA. It only uses the first "X" bytes of a function (I've forgotten what "X" is but could probably look it up), so the resulting signatures are truncated. This adds to the collision problem (i.e. two functions that start off the same way but end differently) and decreases accuracy somewhat, but on the bright side it greatly increases the speed of application and greatly decreases the storage space required to hold all the signatures. If I remember correctly, an odd side effect of truncation and decreased accuracy is that it can compensate for the use of various compile-time optimizations.
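The truncation trade-off can be shown with a small sketch. This is not how FLIRT is implemented; `prefix_len` is a hypothetical parameter standing in for the "X" above, and the function bodies in `FUNCTIONS` are made up. The sketch only shows that a shorter prefix produces more collisions.

```python
from collections import defaultdict


def truncated_signature(body: bytes, prefix_len: int) -> bytes:
    """Keep only the first prefix_len bytes of a function body."""
    return body[:prefix_len]


def find_collisions(functions: dict, prefix_len: int) -> dict:
    """Group function names by truncated signature and return only the
    groups that contain more than one function (the collisions)."""
    groups = defaultdict(list)
    for name, body in functions.items():
        groups[truncated_signature(body, prefix_len)].append(name)
    return {sig: sorted(names) for sig, names in groups.items() if len(names) > 1}


# Made-up bodies: func_a and func_b share a 4-byte prologue, then diverge.
FUNCTIONS = {
    "func_a": bytes([0x55, 0x8B, 0xEC, 0x51, 0x33, 0xC0, 0xC9, 0xC3]),
    "func_b": bytes([0x55, 0x8B, 0xEC, 0x51, 0x8B, 0x45, 0x08, 0xC9, 0xC3]),
    "func_c": bytes([0x33, 0xC0, 0xC3]),
}

print(find_collisions(FUNCTIONS, prefix_len=4))  # func_a and func_b collide
print(find_collisions(FUNCTIONS, prefix_len=8))  # longer prefix separates them
```

The same effect explains why very short functions are troublesome: when the whole body fits inside the prefix, there are few bytes left to tell two functions apart.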

When you're dealing with libraries (and complete signatures of *EVERY* byte in them), the trade-off of speed versus storage space (multiple GiB) is worth it.

Though I've never personally used it, in the commercial BinDiff plugin from Sabre-Security, Halvar uses a novel approach to reach similar identification ends, namely graph theoretic analysis. For the pedantically challenged, "graph theoretic analysis" means building a signature of the function flows.  A painful over-simplification would be: if Myfunct() calls strcmp three times then calls strlen twice in executable #1, the odds are that if I find a similar flow in executable #2, it might be the same function. Some forms of conditional execution can cause trouble for this method, but you can still get a reasonable degree of accuracy. On the other hand, it would work like a dream on executables with tons of tiny functions.
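The over-simplified example above (three calls to strcmp, two to strlen) can be sketched as a comparison of call profiles. This is not BinDiff's algorithm; it is a stand-in that ignores call order and control flow entirely. The helper names and the `sub_...` candidates are hypothetical.

```python
from collections import Counter


def call_signature(calls):
    """Summarize a function by how often it calls each callee (order ignored)."""
    return Counter(calls)


def similarity(sig_a: Counter, sig_b: Counter) -> float:
    """Multiset Jaccard similarity: shared calls / combined calls, in [0, 1]."""
    union = sum((sig_a | sig_b).values())
    if union == 0:
        return 0.0  # no calls on either side: nothing to compare
    return sum((sig_a & sig_b).values()) / union


def best_match(target_calls, candidates: dict):
    """Return (name, score) of the candidate whose call profile is closest.
    Assumes candidates is non-empty."""
    target = call_signature(target_calls)
    scored = [(similarity(target, call_signature(calls)), name)
              for name, calls in candidates.items()]
    score, name = max(scored)
    return name, score


# Myfunct() as seen in executable #1.
myfunct = ["strcmp", "strcmp", "strcmp", "strlen", "strlen"]

# Unnamed functions found in executable #2 (made-up data).
exe2 = {
    "sub_401000": ["strcmp", "strcmp", "strcmp", "strlen", "strlen"],
    "sub_401050": ["malloc", "free"],
    "sub_4010A0": ["strcmp", "strlen"],
}

print(best_match(myfunct, exe2))
```

A profile like this says nothing about functions that make no calls at all, which is one reason a real tool would look at the shape of the flow graph and not just the callee counts.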

The real questions to answer are, "how accurate/detailed do you want the identification to be?" and "what kind of compute/storage resources will you require?" Once you have a disassembly with function start/end info, the problem falls into the world of "embarrassingly parallel" and could be run on a cluster.
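As a rough sketch of why this is "embarrassingly parallel": each function can be looked up without any shared state, so the work fans out over a pool with no coordination. The table `KNOWN_PREFIXES` and both helper names are hypothetical. A thread pool keeps the example short; for CPU-bound matching on real data, a process pool or a cluster job queue would be the more likely choice.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical signature table: prefix bytes -> name (made up for illustration).
KNOWN_PREFIXES = {
    bytes([0x55, 0x8B, 0xEC, 0x51]): "known_func_1",
    bytes([0x33, 0xC0, 0xC3]): "return_zero",
}


def identify(function):
    """Match one (start, body) pair against the table.
    Reads only its own input and the read-only table."""
    start, body = function
    for prefix, name in KNOWN_PREFIXES.items():
        if body.startswith(prefix):
            return start, name
    return start, None


def identify_all(functions, workers=4):
    """Fan the independent per-function lookups out over a worker pool
    and collect the results as {start_address: name_or_None}."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(identify, functions))


# (start address, body bytes) pairs as a disassembler might export them.
functions = [
    (0x401000, bytes([0x55, 0x8B, 0xEC, 0x51, 0xC9, 0xC3])),
    (0x401010, bytes([0x33, 0xC0, 0xC3])),
    (0x401020, bytes([0x90])),
]

print(identify_all(functions))
```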

JCR

  2GooD     June 29, 2005 15:01.12 CDT
By "intrinsic" function, do you mean "inlined"?

(Btw Greg, your surname sounds very Swedish if I change the 'o' to an '

  corner640     August 23, 2010 03:39.35 CDT
trying to get sth. useful.
