OpenRCE

📚 OpenRCE is preserved as a read-only archive. Launched at RECon Montreal in 2005. Registration and posting are disabled.


Created: Monday, September 29 2008 01:39.00 CDT

Modified: Wednesday, October 1 2008 07:01.08 CDT


Improving Binary Comparison (and its implication for malware classification)

Author: halvar


I am at Virus Bulletin in Ottawa -- if anyone wants to meet to see our new stuff, please drop mail to [email protected] ! :)

It has been a while since I posted here -- partially because I had a lot of work to finish, partially because, after having finished all this work, I took my first long vacation in a ... very long while.

So I am back, and there are a number of things that I am happy to blog about. First of all, I now have in writing that I officially hold an MSc in Mathematics. For those who care about obscure things like extending the Euclidean algorithm to the ring of Boolean functions, you can check the thesis here:
http://www.zynamics.com/files/Diplomarbeit.Thomas.Dullien.Final.pdf

For those that are less crazy about weird computational algebra: Our team here at zynamics has made good progress on improving the core algorithms behind BinDiff further. Our stated goal was to make BinDiff more useful for symbol porting: If you have an executable and you suspect that it might contain a statically linked library for which you have source access (or which you have analyzed before), we want BinDiff to be able to port the symbols into the executable you have, even if the compiler versions and build environments differ significantly, and even if the versions of the library are not quite the same.

Why is this important? Let's say you're disassembling some piece of network hardware, and you find an OpenSSL string somewhere in the disassembled image. For example, you're disassembling an old PIX image (6.34, perhaps) and see the string

OpenSSL 0.9.5a 1 Apr 2000

This implies that PIX contains OpenSSL, and that the guys at Cisco probably backported any fixes to OpenSSL to the 0.9.5a version. Now, it would be fantastic if we could do the following: Compile OpenSSL 0.9.5a with full symbols on our own machine, and then "pull-in" these symbols into our PIX disassembly.

While this was sometimes possible with the BinDiff v2.0 engine (and v2.1, which is still essentially the same engine), the results were often lacking in both speed and accuracy. A few months back, Soeren and I went back to the drawing board and thought about the next generation of our diffing engine -- with specific focus on the ability to compare executables that are "far from each other", that differ significantly in build environments etc. and that only share small parts of their code. The resulting engine (dubbed "DiffDeluxe" by Soeren) is significantly stronger at this task.

Why did the original BinDiff v2 engine perform poorly? There are a number of reasons for this, but the primary ones are the devastating impact that a "false match" can have on further matches in the diffing process, and the fact that in the described scenarios most of the executable is completely different and only small portions match. The old engine had a tendency to match a few of the "unrelated components" of each executable, and these initial incorrect matches led to further bad matching down the road.
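To make the "one bad match poisons the rest" effect concrete, here is a minimal sketch of a toy matcher that grows its result from already-matched pairs by comparing their callees. This is not BinDiff's or DiffDeluxe's algorithm: the function names, the `sub_...` labels, and the (basic blocks, edges, calls) signature are all assumptions invented for illustration.

```python
from collections import Counter, deque


def propagate(seed, callees_a, callees_b, sig_a, sig_b):
    """Grow a set of matches from one seed pair by walking the call graphs.

    A callee of an already-matched pair is matched when its signature is
    unique among the unmatched callees on both sides. Toy logic only.
    """
    matches = {seed[0]: seed[1]}
    used_b = {seed[1]}
    work = deque([seed])
    while work:
        a, b = work.popleft()
        cand_a = [f for f in callees_a.get(a, []) if f not in matches]
        cand_b = [f for f in callees_b.get(b, []) if f not in used_b]
        count_a = Counter(sig_a[f] for f in cand_a)
        count_b = Counter(sig_b[f] for f in cand_b)
        by_sig_b = {sig_b[f]: f for f in cand_b}
        for f in cand_a:
            s = sig_a[f]
            if count_a[s] == 1 and count_b[s] == 1:
                g = by_sig_b[s]
                matches[f] = g
                used_b.add(g)
                work.append((f, g))
    return matches


# Hypothetical library with symbols (side A).
callees_a = {"ssl_read": ["bn_add", "sha_update"]}
sig_a = {"ssl_read": (12, 17, 2), "bn_add": (3, 3, 0), "sha_update": (5, 6, 1)}

# Hypothetical stripped image (side B): sub_1000 is the real counterpart,
# sub_9000 is unrelated code whose small callees happen to look the same.
callees_b = {
    "sub_1000": ["sub_1100", "sub_1200"],
    "sub_9000": ["sub_9100", "sub_9200"],
}
sig_b = {
    "sub_1000": (12, 17, 2), "sub_1100": (3, 3, 0), "sub_1200": (5, 6, 1),
    "sub_9000": (12, 17, 2), "sub_9100": (3, 3, 0), "sub_9200": (5, 6, 1),
}

good = propagate(("ssl_read", "sub_1000"), callees_a, callees_b, sig_a, sig_b)
bad = propagate(("ssl_read", "sub_9000"), callees_a, callees_b, sig_a, sig_b)
```

With the correct seed, all three functions land on their real counterparts. With the wrong seed, the matcher just as happily assigns every callee to unrelated code, and nothing in the propagation step notices. Small functions with coinciding signatures are exactly what you find in abundance when most of the two executables is unrelated.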

This doesn't mean the BinDiff v2 engine isn't probably the best all-round diffing engine you can find (I think it is, even if some early builds of v2.0 suffered from silly performance issues -- those of you who are still plagued by this, please contact support@ for a fix!) -- but for this particular problem some old architectural assumptions had to be thrown overboard.

Anyhow, to cut a long story short: While the results generated by DiffDeluxe aren't perfect yet, they are very promising. Let's follow our PIX/OpenSSL scenario:

DiffDeluxe operates with two "fuzzy" values for each function match: "Similarity" and "Confidence". Similarity indicates how successful the matching algorithm was in matching basic blocks and instructions within the two functions, and confidence indicates how "certain" DiffDeluxe is that this match is a correct one. This is useful for sorting the "good" matches from the "bad" ones, and for inspecting results before porting comments/names. Anyhow, let's look at some high-confidence matches:
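As a rough illustration of how the two scores can be used to triage results before porting names, here is a minimal sketch. The `FunctionMatch` record, the `triage` helper, the sample names, addresses, and scores are all hypothetical; this is not DiffDeluxe's output format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionMatch:
    primary_name: str        # symbol in the library we compiled ourselves
    secondary_address: int   # address in the stripped disassembly
    similarity: float        # 0.0 .. 1.0
    confidence: float        # 0.0 .. 1.0


def triage(matches, min_similarity=0.9, min_confidence=0.9):
    """Return (trusted, needs_review), each sorted best-first."""
    ordered = sorted(
        matches, key=lambda m: (m.confidence, m.similarity), reverse=True
    )

    def is_trusted(m):
        return m.similarity >= min_similarity and m.confidence >= min_confidence

    trusted = [m for m in ordered if is_trusted(m)]
    needs_review = [m for m in ordered if not is_trusted(m)]
    return trusted, needs_review


# Made-up sample rows; only the 72% / 92% pair echoes a figure from the post.
example_matches = [
    FunctionMatch("ERR_put_error", 0x3000, 0.25, 0.20),
    FunctionMatch("BN_mul", 0x2000, 0.72, 0.92),
    FunctionMatch("SHA1_Update", 0x1000, 0.98, 0.99),
]

trusted, needs_review = triage(example_matches)
```

Names from `trusted` could be ported more or less blindly, while `needs_review` is the list you would walk through by hand, starting from the top.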

Well, one doesn't need to be a rocket scientist to see that these functions match. But in many situations, the similarity between two functions is not 100% evident: The following is a matched function with only 72% similarity (but 92% confidence):

So what is the overall result? Out of the 3977 functions which we had in libcrypto.so, we were able to match 1780 in our PIX disassembly -- but with a big caveat: A significant number of these have very low similarity and confidence scores. This isn't surprising: The differences between the compiler used to build our PIX image (sometime 6 years ago?) and the compiler we used (gcc 4.1, -O3) are drastic. All in all, we end up with around 250 high-confidence matches -- which is not too bad considering that we don't know how many functions from OpenSSL the PIX code actually contains.

To get a clearer idea of how well these algorithms perform, we need an example where we know that essentially the entire library has been statically linked in. For this, luckily, we have Adobe Reader :-)

With all the Adobe patches coming up, let's imagine we'd like to have a look at the Javascript implementation in Acrobat Reader. It can be found in Escript.api. Now, I always presume that everybody else is as lazy as me, so I can't imagine Adobe wrote their own Javascript implementation. But when Adobe added Javascript to Acrobat Reader, there were few public implementations of Javascript around -- essentially only the engine that is nowadays known as "SpiderMonkey", i.e. the Mozilla Javascript engine. So I compiled SpiderMonkey into "libjs.so" on my Linux machine and disassembled Escript.api. Then I ran DiffDeluxe. The result:

Escript contains about 9100 functions, libjs.so contains about 1900. After running the diff, we get 1542 matches. Let's start verifying how "good" these matches are. As discussed above, DiffDeluxe uses a "similarity" and "confidence" score to rate matches. We get 203 matches with similarity and confidence above 90% -- for these functions, we can more or less blindly assume the matches are correct. If we have any doubts, we can inspect them:

Well, there is little question that this match was accurate.

The interesting question is really: How low can we go similarity- and confidence-wise before the results start deteriorating too badly? Let's go low -- to similarities below 40%. Take, for example, the js_ConcatStrings match.

Manual inspection of the screenshot on the right will show that the code performs equivalent tasks, but that hardly any instructions remain identical.

Proceeding further down the list of matches, it turns out that results start deteriorating once both confidence and similarity drop below 0.3 -- but we have around 950 matches with higher scores, i.e. we have successfully identified 950 functions in Escript.api. While this is significantly less than the 1900 functions that we perhaps could have identified, it is still pretty impressive: After all, we do not know which exact version of SpiderMonkey was used to compile Escript.api, and significant changes could have been made to the code.
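For reference, the ratios behind these numbers can be worked out in a few lines. The counts are the ones quoted in this post (several of them are "about" or "around" figures, so the percentages are approximate as well); the `percent` and `summarize` helpers and the dictionary keys are just names chosen for this sketch.

```python
def percent(part, whole):
    """Percentage of `whole` covered by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


# Counts as quoted in the post.
pix = {"library_functions": 3977, "raw_matches": 1780, "usable_matches": 250}
escript = {"library_functions": 1900, "raw_matches": 1542, "usable_matches": 950}


def summarize(run):
    return {
        # share of the library's functions that received any match at all
        "matched_pct": percent(run["raw_matches"], run["library_functions"]),
        # share of the library's functions with a match we are willing to trust
        "usable_pct": percent(run["usable_matches"], run["library_functions"]),
        # share of all matches that survive the score cut-off
        "usable_share_of_matches_pct": percent(
            run["usable_matches"], run["raw_matches"]
        ),
    }


pix_summary = summarize(pix)
escript_summary = summarize(escript)
```

For the Escript.api run, `usable_pct` comes out at 50.0. Note that for PIX the denominator is the size of libcrypto.so, not the (unknown) number of OpenSSL functions actually present in the image, so the PIX percentages are a lower bound on how well the matching did.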

Clearly, we're a long way from matching 95% -- but we're very close to the 50% barrier, and will work hard to improve the 50% to 75% and beyond :-)

Anyhow, what does all this have to do with automatic classification and correlation of malware?

I think the drastic differences induced by platform/compiler changes make it pretty clear that statistical measures that do not focus on the structure and semantics of the executable, but on some "simple" measure like instruction frequencies, fail. All the time. Behavioral methods might have a role to play, but they will not help you one bit if you acquire memory from a compromised machine, and they are trivially obfuscated by adding random noisy OS interaction.
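A toy comparison shows the kind of failure I mean. Everything below is hand-constructed to make the point, not measured from real binaries: the mnemonic lists, the control-flow edges, and the (basic blocks, edges, calls) fingerprint are assumptions for this sketch, and the fingerprint is just one simple structural measure, not a description of what DiffDeluxe does internally.

```python
import math
from collections import Counter


def mnemonic_cosine(a, b):
    """Cosine similarity between two instruction-frequency histograms."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[k] * cb[k] for k in ca.keys() & cb.keys())
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def cfg_shape(edges, calls):
    """Structural fingerprint: (basic blocks, edges, outgoing calls)."""
    nodes = {n for edge in edges for n in edge}
    return (len(nodes), len(edges), calls)


# Same hypothetical function, built by two different compilers.
build_a = ["push", "mov", "mov", "cmp", "jle", "add", "jmp", "pop", "ret"]
build_b = ["sub", "lea", "test", "jg", "lea", "inc", "jmp", "add", "ret"]
diamond = [(0, 1), (0, 2), (1, 3), (2, 3)]  # if/else shape, shared by both builds

# Unrelated hypothetical function that happens to use a similar instruction mix.
unrelated = ["push", "mov", "mov", "cmp", "jle", "add", "pop", "ret", "mov"]
loop = [(0, 1), (1, 1), (1, 2)]

same_code_score = mnemonic_cosine(build_a, build_b)
unrelated_score = mnemonic_cosine(build_a, unrelated)

shape_a = cfg_shape(diamond, calls=1)
shape_b = cfg_shape(diamond, calls=1)
shape_unrelated = cfg_shape(loop, calls=0)
```

On this toy data the instruction histogram rates the unrelated function as far closer to `build_a` than the second build of the very same code, while the structural fingerprint gets both answers right. Real optimizers change control flow too, of course, which is exactly why the scores above are fuzzy rather than exact.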

I am happy to kill two birds with one stone: By improving the comparison engine, I am making my life easier when I have to disassemble PIX -- and at the same time, I am improving our malware classification engine. Yay :-)

Anyhow, as mentioned above: I am at the Virus Bulletin conference -- if anyone wishes to have a chat or have our products demoed, please do not hesitate to send mail to [email protected].
