Created: Friday, December 25 2009 13:45.13 CST  
Code release: C-subset compiler in Objective Caml
Author: RolfRolles # Views: 7786

Here is the source code for a compiler that I wrote in Objective Caml this semester, for a subset of the C language.  It requires a standalone C->IR translator, which is not included in this release because the school owns the copyright on that particular piece of code.  Consequently, this compiler cannot be used to compile C programs until you write a C front end; for practical purposes, an existing compiler such as MSVC or GCC would be a better choice anyway.

The portion of the code that I wrote (everything except bitv.ml and bitv.mli) totals roughly 3200 lines.  It includes two optimizations based on classical data-flow analysis:  constant propagation and dead-statement elimination.  It also supports translation into and out of static single assignment (SSA) form, as well as two optimizations based on SSA:  constant propagation and loop-invariant code motion.  The code for the graph-coloring register allocator is not included.  As a back-end, the compiler produces C code that can be compiled by any other C compiler.
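To give a flavor of what constant propagation does, here is a minimal sketch over a single straight-line block.  It is not taken from the release:  the types expr and stmt and the functions fold and propagate are hypothetical names invented for this illustration, and the real optimization runs a data-flow analysis over a control-flow graph rather than one linear pass.

```ocaml
(* Hypothetical toy IR for illustration; not the IR used in the release. *)
module Env = Map.Make (String)

type expr =
  | Const of int
  | Var of string
  | Add of expr * expr
  | Mul of expr * expr

type stmt = Assign of string * expr

(* Substitute known constants for variables and fold constant operands. *)
let rec fold env = function
  | Const n -> Const n
  | Var x ->
    (match Env.find_opt x env with
     | Some n -> Const n
     | None -> Var x)
  | Add (a, b) ->
    (match fold env a, fold env b with
     | Const m, Const n -> Const (m + n)
     | a', b' -> Add (a', b'))
  | Mul (a, b) ->
    (match fold env a, fold env b with
     | Const m, Const n -> Const (m * n)
     | a', b' -> Mul (a', b'))

(* Walk a straight-line block, tracking which variables hold constants.
   A variable assigned a non-constant value is dropped from the map. *)
let propagate (block : stmt list) : stmt list =
  let step (env, acc) (Assign (x, e)) =
    let e' = fold env e in
    let env' =
      match e' with
      | Const n -> Env.add x n env
      | _ -> Env.remove x env
    in
    (env', Assign (x, e') :: acc)
  in
  let _, rev = List.fold_left step (Env.empty, []) block in
  List.rev rev
```

For example, given a = 2; b = a + 3; c = b * z, this sketch rewrites the second statement to b = 5 and the third to c = 5 * z, leaving z alone because its value is unknown.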

The code is a pretty good example of how to structure medium-sized programs in functional programming languages.  I tried to adopt a pure functional style throughout most of the codebase.  Sometimes this failed:  cfg_ir.ml is unnecessarily ugly, and should have been rewritten in an imperative style with mutability; also, I made the mistake of using a mutable array to describe Phi values in static single assignment form, whereas pure functional style would have dictated a list.  But those are my only complaints about the code; overall, I'm pretty pleased with how it all turned out.
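The array-versus-list point about Phi values can be sketched as follows.  These type and function names (var, phi_mutable, phi_pure, rename_arg_mutable, rename_arg_pure) are hypothetical and are not the definitions used in the release; the sketch only shows the general difference between updating Phi arguments in place and rebuilding them.

```ocaml
(* Hypothetical SSA types for illustration only. *)
type var = string * int  (* base name, SSA version number *)

(* Mutable flavor: arguments live in an array and are overwritten in place. *)
type phi_mutable = { dst_m : var; args_m : var array }

(* Pure flavor: arguments live in a list; an update builds a new record. *)
type phi_pure = { dst : var; args : var list }

(* Overwrites argument i; every holder of phi sees the change. *)
let rename_arg_mutable (phi : phi_mutable) (i : int) (v : var) : unit =
  phi.args_m.(i) <- v

(* Returns a fresh phi with argument i replaced; the original is untouched. *)
let rename_arg_pure (phi : phi_pure) (i : int) (v : var) : phi_pure =
  { phi with args = List.mapi (fun j a -> if j = i then v else a) phi.args }
```

The pure version costs a list traversal per update, but earlier versions of the Phi node stay valid, which keeps it consistent with the rest of a purely functional codebase.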

This code is substantially more sophisticated than the compiler that I wrote to break VMProtect, so if you can read and understand this release, you should be in good shape for breaking virtualization obfuscators.


Blog Comments
qaysel Posted: Sunday, January 3 2010 19:37.12 CST
Thanks, it's very interesting and educational to me. I got it to build after installing ocamlgraph and fudging a bit with the Makefile. The source is quite readable, and with the sample code at http://txt.pastebin.com/f13254815 I'm able to understand the IR syntax. Thanks again.

RolfRolles Posted: Tuesday, January 5 2010 15:19.49 CST
Good point -- I have re-uploaded the package and included an "ir" directory with a collection of samples of C->IR translations.  Thanks.


