Flag: Tornado! Hurricane!

Blogs >> RolfRolles's Blog

Created: Friday, December 25 2009 13:45.13 CST  
Printer Friendly ...
Code release: C-subset compiler in Objective Caml
Author: RolfRolles # Views: 10159

Here is the source code for a compiler that I wrote in Objective Caml this semester, for a subset of the C language.  It requires a standalone C->IR translator which is not included in this release, as the school owns the copyright on that particular piece of code.  Hence one cannot immediately use this compiler to compile C programs without writing a C front end; anyway, an existing compiler such as MSVC or GCC would be a better choice.

The portion of the code that I wrote (everything except bitv.ml and .mli) totals roughly 3200 lines of code.  It includes two optimizations based on classical data-flow analysis, constant propagation and dead-statement elimination.  It also supports translation into and out of static single assignment form, as well as two optimizations based on SSA:  constant propagation and loop-invariant code motion.  The code for the graph-coloring register allocator is not included.  As a back-end, the compiler produces C code that can be compiled by any other C compiler.

The code is a pretty good example of how to structure medium-sized programs in functional programming languages.  I tried to adopt a pure functional style throughout most of the codebase.  Sometimes this failed:  cfg_ir.ml is unnecessarily ugly, and should have been re-written in an imperative style with mutability; also, I made the mistake of using a mutable array to describe Phi values in static single assignment whereas pure functional style would have dictated a list.  But those are my only complaints with the code; overall, I'm pretty pleased with how it all turned out.

This code is substantially more sophisticated than the compiler that I wrote to break VMProtect, so if you can read and understand this release, you should be in good shape for breaking virtualization obfuscators.


Blog Comments
RolfRolles Posted: Tuesday, January 5 2010 15:19.49 CST
Good point -- I have re-uploaded the package and included an "ir" directory with a collection of samples of C->IR translations.  Thanks.



Add New Comment
Comment:









There are 31,314 total registered users.


Recently Created Topics
[help] Unpacking VMP...
Mar/12
Reverse Engineering ...
Jul/06
hi!
Jul/01
let 'IDAPython' impo...
Sep/24
set 'IDAPython' as t...
Sep/24
GuessType return une...
Sep/20
About retrieving the...
Sep/07
How to find specific...
Aug/15
How to get data depe...
Jul/07
Identify RVA data in...
May/06


Recent Forum Posts
Finding the procedur...
rolEYder
Question about debbu...
rolEYder
Identify RVA data in...
sohlow
let 'IDAPython' impo...
sohlow
How to find specific...
hackgreti
Problem with ollydbg
sh3dow
How can I write olly...
sh3dow
New LoadMAP plugin v...
mefisto...
Intel pin in loaded ...
djnemo
OOP_RE tool available?
Bl4ckm4n


Recent Blog Entries
halsten
Mar/14
Breaking IonCUBE VM

oleavr
Oct/24
Anatomy of a code tracer

hasherezade
Sep/24
IAT Patcher - new tool for ...

oleavr
Aug/27
CryptoShark: code tracer ba...

oleavr
Jun/25
Build a debugger in 5 minutes

More ...


Recent Blog Comments
nieo on:
Mar/22
IAT Patcher - new tool for ...

djnemo on:
Nov/17
Kernel debugger vs user mod...

acel on:
Nov/14
Kernel debugger vs user mod...

pedram on:
Dec/21
frida.github.io: scriptable...

capadleman on:
Jun/19
Using NtCreateThreadEx for ...

More ...


Imagery
SoySauce Blueprint
Jun 6, 2008

[+] expand

View Gallery (11) / Submit