
Created: Tuesday, March 6 2012 04:07.05 CST Modified: Tuesday, March 13 2012 04:55.36 CDT
[video] Semi-Automated Input Crafting by Symbolic Execution, with an Application to Automatic Key Generator Generation
Author: RolfRolles

The problem of input crafting can be stated as follows:  given the ability to influence some aspect of a program's execution (say, by supplying it with a hand-crafted file, network input, sequence of keystrokes, etc), we want to know whether it is possible to cause the program to reach some state, for example, a condition where the integer argument to some memory allocation function is zero, a specific sequence of branches being taken causing EIP to obtain a certain value, an array dereference whose index is not within its bounds, etc.  Much work in reverse engineering for vulnerability analysis and software cracking reduces to this problem; some malware analysis problems can also be stated in such terms.

It is well-known in the formal verification literature that symbolic execution, a method for reasoning about program executions based upon formal semantics and theorem proving, may be applied towards this problem.  However, actual tools that can be used in a friendly fashion have not yet materialized in the hands of the common, working reverse engineer.  This blog entry demonstrates a prototype of a tool of this nature.

Crafting inputs with manually-generated SMT instances

We take as our example program a crackme called Kao's Toy Project, which recently attracted the scrutiny of Dcoder and andrewl, two clever computer-hacking mathematician types.  Note:  I would have used a vulnerability analysis example, since it might be more compelling to a general audience.  However, the notion of producing a reverse engineering video involving commercial software is not a palatable one for legal reasons, and therefore I chose a program that was deliberately designed to be reverse engineered.

The crackme implements the following scheme.  It presents the user with a 32-byte activation code, and the user enters a hexadecimal string such as "01234567-76543210".  That string is decomposed into two dwords, one of which is XORed with the other, and the following compact loop is executed:

.text:004010F7 mov     esi, offset a_ActivationCode
.text:004010FC lea     edi, [ebp+Output]
.text:004010FF mov     edx, [ebp+arg_0__SerialDwLow]
.text:00401102 mov     ebx, [ebp+arg_4__SerialDwHigh]
.text:00401105 compute_output:
.text:00401105 lodsb
.text:00401106 sub     al, bl
.text:00401108 xor     al, dl
.text:0040110A stosb
.text:0040110B rol     edx, 1
.text:0040110D rol     ebx, 1
.text:0040110F loop    compute_output
.text:00401111 mov     byte ptr [edi], 0
.text:00401114 push    offset String2                  ; "0how4zdy81jpe5xfu92kar6cgiq3lst7"
.text:00401119 lea     eax, [ebp+Output]
.text:0040111C push    eax                             ; lpString1
.text:0040111D call    lstrcmpA
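
For readers who prefer to experiment outside a debugger, the loop can be mirrored in a few lines of Python.  This is a sketch written for this post's explanation, not code taken from the crackme or from my tool.  The function names are made up, and the iteration count of 32 is an assumption:  the listing does not show where ECX is initialized, so the count is inferred from the 32-byte activation code.

```python
TARGET = b"0how4zdy81jpe5xfu92kar6cgiq3lst7"  # String2 in the listing


def rol32(x, n=1):
    """Rotate a 32-bit value left by n bits, as ROL does on a dword register."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def compute_output(activation, edx, ebx, count=32):
    """Emulate the compute_output loop.

    count stands in for ECX, whose initialization is not in the listing;
    32 is an assumption based on the 32-byte activation code.
    """
    out = bytearray()
    for i in range(count):
        al = activation[i]                 # lodsb
        al = (al - (ebx & 0xFF)) & 0xFF    # sub al, bl
        al ^= edx & 0xFF                   # xor al, dl
        out.append(al)                     # stosb
        edx = rol32(edx)                   # rol edx, 1
        ebx = rol32(ebx)                   # rol ebx, 1
    return bytes(out)


def serial_accepted(activation, edx, ebx):
    """True when the loop output matches the string passed to lstrcmpA."""
    return compute_output(activation, edx, ebx) == TARGET
```

Here edx and ebx are the register values at loop entry, i.e. after the XOR step mentioned above has already been applied to the two serial dwords.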

Given that the user's input consists of two dwords, the key space is 2^64.  Dcoder and andrewl offer an observation that halves the exponent to 2^32.  Dcoder further offers an algebraic cryptanalysis of the scheme that obliterates it.  andrewl chose to explore the route of modelling the scheme in terms of the Boolean Satisfiability (SAT) problem, and then feeding the result into a SAT solver.  The resulting SAT instance is solved more or less immediately.
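
The details of their observation are not reproduced here.  One reduction of that size can be read directly off the loop, though, and the sketch below checks it; whether it matches their argument exactly is an assumption on my part.  Bit 0 of a byte subtraction is just the XOR of the operands' low bits, so bit 0 of each output byte leaks one bit of EDX xor EBX.  Over 32 iterations of ROL, every bit position passes through bit 0, which fixes EDX xor EBX completely and leaves a single dword unknown.  The function names are hypothetical.

```python
def rol32(x, n):
    """Rotate a 32-bit value left by n bits."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def loop_output(activation, edx, ebx):
    """The crackme's loop: out[i] = (act[i] - bl) ^ dl, then ROL both registers."""
    out = bytearray()
    for a in activation:
        out.append(((a - (ebx & 0xFF)) & 0xFF) ^ (edx & 0xFF))
        edx, ebx = rol32(edx, 1), rol32(ebx, 1)
    return bytes(out)


def recover_xor(activation, output):
    """Recover EDX ^ EBX from bit 0 of each activation/output byte pair.

    No borrow enters the lowest bit of a subtraction, so bit 0 of output[i]
    equals bit 0 of activation[i] XOR bit k of EBX XOR bit k of EDX, where
    k = (32 - i) % 32 is the bit that i rotations have moved into position 0.
    """
    x = 0
    for i in range(32):
        bit = (activation[i] ^ output[i]) & 1
        x |= bit << ((32 - i) % 32)
    return x
```

With EDX xor EBX known, a search only has to enumerate one of the two registers, which is where a 2^32 bound would come from under this reading.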

We followed andrewl's line of reasoning and manually constructed a representative Satisfiability Modulo Theories (SMT) formula.  (To be technical, the formula is a sentence in a quantifier-free fragment of first-order logic restricted to the theories of equality, bitvectors, and extensional arrays.)  The formula does not explicitly take advantage of any aforementioned, known flaw in the cryptosystem, and instead encodes the algorithm literally.  We believe that the reader will find the resulting SMT formula very clear and easy to understand.  This is an advantage over a SAT instance, which is virtually incomprehensible and harder to debug.  The SMT instance is a bona-fide key generator, and furthermore the theorem prover is able to solve the formula in milliseconds (considerably faster than a brute-force search over a 64-bit keyspace).

We outline the SMT instance:

Line 1 sets the theory.
Lines 3-7 declare the variables of interest, as well as the array sort.
Lines 9-40 encode the serial algorithm.
Lines 42-73 state that the output array must equal the fixed value given in the crackme.
Lines 75-106 encode the activation code presented by one particular run of the crackme.  You can replace these lines with whatever your own activation code was, thereby creating a key generator.
Lines 108-110 encode the additional constraints where one of the serial dwords was XORed with the other.
Lines 112-113 query the decision procedure and output the results.

After installing Z3 (other theorem provers such as Yices are also suitable), one solves the formula as follows:

C:\Program Files (x86)\Microsoft Research\Z3-3.2\bin>z3 /m /smt2 \temp\kao.smt
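
To give a feel for what such an instance looks like, here is a small Python sketch that prints a comparable SMT-LIB 2 script.  It is not the formula described above and its line numbers do not correspond:  it is a simplification that uses only the bitvector theory (no arrays), treats the loop-entry values of EDX and EBX as the unknowns, and leaves out the XOR relationship between the serial dwords.  The function name make_smt2 and the 32-iteration count are assumptions made for illustration.

```python
TARGET = b"0how4zdy81jpe5xfu92kar6cgiq3lst7"  # fixed string from the crackme


def make_smt2(activation, target=TARGET):
    """Return an SMT-LIB 2 script whose model, if any, gives EDX and EBX."""
    assert len(activation) == len(target) == 32
    lines = [
        "(set-logic QF_BV)",
        "(declare-const edx (_ BitVec 32))",
        "(declare-const ebx (_ BitVec 32))",
    ]
    for i, (a, t) in enumerate(zip(activation, target)):
        # After i single-bit ROLs a register holds rotate_left(i) of its
        # initial value; DL and BL are the low 8 bits of that.
        dl = f"((_ extract 7 0) ((_ rotate_left {i}) edx))"
        bl = f"((_ extract 7 0) ((_ rotate_left {i}) ebx))"
        # One constraint per iteration: (activation[i] - bl) xor dl == target[i]
        lines.append(
            f"(assert (= (bvxor (bvsub #x{a:02x} {bl}) {dl}) #x{t:02x}))"
        )
    lines += ["(check-sat)", "(get-model)"]
    return "\n".join(lines) + "\n"
```

Writing the returned string to a file and passing it to a solver that reads SMT-LIB 2 should yield "sat" and a model for any activation code that has a valid serial.  Recent Z3 releases accept the file as z3 -smt2 <file>, rather than the Windows-style switches of Z3 3.2 shown above.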

Generating the constraint system automatically, directly from the binary

This is all well and good, but manually encoding algorithms as SMT instances is tedious business.  Instead, we seek a solution that can automatically generate the relevant constraint system directly from the x86 binary.  The current solution is a static analysis (i.e. it does not alter or observe the execution of the program).

The algorithm proceeds in three main phases:

1)  Trace generation.  Since the algorithm is static, the analyst manually specifies some parameters to the system:  namely, an address from which to start the analysis, a model of the initial values of the registers and memory, and some condition that dictates when the analysis should stop.  (Note that, if we were to reformulate our analysis in a dynamic setting, many of these considerations would be superfluous.)  The tool then statically emulates the program, generates a list of instructions encountered during the emulation, and converts them into an intermediate representation.

2)  Trace simplification.  We apply other static analyses against the trace to simplify it, which could potentially speed up the solving.  Analyses could include constant propagation, dead statement elimination, or the abstract interpretation that I published last year.  This step is actually optional in the sense that the simplifications preserve satisfiability.

3)  Constraint generation and solving.  With our trace in hand, we transform it into an SMT instance and feed the resulting equations into a theorem prover.  Additionally, the user supplies some postcondition that corresponds to the state that he or she wishes for the program to enter.  We then check satisfiability, and if the formula is indeed satisfiable, the theorem prover can furnish a model of the inputs that cause the desired state to be reached.
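
As an illustration of phase 2, the following Python sketch runs constant propagation and dead statement elimination over a toy trace.  The three-address tuple format, the operation names, and the variable names are invented for this example and are not my tool's IR; the trace is assumed to be in single-assignment form, with every destination written exactly once.

```python
MASK = 0xFFFFFFFF
OPS = {
    "add": lambda a, b: (a + b) & MASK,
    "sub": lambda a, b: (a - b) & MASK,
    "xor": lambda a, b: a ^ b,
}


def constant_fold(trace):
    """Substitute known constants into operands and fold constant statements."""
    known, out = {}, []
    for dst, op, args in trace:
        args = tuple(known.get(a, a) for a in args)
        if op == "const":
            known[dst] = args[0]
        elif all(isinstance(a, int) for a in args):
            known[dst] = OPS[op](*args)
            op, args = "const", (known[dst],)
        out.append((dst, op, args))
    return out


def eliminate_dead(trace, live_out):
    """Walk backwards, keeping only statements that feed the live-out set."""
    live, kept = set(live_out), []
    for dst, op, args in reversed(trace):
        if dst in live:
            live.discard(dst)
            live.update(a for a in args if isinstance(a, str))
            kept.append((dst, op, args))
    return kept[::-1]


trace = [
    ("t0", "const", (0x10,)),
    ("t1", "const", (0x20,)),
    ("t2", "add", ("t0", "t1")),    # folds to the constant 0x30
    ("t3", "xor", ("edx0", "t2")),  # depends on a symbolic input
    ("t4", "sub", ("t3", "ebx0")),  # never read by the postcondition
]
simplified = eliminate_dead(constant_fold(trace), live_out={"t3"})
```

Only the statement defining t3 survives, now with the folded constant as an operand.  Anything the postcondition mentions goes in live_out, so the smaller trace yields a constraint system with the same satisfying inputs and fewer terms for the solver to process.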

With all of that out of the way, here is a link to the actual video depicting the process.  I apologize if the video is tedious (I am tedious) and for the stupid manner in which I sound (I am stupid, and also I am from the American south).  The files referenced in the video, kao-1.ml through kao-3.ml, are linked to above in the description of the algorithm.

Key-generator generation

Given the construction laid out in the video referenced above, generation of a key generator is now trivial.  Instead of asserting in the postcondition that the activation code is fixed to whatever value was observed in the construction of the constraint system, we allow the user to enter his or her own activation code, which we then assert as the initial values of the activation code array.  We then solve the system as described in the video and part 3 of the code linked above, and provide the user with the proper registration code (EBX and EDX values).


We use static binary program analysis to semi-automatically produce a reasonably-efficient key generator for a weak cryptosystem.


An astute proofreader wondered why, during the video, when I solved the manually-constructed instance with Z3, it reported an error about a model not being available.  In fact, the formula solved correctly (as evidenced by the output saying "sat"), but I did not specify the "/m" flag on the command line due to it skewing the time statistics, which resulted in that message being produced.  The reader can manually verify that the formula is correct with the command line given previously.

Blog Comments
serpi Posted: Tuesday, March 6 2012 09:36.08 CST
It reminds me of "fuzzgrind" (https://www.sstic.org/2009/presentation/Fuzzgrind_un_outil_de_fuzzing_automatique/). It fuzzes programs using path discovery, resolving constraints with an SMT solver, using Valgrind and STP.

Erf, it's in French :/

*advertising* Miasm can also do symbolic execution for generating such constraints, but it lacks a translation engine between its intermediate language and SMT language.
the *simple* example  is here:

RolfRolles Posted: Tuesday, March 6 2012 16:44.05 CST
The difference with fuzzgrind is the semi-automation (e.g. the ability for the analyst to interact with the technology) and the fact that what I presented is a static analysis, whereas fuzzgrind is dynamic.  The difference with Miasm is that (at least) it does integrate with SMT solvers.

withzombies Posted: Thursday, March 8 2012 07:10.35 CST
In your video, you mentioned that your research is IL-based and you referenced David Brumley's group at CMU, but you never mention many details about your implementation.

BitBlaze uses Valgrind's VEX to "lift" the native code into an IL, then they translate VEX to their own VINE IL. My understanding is they did this due to the relative complexity of VEX and VEX's failure to model flag operations explicitly.

Does your implementation use VEX+VINE, or did you go another route?

RolfRolles Posted: Thursday, March 8 2012 12:40.55 CST
I use an IR very similar to the one used by BitBlaze and BAP, but I wrote my own IR translator rather than using VEX.  My implementation is written-from-scratch and shares no code with the open-source frameworks.

94c3 Posted: Monday, March 19 2012 20:21.51 CDT
Does your IR have any advantages over BAP?

RolfRolles Posted: Monday, March 19 2012 20:39.49 CDT
At present I would say "no", just minor differences.  Inspired by the performance differences between the manually-crafted instance and the automatically-generated one, and also this paper, I am in the process of re-architecting the IR and the translator to better suit SMT solvers and expose relational information explicitly.  (I came up with a cute trick for prototyping new IRs without altering the framework.)  But progress is slow; I only work on this project infrequently, as my research doesn't pay anything and begins to bore me.
