David Reguera Garcia (Dreg) <Dreg fr33project org> |
Monday, January 11 2010 15:03.25 CST |
Hello OpenRCE. Today I will talk about X86IME, an open-source x86 and x86_64 (32/64-bit) disassembler/assembler engine written by my friend Pluf.
The engine:
The engine is built around an intermediate object called x86im_instr_object. With this object you can generate instructions, use the disassembly like an LDE (length disassembler engine), or view it directly in Intel syntax:
typedef struct _x86im_instr_object          // x86 decoded/generated instruction:
{
    unsigned long mode;                     // mode: 32/64bits
    unsigned long flags;                    // instr flags
    unsigned long id;                       // instr id
    unsigned long grp;                      // instr grp & subgrp
    unsigned long mnm;                      // instr mnemonic
    unsigned long len;                      // total instr length
    unsigned char def_opsz;                 // default operand size: 1/2/4/8
    unsigned char def_adsz;                 // default address size: 16bit = 2 | 32bit = 4 | 64bit = 8
    unsigned char opcode[3];                // instr opcodes: up to 3
    unsigned char opcode_count;             // instr opcode count
    unsigned short prefix;                  // instr prefixes ( mask )
    unsigned char prefix_values[4];         // prefixes
    unsigned char prefix_count;             // instr prefix count
    unsigned long prefix_order;             // instr prefix order
    unsigned char rexp;                     // REX prefix
    unsigned char somimp;                   // mandatory prefix: SOMI instr only: 0x66|0xF2|0xF3
    unsigned char n3did;                    // 3dnow instr id
    unsigned char seg;                      // implicit segment register used by mem operands:
    unsigned char w_bit;                    // wide bit value: 0/1 - if IF_WBIT
    unsigned char s_bit;                    // sign-extend bit value: 0/1 - if IF_SBIT
    unsigned char d_bit;                    // direction bit value: 0/1 - if IF_DBIT
    unsigned char gg_fld;                   // granularity field value: 0-2 ( mmx ) - if IF_GGFLD
    unsigned char tttn_fld;                 // condition test field value: if IF_TTTN
    unsigned short selector;                // explicit segment selector used by CALL/JMP far: IF_SEL
    unsigned long imm_size;                 // imm size: 0 | (1/2/4/8)
    unsigned long long imm;                 // imm value: 64bit max value ( if imm_size != 0 )
    unsigned long disp_size;                // disp size: 0 | (1/2/4/8)
    unsigned long long disp;                // disp value: 64bit max value ( if disp_size != 0 )
    unsigned char mem_flags;                // mem flags: src/dst/..
    unsigned short mem_am;                  // addressing mode
    unsigned short mem_size;                // operand size ( xxx ptr )
    unsigned char mem_base;                 // base reg : grp+id
    unsigned char mem_index;                // index reg: grp+id
    unsigned char mem_scale;                // scale reg: grp+id
    unsigned char modrm;                    // modrm byte value & fields: if IF_MODRM
    unsigned char sib;                      // sib byte value & fields: if IF_SIB
    unsigned long rop[4];                   // imp/exp reg op array
    unsigned char rop_count;                // imp/exp reg op count
    unsigned int status;
    void *data;
} x86im_instr_object;
To disassemble an instruction, x86im_dec takes three parameters: the x86im_instr_object to fill in, the mode (X86IM_IO_MODE_32BIT or X86IM_IO_MODE_64BIT), and data, the buffer that holds the instruction bytes.
int __stdcall x86im_dec( __inout x86im_instr_object *io,
                         __in    unsigned long mode,
                         __in    unsigned char *data );
Example: disassembling a POP EAX instruction:
    x86im_instr_object io;
    /* POP EAX opcode; the cast matches the unsigned char * parameter */
    unsigned char *d = (unsigned char *)"\x58";
    x86im_dec( &io,
               X86IM_IO_MODE_32BIT,
               d );
You can access the Intel-syntax string through io.data.
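To see why the single byte 0x58 is enough for POP EAX: in 32-bit mode the one-byte POP r32 encoding is 0x58 plus the register number, so the low three bits of the opcode select the register. The following is only a hand-rolled sketch of that one case; pop_r32_name and reg32_names are hypothetical names that are not part of X86IME, and the engine's own decoder is of course far more general.

```c
#include <stddef.h>

/* Register numbers 0-7 as used in 32-bit x86 instruction encodings. */
static const char *const reg32_names[8] = {
    "EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"
};

/*
 * Hypothetical helper, not part of X86IME: if opcode is a one-byte
 * POP r32 (0x58 through 0x5F), return the register name, else NULL.
 */
static const char *pop_r32_name(unsigned char opcode)
{
    /* The upper five bits must match 0x58; the low three pick the register. */
    if ((opcode & 0xF8) != 0x58)
        return NULL;
    return reg32_names[opcode & 0x07];
}
```

Calling pop_r32_name(0x58) yields "EAX", while a byte outside 0x58..0x5F (0x50, for instance, which is a PUSH) yields NULL.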
Generating an instruction takes two steps. First, build a valid instruction object from the instruction code and its reg/mem/disp/imm operands:
int __stdcall x86im_gen( __inout x86im_instr_object *io,
                         __in    unsigned long options,
                         __in    unsigned long code,
                         __in    unsigned long reg,
                         __in    unsigned long mem,
                         __in    unsigned long long disp,
                         __in    unsigned long long imm );
Example of the generation of a POP EAX instruction:
    x86im_instr_object io;
    x86im_gen( &io,
               X86IM_IO_MODE_32BIT|X86IM_GEN_OAT_NPO_D,
               X86IM_GEN_CODE_POP_RG1,
               X86IM_IO_ROP_ID_EAX, 0, 0, 0 );
The headers contain many very useful macros, such as X86IM_GEN_CODE_POP_RG1, and test macros such as X86IM_IO_IS_GPI_ADC(x), which checks ( ( (x)->id & 0xFFF0 ) == 0x0060 ). With these macros the code is very intuitive and you do not need hardcoded values with lots of comments... IMHO, of course.
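The mask-and-compare pattern behind X86IM_IO_IS_GPI_ADC can be tried without the library. This is a minimal sketch: demo_instr is a hypothetical stand-in that models only the id field of x86im_instr_object, and DEMO_IS_GPI_ADC copies the expression quoted above under a made-up name so it cannot be mistaken for the real header.

```c
/* Hypothetical stand-in for x86im_instr_object: only the id field is modeled. */
typedef struct {
    unsigned long id;
} demo_instr;

/* Same expression the post quotes for X86IM_IO_IS_GPI_ADC, under a demo name. */
#define DEMO_IS_GPI_ADC(x) ( ( (x)->id & 0xFFF0 ) == 0x0060 )

/* Function wrapper so the check can be called like any other predicate. */
static int demo_is_adc(const demo_instr *io)
{
    return DEMO_IS_GPI_ADC(io);
}
```

Because the low four bits are masked off, every id from 0x0060 to 0x006F passes the check, and an id such as 0x0070 does not. What those low bits mean inside X86IME is not covered here.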
The next step is to encode the instruction with the x86im_enc interface:
int __stdcall x86im_enc( __inout x86im_instr_object *io,
                         __out   unsigned char *data );
This function writes the real instruction bytes into the data buffer. To get the raw bytes of the POP EAX instruction that x86im_gen generated in io:
    x86im_instr_object io;   /* filled in by x86im_gen, as in the previous example */
    unsigned char data[1];   /* POP EAX encodes to a single byte */
    x86im_enc( &io, data );
Now you can dump the raw instruction stored in data wherever you like.
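One simple way to dump the encoded bytes is as a hex string. The helper below is only a sketch and is not part of X86IME: format_hex_bytes is a hypothetical name, and it works on any byte buffer, such as the data array filled by x86im_enc together with the length you expect for the instruction.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format len bytes from buf as space-separated uppercase hex pairs
 * (for example "03 C3") into out. Returns the number of characters
 * written, excluding the terminating NUL, or -1 if out is too small.
 */
static int format_hex_bytes(const unsigned char *buf, size_t len,
                            char *out, size_t out_size)
{
    size_t pos = 0;
    size_t i;

    if (out_size == 0)
        return -1;
    out[0] = '\0';

    for (i = 0; i < len; i++) {
        /* Two hex digits per byte, plus one separator after the first byte. */
        size_t need = (i == 0) ? 2 : 3;
        if (pos + need + 1 > out_size)
            return -1;
        if (i > 0)
            out[pos++] = ' ';
        snprintf(out + pos, out_size - pos, "%02X", buf[i]);
        pos += 2;
    }
    return (int)pos;
}
```

For the one-byte POP EAX buffer this produces the string "58"; the result can then be printed or logged as needed.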
With this powerful engine you can also generate the redundant encodings of the same instruction. Take the ADD instruction, for example:
Raw instruction: 03 C3
INTEL representation: ADD EAX, EBX
Mod:11, reg:000 and r/m:011
The same instruction can also be encoded as 01 D8: Mod:11, reg:011 and r/m:000.
You can generate any of these redundant encodings using the macros, without hardcoded values.
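The two ADD encodings above can be checked by hand with a few lines of C. This is an illustrative sketch, independent of X86IME: split_modrm and add_reg_reg_operands are hypothetical helpers that only understand opcodes 01 and 03 in their register-to-register form (Mod:11), using the direction bit of the opcode to decide which ModRM field is the destination.

```c
#include <stdint.h>

/* Fields of a ModRM byte: mod (bits 7-6), reg (bits 5-3), r/m (bits 2-0). */
typedef struct {
    uint8_t mod;
    uint8_t reg;
    uint8_t rm;
} modrm_fields;

/* Split a ModRM byte into its three fields. */
static modrm_fields split_modrm(uint8_t modrm)
{
    modrm_fields f;
    f.mod = (uint8_t)((modrm >> 6) & 0x3);
    f.reg = (uint8_t)((modrm >> 3) & 0x7);
    f.rm  = (uint8_t)(modrm & 0x7);
    return f;
}

/*
 * For the register-to-register forms of ADD (opcodes 01 and 03), report
 * the destination and source register numbers (EAX = 0, EBX = 3).
 * Bit 1 of the opcode is the direction bit: when set, reg is the
 * destination; when clear, r/m is the destination.
 * Returns 0 on success, -1 if the bytes are not one of those two forms.
 */
static int add_reg_reg_operands(const uint8_t insn[2],
                                uint8_t *dst, uint8_t *src)
{
    modrm_fields f = split_modrm(insn[1]);

    if ((insn[0] != 0x01 && insn[0] != 0x03) || f.mod != 0x3)
        return -1;

    if (insn[0] & 0x02) {
        *dst = f.reg;
        *src = f.rm;
    } else {
        *dst = f.rm;
        *src = f.reg;
    }
    return 0;
}
```

Feeding it 03 C3 and 01 D8 gives destination 0 (EAX) and source 3 (EBX) in both cases, which is exactly the redundancy described above: the direction bit and the swapped reg and r/m fields cancel each other out.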
Download X86IME v1.0: http://sites.google.com/site/x86pfxlab/projects
The examples shipped with the engine are very useful:
instdec: sample instruction disassembler.
instgen: sample instruction generator; in the output you can see the code redundancy, etc.
Patch to compile in UNIX by nibble: http://nibble.develsec.org/get/x86im-1.0b.tar.gz
CFLAGS for Windows:
CFLAGS+=-D__WINDOWS__=1
CFLAGS for UNIX:
CFLAGS+=-D__UNIX__=1
There is a UNIX version, with makefiles, of each Windows sample.
Sincerely, Dreg.
|
Thanks for your post! The X86IME engine looks very interesting. I'll take a look ;> |
|
Post updated: added the UNIX stuff. |
|
Thanks Dreg! Nice post :) |
|