Created: Tuesday, January 22 2008 16:20.30 CST  
Compiler Optimizations Regarding Structures
Author: RolfRolles

Here are some optimizations that I have seen MSVC apply to structure references.  I wish I could give you the real names for these optimizations, but I can't find them in any of my compiler textbooks.  I have a feeling that they're buried somewhere inside Randy Allen and Ken Kennedy's incredibly dense tome, "Optimizing Compilers for Modern Architectures".  If anybody knows the real names for these transformations, please speak up.

#1:  Let's say we are accessing multiple entries in a structure that's larger than 80h bytes.  As stated in the previous entry, each access to a member situated at offset >= 80h is going to require a dword displacement in the instruction encoding if we generate the "naive" code, because a one-byte displacement is signed and only reaches -80h through +7Fh.  If we instead do:

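lea esi, [esi+middle_of_structure_somewhere]   ; rebase esi into the middle of the structure

mov eax, [esi-(middle_of_structure_somewhere - member_offset1)]   ; member below the new base
mov ebx, [esi+(member_offset2 - middle_of_structure_somewhere)]   ; member above the new base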

We can access more of the structure with the one-byte instruction encoding (that is, a one-byte displacement), provided each of those differences fits in a signed byte, -80h through +7Fh.  The compiler chooses middle_of_structure_somewhere specifically to maximize the number of one-byte references.  This is the same idea behind the "frame pointer delta" stack-frame optimization.
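To make the choice of middle_of_structure_somewhere concrete, here is a rough Python sketch of one way such a placement could be computed. It is not a claim about how MSVC actually picks the value; the function names and the member offsets in the example are hypothetical. It tries a handful of candidate bases and counts how many accesses land within the signed-byte range of each, ignoring the one-time cost of the lea.

```python
# Sketch only: function names and sample offsets are hypothetical.
DISP8_MIN, DISP8_MAX = -0x80, 0x7F  # range of a signed one-byte displacement


def fits_disp8(delta):
    """True if delta can be encoded as a one-byte displacement."""
    return DISP8_MIN <= delta <= DISP8_MAX


def best_base(member_offsets):
    """Return (base, count): the rebased pointer offset that lets the most
    accesses use a one-byte displacement, and how many accesses that is.

    member_offsets has one entry per access, so an offset that is
    referenced several times should appear several times.
    """
    # Base 0 means "do not rebase". Any other optimal window of reach
    # [base-80h, base+7Fh] can be slid right until its low edge sits on a
    # member, so base = offset + 80h covers all the interesting cases.
    candidates = [0] + [off + 0x80 for off in sorted(set(member_offsets))]
    best_offset, best_count = 0, -1
    for base in candidates:
        count = sum(1 for m in member_offsets if fits_disp8(m - base))
        if count > best_count:  # strict: ties keep the earlier candidate
            best_offset, best_count = base, count
    return best_offset, best_count


offsets = [0x10, 0x90, 0xA0, 0x100, 0x120]
base, count = best_base(offsets)
print(hex(base), count)  # 0x90 4
```

With these made-up offsets the naive code (base 0) reaches only the member at 10h with a one-byte displacement, while rebasing to 90h reaches four of the five.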

#2:  Let's say we have a loop that accesses two arrays of structures inside of another structure, one array beginning at +1234h, the other beginning at +2234h.  If we emit the "naive" code:

; ecx = loop induction variable
imul ebx, ecx, sizeof(structure1)
imul edx, ecx, sizeof(structure2)

mov eax, [esi+1234h+ebx+offset_of_member1]
mov edi, [esi+2234h+edx+offset_of_member2]

Then obviously both of these references are going to require a dword displacement in the instruction encoding, one for 1234h+offset_of_member1 and one for 2234h+offset_of_member2.  If we instead do:

lea esi, [esi+1234h]   ; esi now points at the first array

; ecx = loop induction variable
imul ebx, ecx, sizeof(structure1)
imul edx, ecx, sizeof(structure2)

mov eax, [esi+ebx+offset_of_member1]
mov edi, [esi+1000h+edx+offset_of_member2]

Then if offset_of_member1 fits in a signed byte, it's only going to require a one-byte displacement in the instruction encoding, thus saving three bytes per reference to the first structure (we can combine this with the previous optimization to place esi such that the number of one-byte references is maximized).  Alternatively, if more members of the second structure are accessed than of the first, we'll see:

lea esi, [esi+2234h]   ; esi now points at the second array

; ecx = loop induction variable
imul ebx, ecx, sizeof(structure1)
imul edx, ecx, sizeof(structure2)

mov eax, [esi+ebx+offset_of_member1-1000h]
mov edi, [esi+edx+offset_of_member2]

Once again, the first optimization can also be applied here to choose the placement of esi that maximizes the number of single-byte references.  The multiplications given in the second optimization can also be optimized away into additions (strength reduction on the loop induction variable).
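The trade-off between the two placements of esi can be tallied with a short Python sketch. This is only a model of displacement sizes for 32-bit addressing with esi as the base register, not MSVC's actual heuristic; the function names and the member offsets in the example are hypothetical, and the one-time cost of the lea is ignored. The scaled index register does not affect the displacement size, so it is left out.

```python
# Sketch only: function names and sample member offsets are hypothetical.
ARRAY1, ARRAY2 = 0x1234, 0x2234  # array offsets taken from the example above


def disp_size(disp):
    """Displacement bytes for a 32-bit memory operand whose base is esi."""
    if disp == 0:
        return 0  # no displacement byte needed (this would differ for ebp)
    if -0x80 <= disp <= 0x7F:
        return 1  # signed one-byte displacement
    return 4      # full dword displacement


def total_disp_bytes(base, refs):
    """Sum the displacement bytes over all references, given that esi has
    been advanced by `base` from the start of the outer structure."""
    return sum(disp_size(off - base) for off in refs)


def choose_base(refs1, refs2):
    """refs1/refs2: member offsets referenced in structure1/structure2,
    one entry per reference. Returns (best_base, cost_per_base)."""
    all_refs = [ARRAY1 + m for m in refs1] + [ARRAY2 + m for m in refs2]
    costs = {b: total_disp_bytes(b, all_refs) for b in (0, ARRAY1, ARRAY2)}
    return min(costs, key=costs.get), costs


# Two references into the first array, three into the second.
base, costs = choose_base([4, 8], [0, 4, 8])
print(hex(base))  # 0x2234
```

For this made-up mix the naive code spends 20 bytes on displacements, basing esi at +1234h spends 14, and basing it at +2234h spends 10, which matches the rule of thumb above: point esi at whichever array is referenced more often.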


Blog Comments
c1de0x Posted: Wednesday, January 23 2008 15:11.31 CST
Rolf: Now we're talking!

Just one thing... you say "one byte instruction" when I think you mean "one byte operand". Not a big deal but it can get confusing.

Another place you see negative 'structure' offsets, although not strictly speaking an optimization, is in classes with multiple inheritance. In this case, you'll sometimes have virtual functions receiving pointers to 'this' and using a negative offset to cast up the inheritance hierarchy.

Again, this kind of thing can probably be detected and 'noted' automatically.


