Array Indexing Quirk
RolfRolles <rolfrollesgmailcom> Wednesday, February 13 2008 18:02.38 CST


.text:10002D49 mov     eax, [esp+arg_0]
.text:10002D4D lea     ecx, [eax-9C40h]
.text:10002D53 cmp     ecx, 50h
.text:10002D56 ja      short loc_10002D60
.text:10002D58 mov     eax, dword ptr ds:(loc_1000EF5B+1)[eax*8]
.text:10002D5F retn
.text:10002D60
.text:10002D60 loc_10002D60:
.text:10002D60 lea     edx, [eax-0A029h]
.text:10002D66 cmp     edx, 9
.text:10002D69 ja      short loc_10002D73
.text:10002D6B mov     eax, dword ptr ds:loc_1000D344[eax*8]
.text:10002D72 retn


There are no arrays at the locations referenced by the instructions at 10002D58 and 10002D6B. Both references point into the middle of code, which is unusual:

.text:1000EF57 movzx   eax, word ptr [esi+18h]
.text:1000EF5B
.text:1000EF5B loc_1000EF5B:                           ; DATA XREF: 10002D58
.text:1000EF5B add     dword_10065280, eax
.text:1000EF61 xor     eax, eax
.text:1000EF63 pop     esi
.text:1000EF64 mov     esp, ebp
.text:1000EF66 pop     ebp

.text:1000D342 mov     esp, ebp
.text:1000D344
.text:1000D344 loc_1000D344:                           ; DATA XREF: 10002D6B
.text:1000D344 pop     ebp


Looking closer at the code, the trick lies in the fact that the arrays are not being indexed starting at zero.

.text:10002D58 mov     eax, dword ptr ds:(loc_1000EF5B+1)[eax*8] ; <- 0x9C40 <= eax <= 0x9C90
.text:10002D6B mov     eax, dword ptr ds:loc_1000D344[eax*8] ; <- 0xA029 <= eax <= 0xA032
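The bounds follow from the lea/cmp/ja sequences: subtracting the lower bound and then doing a single unsigned comparison rejects values below the range (they wrap around to huge unsigned numbers) as well as values above it. Since ja only branches when strictly above, the upper bound is inclusive. Here is a minimal Python sketch of that check; the helper name in_range is mine for illustration and does not come from the binary.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit register wraparound

def in_range(eax, low, span):
    """Model `lea r, [eax-low]; cmp r, span; ja fail` as an unsigned test."""
    return ((eax - low) & MASK32) <= span

# Enumerate which index values pass each of the two checks.
first = [v for v in range(0x9C00, 0x9D00) if in_range(v, 0x9C40, 0x50)]
second = [v for v in range(0xA000, 0xA100) if in_range(v, 0xA029, 0x9)]

print(hex(first[0]), hex(first[-1]), len(first))     # 0x9c40 0x9c90 81
print(hex(second[0]), hex(second[-1]), len(second))  # 0xa029 0xa032 10
```

So the first table has 0x51 eight-byte entries and the second has 0xA.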


So the first array actually begins at 0x1000EF5B+1+0x9C40*8 == 0x1005D15C, and the second array begins at 0x1000D344+0x0A029*8 == 0x1005D48C.  What happened here is that the pointer expression has been simplified to conform to x86's instruction encoding, with the constant subtraction of the lower bound folded into the displacement:

[1005D15Ch + (eax - 9C40h) * 8] => [1005D15Ch - 4E200h + eax*8] => [1000EF5Ch + eax*8]
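The arithmetic is easy to double-check. This is a small Python sketch, not anything taken from the binary; the function names folded_displacement and recovered_base are hypothetical helpers. It recovers both array bases from the displacements in the disassembly and confirms that the folded and unfolded forms address the same memory for every valid index.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits, as the CPU does

def folded_displacement(array_base, first_index, scale):
    """Displacement left over after folding the lower bound into the address."""
    return (array_base - first_index * scale) & MASK32

def recovered_base(displacement, first_index, scale):
    """Undo the folding: where the array really starts."""
    return (displacement + first_index * scale) & MASK32

base1 = recovered_base(0x1000EF5B + 1, 0x9C40, 8)
base2 = recovered_base(0x1000D344, 0xA029, 8)
print(hex(base1), hex(base2))  # 0x1005d15c 0x1005d48c

# Both forms yield the same effective address across the first range.
for eax in range(0x9C40, 0x9C90 + 1):
    assert base1 + (eax - 0x9C40) * 8 == 0x1000EF5C + eax * 8
```

When reversing, the practical upshot is to add lower_bound * scale to the displacement before looking for the table, rather than trusting the cross-reference the disassembler generates.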

This is pretty uncommon; I've only seen it a handful of times in my reversing endeavors over the years.
