Another invalid opcode representation
trufae <trufaegmailcom> Thursday, November 20 2008 08:46.24 CST


After the release of radare 1.0 somebody reported a bug in the disassembler, so we investigated and found that udis86 represents

  83 e4 f0 as  "and esp,0xf0"

while other disassemblers (olly, gnu objdump, ...) represent it as:

  "and esp, 0xfffffff0"

The problem is not directly related to udis86, because it is a misrepresentation of what Intel really does at low level with this instruction.

The specs say that the 83 opcode should affect only the lowest byte of the register selected by the second byte. If this operation is performed against EAX we can properly represent the instruction as "AND AL, 0xF0", but neither EBP nor ESP has a name for its low byte.

The funny thing is that the CPU is able to do this even though Intel syntax gives no way to name that part of the register, so I understand this to be a bug in the representation used by all the disassemblers.

Both are correct to me because they act in the same way. The olly/objdump form is maybe more correct, but it does not match the reality of the instruction.

Comments
Dreg Posted: Thursday, November 20 2008 09:02.46 CST
In my opinion, representing a one-byte operand with four bytes is confusing.

igorsk Posted: Thursday, November 20 2008 11:20.33 CST
I'm not sure what doc you're quoting, but in the latest instruction set reference this opcode is listed like this:
83 /4 ib AND r/m16, imm8   r/m16 AND imm8 (sign-extended).
83 /4 ib AND r/m32, imm8   r/m32 AND imm8 (sign-extended).

As you can see, no r8 in sight.
Also, what if imm8 is less than 128 (e.g. 83 E0 0F or  and eax, 0Fh)? By your claim, it would still affect only the low byte of the register, but I don't think that's what happens.
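
The sign extension in that listing can be sketched in a few lines of Python. This is only an illustration, assuming a 32-bit operand size by default; `sign_extend_imm8` is a hypothetical helper name, not a function from udis86 or any other disassembler.

```python
def sign_extend_imm8(imm8: int, bits: int = 32) -> int:
    """Sign-extend an 8-bit immediate to `bits` bits (hypothetical helper)."""
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("imm8 must fit in one byte")
    # If bit 7 is set the byte is negative: fill the upper bits with ones.
    if imm8 & 0x80:
        return imm8 | (((1 << bits) - 1) & ~0xFF)
    # Otherwise the upper bits are zero.
    return imm8


# 83 e4 f0: imm8 = 0xf0 has bit 7 set, so the upper 24 bits become ones.
print(hex(sign_extend_imm8(0xF0)))  # 0xfffffff0
# 83 e0 0f: imm8 = 0x0f has bit 7 clear, so the upper 24 bits are zero.
print(hex(sign_extend_imm8(0x0F)))  # 0xf
```

Under this reading, "and esp, 0xfffffff0" shows the mask the CPU actually applies, while "and esp,0xf0" shows only the encoded byte.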

PeterFerrie Posted: Thursday, November 20 2008 14:27.32 CST
igorsk is correct - the value is sign-extended over the entire register, not just affecting the low byte.  Thus, in his example, a value less than 128 will _zero_ the upper portion of the specified register.
eax=12345678
83 e0 08 (and eax, +08)
yields eax=00000008
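
The example above can be reproduced with a small emulation sketch in Python. It handles only the register-direct 32-bit form of 83 /4 ib; the function name `and_rm32_imm8` and the starting esp value are assumptions made for illustration, not taken from any real tool.

```python
# 32-bit register names in ModRM r/m encoding order.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def and_rm32_imm8(code: bytes, regs: dict) -> str:
    """Emulate `83 /4 ib` (AND r/m32, imm8), register-direct form only."""
    opcode, modrm, imm8 = code  # exactly three bytes expected
    mod, ext, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x83 or ext != 4 or mod != 3:
        raise ValueError("this sketch handles only register-direct 83 /4 ib")
    # Sign-extend the immediate byte to 32 bits before the AND.
    imm32 = (imm8 | 0xFFFFFF00) if (imm8 & 0x80) else imm8
    name = REG32[rm]
    regs[name] &= imm32
    return name


# eax comes from the example above; the esp value is an arbitrary assumption.
regs = {"eax": 0x12345678, "esp": 0xBFFFF7CC}

and_rm32_imm8(bytes.fromhex("83e008"), regs)
print(f"eax={regs['eax']:08x}")  # eax=00000008

and_rm32_imm8(bytes.fromhex("83e4f0"), regs)
print(f"esp={regs['esp']:08x}")  # esp=bffff7c0
```

With a positive immediate the upper three bytes of eax are cleared; with 0xf0 the upper bytes of esp survive only because the sign-extended mask has them set.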