I've found that a lot of Windows binaries I disassemble contain long runs of 0x90909090 between functions, and sometimes 0xCCCCCCCC. (I know that 0x90 is NOP, but why CC?)
It would be desirable to automatically convert these blocks to ALIGN directives, and to recognize that a new object (function or data) usually immediately follows.
In data sections, I also see big blocks of zeroes that could be made into ALIGN blocks. For many of these situations, IDA is smart enough to "do the right thing" when I go to the start of the "align block" and hit L. This gets tiresome when analyzing a big binary. Is there a good way to automate this?
Here's the best I've come up with, and it doesn't always work. Please feel free to use it, and please give me suggestions for improvement. Thanks!
Ben
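    from idaapi import *
    from idc import *

    # Scan the segment under the cursor for unexplored runs of 0x90 or 0xCC.
    seg = getseg(get_screen_ea())
    ea = seg.startEA
    while ea < seg.endEA:
        # Jump to the next unexplored byte (flag 1 = search down).
        ea = find_unknown(ea, 1)
        if ea == BADADDR or ea >= seg.endEA:
            break  # nothing unexplored left in this segment
        if Byte(ea) == 0x90 or Byte(ea) == 0xCC:
            start_addr = ea
            start_byte = Byte(ea)
            # Extend the run while the same filler byte repeats.
            end_addr = start_addr + 1
            while end_addr < seg.endEA and Byte(end_addr) == start_byte:
                end_addr = end_addr + 1
            print "Found a range of %x at %x, len = %x\n" % (start_byte, start_addr, end_addr - start_addr)
            # An alignment of 0 asks IDA to work out the alignment itself.
            doAlign(start_addr, end_addr - start_addr, 0)
            ea = end_addr
            # A new object usually follows the padding; try it as code.
            if isUnknown(getFlags(ea)):
                auto_make_code(ea)

This has a few flaws: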
* It doesn't handle blocks of 0 inside data segments.
* It only finds align blocks inside unexplored parts of the code. If IDA has already converted my function into an array of dwords, all I can do is run another script to undefine all the data arrays in the program.
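To make both flaws concrete, here is a rough sketch of the detection step on its own, written as plain Python over a raw byte buffer rather than against the IDA database. `find_padding_runs` and `alignment_of` are made-up helper names, not IDA APIs, and the 16-byte cap and minimum run length are assumptions. The idea is to look for runs of a single filler byte (including 0x00) no matter how the bytes are currently typed, and then check how well the address right after the run is aligned:

```python
PAD_BYTES = (0x00, 0x90, 0xCC)  # zero fill, NOP, and the 0xCC filler


def find_padding_runs(data, base=0, min_len=2, pad_bytes=PAD_BYTES):
    """Return (start_address, length, fill_byte) for each run of one
    repeated padding byte that is at least min_len bytes long."""
    data = bytearray(data)  # indexing yields ints on Python 2 and 3
    runs = []
    i = 0
    n = len(data)
    while i < n:
        b = data[i]
        if b in pad_bytes:
            j = i + 1
            while j < n and data[j] == b:
                j += 1
            if j - i >= min_len:
                runs.append((base + i, j - i, b))
            i = j  # continue after the run
        else:
            i += 1
    return runs


def alignment_of(addr, max_align=16):
    """Largest power of two, capped at max_align, that divides addr."""
    align = 1
    while align < max_align and addr % (align * 2) == 0:
        align *= 2
    return align


if __name__ == "__main__":
    # Hypothetical bytes: two tiny functions separated by filler.
    sample = (b"\x55\x8b\xec\xc3" + b"\xcc" * 12 +
              b"\x55\xc3" + b"\x90" * 14 + b"\xc3")
    for start, length, fill in find_padding_runs(sample, base=0x401000):
        end = start + length
        print("fill %02x at %x, len %x, next object aligned to %d"
              % (fill, start, length, alignment_of(end)))
```

A run whose end address is aligned to 4 or more looks like a plausible ALIGN candidate. Inside IDA the bytes would still have to be read from the database, and each candidate range undefined if necessary before being passed to doAlign.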
There has to be an easier way to do this, right?
Ben






