📚 OpenRCE is preserved as a read-only archive. Launched at RECon Montreal in 2005. Registration and posting are disabled.









 Forums >>  IDA Pro  >>  IDC script

Topic created on: May 23, 2008 02:44 CDT by petroleum .

This may be a trivial matter, but since I haven't yet delved into the wonders of IDC scripting, I figured asking here may be the way to go.

What I need is to output a file containing the first 0x400 bytes starting at the entry point of a given PE file.

This needs to be done 'en masse' for multiple files..

for example:
for %i in (directory\*) do IDA -A -SEP_bytes.idc %i >> output.txt

The idea is to traverse the whole of the directory, then load each PE file.. extract the first 0x400 bytes and write them to the output file.
(each input PE, would need its own output file, but that's a trivial matter)..

any suggestions?

Ty
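
The batch part of the question (walk a directory, give each input its own output file) can be sketched independently of how the bytes are extracted. This is a minimal sketch, not anyone's posted solution: the helper names `looks_like_pe` and `plan_outputs` and the `.ep.bin` suffix are assumptions made up for illustration, and the "MZ" check is only a cheap pre-filter, not a real PE validation.

```python
import os

def looks_like_pe(path):
    """Cheap pre-filter: True if the file starts with the DOS 'MZ' magic."""
    try:
        with open(path, "rb") as fh:
            return fh.read(2) == b"MZ"
    except OSError:
        return False

def plan_outputs(directory, suffix=".ep.bin"):
    """Return sorted (input_path, output_path) pairs, one per candidate PE.

    The suffix is a hypothetical naming choice; earlier dumps carrying the
    same suffix are skipped so a rerun does not process its own output.
    """
    pairs = []
    for name in sorted(os.listdir(directory)):
        src = os.path.join(directory, name)
        if name.endswith(suffix) or not os.path.isfile(src):
            continue
        if looks_like_pe(src):
            pairs.append((src, src + suffix))
    return pairs
```

Each pair can then be handed to whichever extractor is chosen (an IDC script, pefile, or a hand-written parser), which keeps the directory traversal separate from the PE handling.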

  ero     May 23, 2008 04:08.21 CDT
Using IDA for that sounds like a bit of overkill. I'd use something like pefile, which is probably going to be faster and more customizable than launching IDA in batch mode just to extract a few bytes at the EP.
The last usage example in this page can give you an idea of how to go about doing what you want.

  lydia   May 23, 2008 04:11.21 CDT
hi,

Based on your need, I don't think this has anything to do with IDA. I mean, you don't have to write an IDC script.
Any programming language can help you: parse the PE -> get the entry point -> read 0x400 bytes from there -> save to file -> done.

Regards,

lydia
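
The pipeline above (parse the PE, get the entry point, read the bytes, save them) can be sketched with nothing but the standard library. This is a hedged illustration, not a replacement for a real parser: the function name `dump_entry_bytes` is made up, the header offsets (0x3C for e_lfanew, 0x28 from the PE signature for AddressOfEntryPoint, 40-byte section headers) come from the PE/COFF layout, and the input is assumed to be a well-formed PE with no validation at all. It works on raw file offsets, so it stops at the end of the section's raw data.

```python
import struct

def dump_entry_bytes(data, length=0x400):
    """Return up to `length` raw file bytes starting at the entry point.

    Assumes `data` is a well-formed PE image read from disk (no checks).
    Returns b"" when no section contains the entry point (e.g. EP == 0).
    """
    e_lfanew, = struct.unpack_from("<I", data, 0x3C)        # offset of "PE\0\0"
    num_sections, = struct.unpack_from("<H", data, e_lfanew + 6)
    opt_size, = struct.unpack_from("<H", data, e_lfanew + 20)
    ep_rva, = struct.unpack_from("<I", data, e_lfanew + 0x28)
    table = e_lfanew + 24 + opt_size                        # first section header
    for i in range(num_sections):
        # VirtualSize, VirtualAddress, SizeOfRawData, PointerToRawData
        vsize, vaddr, rsize, rptr = struct.unpack_from(
            "<IIII", data, table + 40 * i + 8)
        if vaddr <= ep_rva < vaddr + max(vsize, rsize):
            offset = rptr + (ep_rva - vaddr)                # RVA -> file offset
            end = min(offset + length, rptr + rsize)        # stay inside raw data
            return data[offset:end]
    return b""
```

A caller would read the input with `open(path, "rb").read()` and write the result with `open(out_path, "wb")`; binary mode matters on Windows.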

  nezumi     May 23, 2008 06:53.40 CDT
this is a quick-n-dirty prg, it works with EXE/DLL and supports batch processing.
for %%A IN (*.exe, *.dll) DO dump.exe %%A


#include <stdio.h>
#include <stdlib.h>   // atol
#include <string.h>   // strlen
#include <windows.h>

int main(int c, char **v)
{
      #define PE_off   0x3C
      #define EP_off   0x28
      #define SW_BP    0xCC
      #define F_pre    "-dump"
      #define def_sz   0x400

      DWORD pe_off; DWORD ep_off; BYTE* ep_adr;
      FILE *f; BYTE* base_x; char buf[_MAX_PATH];

      #define arg_src  (v[1])
      #define arg_len ((c>2)? (atol(v[2])? atol(v[2]): def_sz): def_sz)

      if (c < 2) return
            printf("USAGE: dump4ep.exe filename [n_bytes]\n");

      if (!((strlen(arg_src) + sizeof(F_pre)) < _MAX_PATH))  return
            printf("-ERR:file name %s%s is too long\x7\n", arg_src, F_pre);

      sprintf(buf,"%s%s",arg_src,F_pre);
      
      // DONT_RESOLVE_DLL_REFERENCES flag prevents DllMain execution
      // btw, VS 6 MSDN has a mistake in LoadLibraryEx description:
      // DONT_RESOLVE_DLL_REFERENCE instead of DONT_RESOLVE_DLL_REFERENCES
      // if (!(base_x = (BYTE*) LoadLibraryEx(arg_src,0,DONT_RESOLVE_DLL_REFERENCES)))
      // LoadLibraryEx(,,DONT_RESOLVE_DLL_REFERENCES) has a nasty side-effect:
      // the operating system shows a nag-screen every time you try to load a broken PE-file,
      // while LoadLibraryEx(,,LOAD_LIBRARY_AS_DATAFILE) has no such effect,
      // it just returns null (an error occurred).
      // --------------------------------------------------------
      // LOAD_LIBRARY_AS_DATAFILE works faster,
      // but sets-up the lowest bit in HINSTANCE, so we have to clean it
      if (!(base_x = (BYTE*) LoadLibraryEx(arg_src, 0, LOAD_LIBRARY_AS_DATAFILE))) return printf("-ERR: LoadLibrary(%s)\x7\n", arg_src);
      // clean the lowest bit of the base address if necessary
      // DWORD_PTR is pointer-sized, so the test also holds on 64-bit builds
      if ((DWORD_PTR)base_x & 1) base_x--;

      // we're supposed to check if pe_off is correct,
      // but we're too lazy, so, we just call IsBadReadPtr()
      // ugly hack, just to prevent exception
      #define PE_OFF ((DWORD*)(base_x + PE_off))
      if (IsBadReadPtr(PE_OFF, sizeof(DWORD)))
            return printf("-ERR:bad PE offset\x7\n");

      pe_off = *PE_OFF;

      // the same ugly hack
      #define EP_OFF ((DWORD*)(base_x + pe_off + EP_off))
      if (IsBadReadPtr(EP_OFF, sizeof(DWORD)))
            return printf("-ERR:bad EP offset\x7\n");

      if (!(ep_off = *EP_OFF)) return
            printf("-ERR:%s no EP!\x7\n", arg_src);

      ep_adr = base_x + ep_off;

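      if (IsBadReadPtr(ep_adr, arg_len)) return
            printf("-ERR:can't dump %ld bytes\x7\n", (long)arg_len);

      // dump arg_len bytes
      if (!(f = fopen(buf,"wb"))) return
            printf("-ERR:can't create %s\x7\n", buf);
      fwrite(ep_adr, 1, arg_len, f); fclose(f);

      return 0; // success (the error paths return printf()'s non-zero count)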
}

  ero     May 23, 2008 08:14.28 CDT
mmm, the quick-n-dirty program (hence the name i guess ;) takes for granted the offset of the PE header... it might blow up in non-standard PE images. I'd say, go with a PE parsing library, there's a reason why they exist ;)
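
The concern here is that a hand-rolled dumper trusts the value at 0x3C without checking it. A minimal sketch of the checks involved, in Python for brevity: the exception class `NotPE` and the function `nt_headers_offset` are hypothetical names, and the checks cover only what is needed to read AddressOfEntryPoint safely, not everything a full PE library verifies.

```python
import struct

class NotPE(ValueError):
    """Raised when the buffer does not look like a PE file (hypothetical helper)."""

def nt_headers_offset(data):
    """Return the offset of the 'PE\\0\\0' signature after basic sanity checks."""
    # The DOS header is 0x40 bytes long and starts with 'MZ'.
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise NotPE("missing MZ header")
    e_lfanew, = struct.unpack_from("<I", data, 0x3C)
    # Need the signature (4) + COFF header (20) + the optional header up to
    # and including AddressOfEntryPoint (20) to lie inside the buffer.
    if e_lfanew + 44 > len(data):
        raise NotPE("e_lfanew points outside the file")
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise NotPE("missing PE signature")
    return e_lfanew
```

With the offset validated, reading the DWORD at `offset + 0x28` cannot run off the end of the buffer, which is the failure a truncated or deliberately malformed image would otherwise trigger.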

  rakish     May 23, 2008 08:35.13 CDT
maybe he wanna use ida cuz these files are packed with something... run till find OEP with some kind of plugin and get the 400 b ... no?

sorry if i misunderstood

  nezumi     May 23, 2008 10:46.37 CDT
ero
you're right, man! well, I updated the dumper, it still doesn't check if EP is valid, but at least, uses IsBadReadPtr() to prevent an exception. new version is a bit faster, btw.

rakish
ok. this is it. checked on IDA 4.7.

static main()
{
      auto a, f, start_ea;
      start_ea = LocByName("start");
      // LocByName() returns BADADDR, not 0, when the name does not exist
      if (start_ea == BADADDR)
            return Message("-ERR:start label is not found");

      if (!(f = fopen("dumpz","wb")))
            return Message("-ERR:open file");

      for (a = 0; a < 0x400; a++)
            fputc(Byte(start_ea + a),f);

      Warning("done"); fclose(f);
}

  neoxfx     May 23, 2008 11:36.14 CDT
IDA is overkill.
Ero's pefile is neat here.

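import pefile, sys

def main():
    # argument check lives outside the try block so that sys.exit()
    # is not swallowed by the exception handler
    if len(sys.argv) < 2:
        print("Please supply file!")
        sys.exit(1)

    file = sys.argv[1]
    try:
        pe = pefile.PE(file, fast_load=True)
    except (pefile.PEFormatError, OSError):
        print("problem loading file:", file)
        sys.exit(1)

    ep = pe.OPTIONAL_HEADER.AddressOfEntryPoint
    # binary mode: text mode would mangle 0x0A bytes on Windows
    fw = open("dump.bin", "wb")
    fw.write(pe.get_memory_mapped_image()[ep:ep+0x400])
    fw.close()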

if __name__ == '__main__':
    main()

  b0ne     May 23, 2008 15:43.52 CDT
neoxfx's example goes to show you the simplicity and power of Python. I'm not entirely sure whether his ep+0x400 slice is bounds-checked to prevent exceptions, but either way that wouldn't be super hard to fix if it isn't.

Here's how I did it in C:


/* author: b0ne <[email protected]>, OpenRCE.org example */
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

HANDLE pe_file;
HANDLE pe_file_map;
void *file_data;

typedef PIMAGE_NT_HEADERS WINAPI ImageNtHeader_t(PVOID ImageBase);

void *map_file(char *file_name)
{
    if ((pe_file = CreateFile(file_name, GENERIC_READ,
        FILE_SHARE_READ, NULL, OPEN_EXISTING,
        FILE_ATTRIBUTE_NORMAL, NULL)) == INVALID_HANDLE_VALUE)
    {
        fprintf(stderr, "ERROR accessing %s\n", file_name);
        return NULL;
    }
    
    if ((pe_file_map = CreateFileMapping(pe_file, NULL, PAGE_READONLY|SEC_IMAGE, 0, 0, NULL)) == NULL)
    {
        fprintf(stderr, "ERROR accessing %s\n", file_name);
        CloseHandle(pe_file);
        return NULL;
    }
    
    if ((file_data = MapViewOfFile(pe_file_map, FILE_MAP_READ, 0, 0, 0)) == NULL)
    {
        fprintf(stderr, "ERROR accessing %s\n", file_name);
        CloseHandle(pe_file);
        CloseHandle(pe_file_map);
        return NULL;
    }
    
    return file_data;
}

void free_file(void *map_data)
{
    UnmapViewOfFile(map_data);
    CloseHandle(pe_file_map);
    CloseHandle(pe_file);
    
    return;
}

int is_rva_avail(char *base, DWORD rva)
{
    MEMORY_BASIC_INFORMATION mbi;
    char *addr = base + rva;
    
    if (VirtualQuery(addr, &mbi, sizeof(mbi)) != sizeof(mbi))
        return 0;
    
    if (mbi.State & MEM_COMMIT)
        return 1;
    else
        return 0;
}

int main(int argc, char *argv[])
{
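    /* only touch argv[2] once we know it exists; argc is re-checked below */
    char *file_name = (argc == 3) ? argv[2] : NULL;
    char *out_name = NULL;  /* the error path calls free(out_name); free(NULL) is safe */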
    
    ImageNtHeader_t *ImageNtHeader = NULL;
    HMODULE dbghelp_mod;

    char *image_base;
    IMAGE_NT_HEADERS *nt;
    DWORD ep_rva;
    DWORD ep_len;
    FILE *out_file;
    
    if (argc != 3)
    {
        fprintf(stderr, "usage: EPDUMP.exe <length> <target_PE_file_name>\n\n");
        return EXIT_FAILURE;
    }

    if ((dbghelp_mod = LoadLibrary("DBGHELP.DLL")))
        ImageNtHeader = (ImageNtHeader_t *) GetProcAddress(dbghelp_mod, "ImageNtHeader");

    if (!ImageNtHeader)
    {
        fprintf(stderr, "ERROR loading DBGHELP.DLL, please install DBGHELP.DLL\n");
        return EXIT_FAILURE;
    }
    
    if ((ep_len = atoi(argv[1])) < 1)
    {
        fprintf(stderr, "ERROR entrypoint length is too small\n");
        return EXIT_FAILURE;
    }
    
    if ((image_base = map_file(file_name)) == NULL)
        return EXIT_FAILURE;
    
    if ((nt = ImageNtHeader(image_base)) == NULL)
    {
        fprintf(stderr, "ERROR finding NT headers in %s\n", file_name);
        free_file(image_base);
        return EXIT_FAILURE;
    }
    ep_rva = nt->OptionalHeader.AddressOfEntryPoint;
    
    if (is_rva_avail(image_base, ep_rva) && is_rva_avail(image_base, ep_rva + ep_len))
    {
        if ((out_name = malloc(strlen(file_name) + 5)) == NULL)
        {
            fprintf(stderr, "ERROR allocating memory, free some memory and try file %s again\n", file_name);
            free_file(image_base);
            return EXIT_FAILURE;
        }
        sprintf(out_name, "%s.bin", file_name);
        
        if ((out_file = fopen(out_name, "wb")) == NULL)
        {
            fprintf(stderr, "ERROR opening output file %s\n", out_name);
            free_file(image_base);
            FreeLibrary(dbghelp_mod);
            return EXIT_FAILURE;
        }
        if (fwrite(image_base + ep_rva, ep_len, 1, out_file) != 1)
        {
           fprintf(stderr, "ERROR writing %lu bytes to output file %s\n", ep_len, out_name);
           free_file(image_base);
           fclose(out_file);
           FreeLibrary(dbghelp_mod);
           return EXIT_FAILURE;
        }
        fclose(out_file);
    }
    else
    {
        fprintf(stderr, "ERROR entrypoint 0x%0lx + %lu bytes is outside of IMAGE %s\n", ep_rva, ep_len, file_name);
        free(out_name);
        free_file(image_base);
        FreeLibrary(dbghelp_mod);
        return EXIT_FAILURE;
    }
    
    printf("Output File: %s\nEntrypoint RVA: 0x%0lx\nBytes: %lu\n", out_name, ep_rva, ep_len);
    
    free(out_name);
    FreeLibrary(dbghelp_mod);
    free_file(image_base);
    
    return EXIT_SUCCESS;
}

  nezumi     May 23, 2008 17:03.02 CDT
neoxfx
with all due respect, I feel obliged to say that your way is not the Zen way :-) it hides everything under the hood. kind of the Windows way: run it and don't think about how it works. you can't control the loading process, you just use library calls - a real puzzle. personally, I don't know how pefile.PE(file,fast_load = True) works. does it load the PE via system APIs? does it execute DllMain? btw, your example works wrong if EP = 0.

b0ne
I see no sense in using file mapping. do you think that it is faster than LoadLibraryEx (LOAD_LIBRARY_AS_DATAFILE)? and dbg engine - wow!!! the PE header is well documented and I don't believe that the PE/EP offsets will change in the future.
64-bit OSes have almost the same PE header; at least the PE/EP offsets are still DWORDs. so why do we need libraries to parse the PE structure?! parsing the PE header with our own hands, we can always open the Microsoft Portable Executable and Common Object File Format Specification to be sure that everything is correct. we can't rely on 3rd-party libraries.

  b0ne     May 23, 2008 18:00.17 CDT
Isn't a general rule of thumb when programming to always use symbols instead of magic constants for that very reason?

It would really be just as easy to cast the base address to (IMAGE_DOS_HEADER *) and access the e_lfanew member, add that to the image base and assign it to the IMAGE_NT_HEADERS pointer.
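
The "named fields instead of magic constants" idea can be illustrated outside C as well. Below is a hedged Python sketch using `ctypes`: the structure mirrors the size and the two fields of interest of the Windows `IMAGE_DOS_HEADER`, but the 58 bytes in between are collapsed into a single placeholder field named `_legacy`, which is an assumption made for brevity and not part of the real definition. The helper name `nt_headers_offset_from` is also made up.

```python
import ctypes

class IMAGE_DOS_HEADER(ctypes.LittleEndianStructure):
    """Trimmed-down DOS header: only e_magic and e_lfanew are named.

    `_legacy` stands in for e_cblp .. e_res2 (58 bytes) and is a
    placeholder invented for this sketch.
    """
    _fields_ = [
        ("e_magic", ctypes.c_uint16),        # 'MZ' == 0x5A4D
        ("_legacy", ctypes.c_ubyte * 58),    # fields this example does not need
        ("e_lfanew", ctypes.c_int32),        # file offset of the NT headers
    ]

def nt_headers_offset_from(data):
    """Read e_lfanew by field name rather than by the literal 0x3C."""
    dos = IMAGE_DOS_HEADER.from_buffer_copy(data[:ctypes.sizeof(IMAGE_DOS_HEADER)])
    if dos.e_magic != 0x5A4D:
        raise ValueError("not an MZ executable")
    return dos.e_lfanew
```

The offset 0x3C never appears in the calling code; it falls out of the structure layout (`IMAGE_DOS_HEADER.e_lfanew.offset`), which is the point being argued for symbols over constants.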

File mapping is far more eloquent than LoadLibrary + fixup hacks which may not be reliable in the future.  As for the performance, the windows loader uses file mapping, so why not cut out all the "crap" that sits on top if we just want to map the file into memory?

  BegPardon     May 23, 2008 18:07.31 CDT
> with all due respect, I feel obliged to say that your way is not the Zen way :-) it hides everything under the hood. kind of the Windows way: run it and don't think about how it works. you can't control the loading process, you just use library calls - a real puzzle. personally, I don't know how pefile.PE(file,fast_load = True) works.

Ero's pefile is open source; feel free to read it and find out how it works (it's purely static).

> does it load the PE via system APIs? does it execute DllMain? btw, your example works wrong if EP = 0.

The answers are "no" and "no". Please explain how the answer is wrong for EP=0.

> I see no sense in using file mapping. do you think that it is faster than LoadLibraryEx (LOAD_LIBRARY_AS_DATAFILE)? and dbg engine - wow!!!

Speed is not important in a one-off task such as this, but I'd bet money it is faster than LoadLibraryEx, due to the fact that LoadLibraryEx maps the file into memory itself plus does a bunch of other work.

> the PE header is well documented and I don't believe that the PE/EP offsets will change in the future.
> 64-bit OSes have almost the same PE header; at least the PE/EP offsets are still DWORDs. so why do we need libraries to parse the PE structure?! parsing the PE header with our own hands, we can always open the Microsoft Portable Executable and Common Object File Format Specification to be sure that everything is correct. we can't rely on 3rd-party libraries.

Actually that "3rd-party library" is Microsoft's own debugging support library, so you can be pretty sure it's at least as good as whatever you come up with by hand... that said, I probably would have just used casts to Windows' internally-defined data structures in b0ne's C example.

  nezumi     May 23, 2008 19:47.48 CDT
BegPardon
> Ero's pefile is open source; feel free to read it and find out how it works (it's purely static).
nolo contendere! Ero's pefile is great stuff, this is not a debatable question!
I just wanted to point out that parsing a PE file is a very simple task, and a very tricky one at the same time. suppose the file has relocations designed not to rebase it, but to patch some bytes to make reversing harder. using LoadLibraryEx we can force the system to load the file _with_ relocations or _without_ them.
and besides, there are so many wrappers, libraries, layers of abstraction... I prefer to use the "bare" win32 API instead of a bunch of things I have to learn to... speed up my job? I doubt it. anyway, I showed how to load a PE file and dump bytes from the EP using the win32 API.
personally I don't like Python very much.

> Please explain how the answer is wrong for EP=0.
if EP = 0 the following code doesn't report an error and dumps 0x400 bytes from the beginning of the file. many DLL files have EP == 0.


ep = pe.OPTIONAL_HEADER.AddressOfEntryPoint
fw = open("dump.bin", "wb")
fw.write(pe.get_memory_mapped_image()[ep:ep+0x400])


> Speed is not important in a one-off task such as this,
using LoadLibraryEx makes your code shorter while keeping the same speed. so what's the reason to use memory mapping?!

> it is faster than LoadLibraryEx,
> due to the fact that LoadLibraryEx maps the file into memory
> itself plus does a bunch of other work.
not with LOAD_LIBRARY_AS_DATAFILE flag

> Actually that "3rd-party library" is Microsoft's own debugging support library,
I did mean Ero's pefile. and the ms dbg engine changes between versions and you have to download it (not everybody uses it), so it's definitely a bad idea to ask the ms dbg engine where the EP is. it's too expensive; better to write a couple of lines of code that work everywhere.

b0ne
> Isn't a general rule of thumb when programming to always use
> symbols instead of magic constants for that very reason?
well, if you don't like "magic", define a structure, but in our case we only need two offsets, so I see no reason to use PE structures. and besides, there is no "magic": I #defined the offsets of the fields used.

> File mapping is far more eloquent than LoadLibrary + fixup hacks
where do you see "fixup hacks"?! this is not a "hack", this is a well-documented way to load a PE file as a data file.

> which may not be reliable in the future.
according to whom?! I can't imagine that LoadLibraryEx will stop working. at the same time... I'm not so sure about direct file mapping.

> As for the performance, the windows loader uses file mapping,
> so why not cut out all the "crap" that sits on top
> if we just want to map the file into memory?
actually, that "crap" is quite thin, and using LoadLibraryEx provides more flexible control and simplifies your code a lot. I don't understand why you insist that manual mapping is the better way?!

  neoxfx     May 23, 2008 23:02.32 CDT
@nezumi
> if EP = 0 the following code doesn't report an error and dumps 0x400 bytes from the beginning of the file. many DLL files have EP == 0.

when ep=0, it dumps from the start of the image base (which happens to be the start of the file).
it's easy to just add a handler statement [feel free to add debug stats :-)], like
"if ep == 0: print 'no EP'" and return

however, your code will break when the VA range from the entry point to the dump length is discontinuous.
try your code with any packed file whose EP section has a small EP stub and "sect_virtual_size > sect_physical_size" [i.e. ep+0x400 goes beyond the current section, and the virtual size of the current section is bigger than its physical size].
take UPX-packed files for example; this is a common case.
you get "-ERR:can't dump 1024 bytes"

sometimes it is simpler to operate on file offsets (without involving the loader; pefile tries to mimic everything statically) :-), however, agreed that both methods have pros and cons.
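
The file-offset approach described above can be sketched as a function that assembles an RVA range from the section table. This is an illustrative simplification, not how pefile is implemented: the name `read_rva_range` and the tuple layout of `sections` are assumptions, the headers are not mapped, and any part of the range that has no raw bytes behind it (virtual size larger than raw size, or a gap between sections) is simply left as zeros instead of being reported as unreadable.

```python
def read_rva_range(data, sections, rva, length):
    """Build `length` bytes for [rva, rva+length) from raw file data.

    sections: iterable of (vaddr, vsize, raw_ptr, raw_size) tuples
              (a made-up, simplified view of the section headers).
    Bytes with no file backing stay zero.
    """
    out = bytearray(length)
    for vaddr, vsize, raw_ptr, raw_size in sections:
        # Only the part of the section backed by raw data can be copied.
        backed = min(vsize, raw_size) if vsize else raw_size
        lo = max(rva, vaddr)
        hi = min(rva + length, vaddr + backed)
        if lo < hi:
            src = raw_ptr + (lo - vaddr)
            chunk = data[src:src + (hi - lo)]
            out[lo - rva:lo - rva + len(chunk)] = chunk
    return bytes(out)
```

With this shape, a range that starts in a small entry-point stub and runs into the unbacked tail of the section still yields a full-length buffer, rather than failing the way a single readability check on the whole range would.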
