The Viral Darwinism of W32.Evol
Introduction
The W32.Evol virus was discovered around July 2000. Its name is derived from a string found in the virus, but much more can be read into it. Until then, most viruses used polymorphic engines to hide from anti-virus (AV) scanners. Such an engine encrypts the virus with a different key on every generation and generates a small, variable decryptor that consists of different operations each time but remains functionally equivalent. This technique was beginning to wear out, as AV scanners would trace the decryption until the virus sat decrypted in memory, visible and in the clear.
Although metamorphism, as a technique, appeared in several viruses in the DOS age, it got the full attention of virus writers in the 32-bit environment. The idea is simple: transformation instead of encryption. Not just a small decryptor is transformed, but the entire virus body.
A metamorphic engine is used to transform executable (binary) code. The behavior of such an engine varies from virus to virus, but many elements remain the same. It has to implement some sort of internal disassembler in order to parse the input code. After disassembly, the engine transforms the program code and produces new code that retains the original functionality yet looks different from the original code.
According to Symantec, Evol was the first virus to utilize a 'true' 32-bit metamorphic engine, and so it represents another step in the evolution of anti-AV techniques.
Virus Author's Note: "The only particularity of Evol is its evolution engine - meaning that the virus will mutate every 4 copy of itself. The engine is not an usual polymorphic engine, but rather a metamorphic engine (see Benny description in 29A #4), which means that there is no encrypted code: the whole code of the virus, engine included, is variable. Furthermore, the engine inserts random code, so as to make detection by antivirus more difficult.
The virus contains no fixed data : it is only a massive piece of code."
The research I performed on the metamorphic engine includes a heavily commented disassembly of the engine, available here. Information regarding the behavior of the virus itself is not included in this paper and is available on many antivirus websites, notably the Symantec website.
Legend
Examples in this paper use abbreviated names for assembly-language operands and instruction groups:
Reg - Register (e.g. EAX, EBX)
Mem - Memory address (e.g. [EAX])
r/m - Register or Memory
imm - Immediate Value (e.g. OP Reg, ACABh)
OP = {ADC, ADD, AND, CMP, OR, SBB, SUB, XOR}
OP1 = {DIV, IDIV, IMUL, MUL, NEG, NOT, TEST}
OP2 = {RCL, RCR, ROL, ROR, SAL, SAR, SHL, SHR}
Calling
The engine is called in the following standard-issue fashion:
push [ebp+var_14] ; *outBuf (EDI)
call GetSizeOfCode
push eax ; sizeOfCode
call SeekStartOfVirus
push eax ; *inBuf (ESI)
call MetaEngine
cmp eax, 0
jz short EngineFailed
Or, in pseudo-C:
MetaEngine(*inBuf, sizeOfCode, *outBuf);
Here, *inBuf is a pointer to the code to be mutated, sizeOfCode is its size, and *outBuf is a pointer to the destination buffer where the mutated code will be stored.
Code Analysis
The engine analyzes the given code. Aside from on-the-fly disassembly, it allocates four table entries for each instruction it analyzes. Each entry is a double-word. The structure is accessed in the following way:
When the engine first loads, it allocates SizeOfCode*16 bytes using VirtualAlloc for the above-mentioned table. At the end, these bytes are freed using VirtualFree. The virus itself uses internal 'caller' functions (callVirtualAlloc / callVirtualFree) and doesn't call the APIs directly. Every time the engine loads a new instruction for analysis, the first two members of the structure are filled and the 3rd member is zeroed for later use. The 3rd and 4th fields are filled only when the engine analyzes a branch instruction (JMP/Jcc/CALL), and are used when the relocations are fixed after the mutation process is complete. The engine disassembles only the instructions its author included support for, meaning it fails on unrecognized or unsupported instructions.
Sample Disassembly:
cmp al, 8Ah ; MOV r8, r/m8?
jz short _Mutate?
cmp al, 8Bh ; MOV r32, r/m32?
jz short _Mutate?
cmp al, 8Dh ; LEA r32, mem?
jz short _Mutate?
As you can see, the engine simply checks the current opcode and, if it recognizes it, acts accordingly.
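The opcode check and the per-instruction table described in this section can be modelled in a few lines of Python. This is only an analysis-side sketch, not the engine's code: the three opcodes come from the sample disassembly, the name InputIP for the first field is borrowed from the relocation discussion, and `second_field`, `classify`, `table_size` and `new_record` are hypothetical names introduced for illustration.

```python
import struct

# Opcodes taken from the sample disassembly; the engine's real list is longer.
RECOGNIZED = {
    0x8A: "MOV r8, r/m8",
    0x8B: "MOV r32, r/m32",
    0x8D: "LEA r32, mem",
}

# One table record: four little-endian double-words (4 * 4 = 16 bytes).
RECORD = struct.Struct("<4I")


def classify(opcode):
    """Return the instruction form for a recognized opcode, else None."""
    return RECOGNIZED.get(opcode)


def table_size(size_of_code):
    """Bytes reserved for the table: one 16-byte record per byte of code."""
    return size_of_code * RECORD.size


def new_record(input_ip, second_field):
    """Model a freshly loaded instruction: fields 1 and 2 filled, field 3 zeroed.

    Field 4 is set to 0 here as well; the text says fields 3 and 4 are only
    filled for branch instructions. 'second_field' is a placeholder name.
    """
    return RECORD.pack(input_ip, second_field, 0, 0)
```

Because every record is 16 bytes, reserving SizeOfCode*16 bytes gives room for one record per input byte, which is an upper bound on the number of instructions.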
Code Transformations
I. Instruction Transformation
The engine supports several kinds of instruction mutations, meaning it will write different code with the same functionality. The defined transformations are divided into two parts:
The Intel instruction format allows different binary encodings for the same action. The engine supports the following alternative encodings:
III. Fixed Transformations
The engine will replace the following bytes with the corresponding sequences:
As you can see, these instructions take no operands, so they are simply replaced with sequences of equivalent functionality.
IV. Junk-Code Insertion
The engine generates instructions that do not depend on the original code and whose functionality is essentially "do-nothing". The junk instructions are added only if the last written byte is between 50h and 52h (PUSH EAX/ECX/EDX).
- MOV r32, [ebp+Random8]
- MOV r32, Random32
- OP r32, Random32 ;ADC/ADD/AND/OR/SBB/SUB/XOR
- MOV RandomReg8, Random8
It may be noted, however, that these instructions do alter the original program state: they are random and are inserted in places where they will be executed. Because they are inserted only after PUSH instructions, we assume the affected registers will be overwritten later on.
V. General Instructions
In any other case the engine will store the instruction as is, aside from exceptional opcodes:
Relocation Fixups
After the mutation process is completed, the engine fixes instruction relocations. Because the transformation process often makes the code grow, most (if not all) of the branch instructions would otherwise lead to an incorrect place in the destination buffer. The engine uses the relocation table it created during the mutation process to patch the new addresses into place. First, it loops through the table. For every instruction, it adds the 1st and 4th fields (InputIP + NewRelative), thus calculating the virtual original destination. It then runs a second loop that searches for that destination and patches the entry using the 3rd and 4th fields.
Other Features
I. Anti Debugging
If the engine detects a breakpoint in the code it mutates, it jumps to the following routine:
AntiDebug:
cmp byte ptr [ebx+7], 0BFh ; are we in kernel mode?
jnz short ret_AntiDebug
mov ecx, 1000h ; counter = 1000h
mov edi, 40000000h
or edi, 80000000h
add edi, ecx ; edi = C0001000h
rep stosd ; store EAX at [edi], ECX (1000h) times
ret_AntiDebug:
retn ; this will result in a crash
The above routine can also be considered 'external', as it is called from the main virus body as well as from the engine.
II. Internal Functions
The engine contains several functions that it uses for many actions:
Sample Transformations
Presented below are the actual transformations performed by the engine on itself.
B9 00 10 00 00 mov ecx, 1000h
Transformed:
B9 10 B2 00 3C mov ecx, 3C00B210h
81 C1 F0 5D FF C3 add ecx, 0C3FF5DF0h ; ecx = 1000h
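The equivalence of the original and transformed sequences above can be checked with 32-bit wrap-around arithmetic. The sketch below only reads the immediates out of the bytes listed in this paper and adds them modulo 2^32; the helper name `imm32` is hypothetical, and nothing here reflects how the engine itself picks its random value.

```python
import struct

MASK32 = 0xFFFFFFFF


def imm32(encoded, opcode_len):
    """Read the little-endian 32-bit immediate that follows the opcode bytes."""
    return struct.unpack_from("<I", encoded, opcode_len)[0]


original = bytes.fromhex("B900100000")    # mov ecx, 1000h
mov_part = bytes.fromhex("B910B2003C")    # mov ecx, 3C00B210h
add_part = bytes.fromhex("81C1F05DFFC3")  # add ecx, 0C3FF5DF0h

# MOV ECX, imm32 has a one-byte opcode (B9); ADD ECX, imm32 uses opcode 81
# plus a ModRM byte (C1), so its immediate starts at offset 2.
want = imm32(original, 1)
got = (imm32(mov_part, 1) + imm32(add_part, 2)) & MASK32

print(hex(want), hex(got))
```

The sum 3C00B210h + C3FF5DF0h overflows 32 bits and wraps to 1000h, so ECX ends up with the same value as in the original instruction. Unlike the original MOV, the added ADD also modifies the flags.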
Some of the "external" transformations:
As you can see in the above examples, the mutated byte sequences are entirely different from the original ones.
Conclusion
The analysis of the virus engine took me a lot of time, mainly because it was done statically, without running the code. I hope this paper helps to shed more light on how metamorphism is done, as well as on the aspects involved in the design of such an engine. Further, I'd like to thank the author of this engine for creating a piece of code that deepened my interest in this particular field. I encourage you to look over the reversed source code of the engine, as it will probably make everything written above a little clearer. Thanks for reading this. As always, any feedback is welcome.
Orr
www.antilife.org/