WINLDRA.EXE: Reversing a Basic Encryption Algorithm

Tuesday, August 23 2005 22:39.27 CDT
Author: Gerry

 Introduction

This article gives an overview of how I reverse engineered the encryption routine of WINLDRA.EXE, an unknown binary that was used in a large-scale identity theft ring. (It is really obfuscation, but we will refer to it as encryption for the remainder of this article.) This is a beginner/intermediate level article and assumes only that the reader has an understanding of basic x86 assembly and how operations such as AND, OR, SHL and SAR work. I will walk the reader through the operations, but it will help if you understand what they are doing.

The executable under scrutiny is WINLDRA.EXE (MD5: F0B0224B75E899440C15EE05B59B6013), believed to be the executable used in the large-scale ID theft ring found by Sunbelt. The executable, while basic, covers a lot of ground, especially with regard to data logging. Although the binary has bot capabilities, its main focus is to log and steal data. It logs everything from clipboard contents, Internet Explorer Protected Storage, URL form fields, Internet Account Manager data, WebMoney ... the list goes on.

Given the quantity of data targeted for capture, it is natural to want to know where it goes. The first step was locating the routine responsible for logging. This can be approached in a few ways. I noticed an interesting function while I was examining the Import Address Table (IAT). The function at 004030B2 referenced both the InternetConnectA() and HttpSendRequestA() WinInet APIs, and based on that I had a strong suspicion that I had located the routine responsible for data transmission. Analyzing the cross references to the transmit routine, I found only one, the routine at 00403187. Looking at the functionality of this routine, I found a call to strcpy() with "logdata=" as an argument. At this point it was clear to me that this function pair was responsible for generating and transmitting the captured data.

Further analysis showed a call to 00401935, which, based on its bit arithmetic operations, I identified as the encryption routine. The encryption routine takes a few arguments: an input string in EAX (the unencrypted data), an output string in EDX, and the length of the input string in ECX. The 50,000 foot view of this function is that it steps through each byte, performs some bit operations on it, then uses the result as an index into an array containing A-Z, a-z, 0-9 and "+/". It continues this loop until all the bytes are encrypted.

 Detailed Dissection

The dead listing for the entire encryption routine is available here. We will analyze the routine block by block to figure out in more detail exactly what the function is doing. Note: not every line of code is accounted for in the block excerpts displayed below. See the dead listing.

00401935 ; ═══════════════ S U B R O U T I N E ════════════
00401935
00401935 EncryptLogs
00401935     push    ebx
00401936     push    esi
00401937     push    edi
00401938     push    ebp
00401939     mov     edi, offset aAbcdefghijklmn ;
0040193E     xor     esi, esi
00401940     test    ecx, ecx
00401942     jle     loc_4019D8

The first block (00401935 - 00401942) is really basic. It's saving the registers, setting up variables and testing whether ECX is greater than 0; if it is not, the JLE jumps straight to loc_4019D8. ECX is the length of the string to encrypt. We know this because the input string is passed into a call to strlen(), with the result saved in ECX, just prior to the call to the encryption routine. It is also important to note that EDI now points to the alphanumeric encoding string (the one that contains A-Z, a-z, 0-9 and "+/") that is indexed into for obfuscation.
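As a minimal sketch of the table EDI points to, the C below assumes the string is the standard Base64 alphabet in the order A-Z, a-z, 0-9, "+/". The names ALPHABET, alphabet_char and alphabet_index are hypothetical, introduced here only for illustration; the reverse lookup is what a decryption routine needs later on.

```c
#include <assert.h>
#include <string.h>

/* Assumed contents of the encoding string: the standard Base64 alphabet. */
static const char ALPHABET[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Forward lookup, as the routine does with mov bl, [edi+ebx]. */
static char alphabet_char(unsigned idx)
{
    assert(idx < 64);
    return ALPHABET[idx];
}

/* Reverse lookup: position of c in the table, or -1 if c is not in it. */
static int alphabet_index(char c)
{
    const char *p;

    if (c == '\0')              /* strchr would match the terminator */
        return -1;
    p = strchr(ALPHABET, c);
    return p ? (int)(p - ALPHABET) : -1;
}
```

For example, alphabet_char(18) gives 'S' and alphabet_index('S') gives 18, while characters outside the table, such as '=' or a line break, give -1.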

00401948 loc_401948:
00401948     cmp     esi, 20
0040194B     jbe     short loc_401957
0040194D     mov     byte ptr [edx], 0Dh
00401950     inc     edx
00401951     mov     byte ptr [edx], 0Ah
00401954     inc     edx
00401955     xor     esi, esi

The second block (00401948 - 00401955) simply formats the encrypted data so that it is not transmitted as one long line. The ESI register keeps a running count of the number of bytes that were encrypted. The JBE skips the insertion while ESI is 20 or less; once the count exceeds 20, a CRLF is inserted into the encrypted stream and ESI is reset.
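The line-wrapping block can be sketched in C roughly as follows. This mirrors only the instructions shown in the excerpt; the function name wrap_if_needed is hypothetical, and where the routine increments the counter is not part of the excerpt, so it is left to the caller.

```c
#include <assert.h>
#include <stddef.h>

/*
 * If the counter is above 20, write CR LF to the output and reset the
 * counter. Returns the (possibly advanced) output pointer.
 */
static char *wrap_if_needed(char *out, unsigned *count)
{
    assert(out != NULL && count != NULL);
    if (*count > 20) {          /* cmp esi, 20 / jbe skips when count <= 20 */
        *out++ = 0x0D;          /* mov byte ptr [edx], 0Dh / inc edx */
        *out++ = 0x0A;          /* mov byte ptr [edx], 0Ah / inc edx */
        *count = 0;             /* xor esi, esi */
    }
    return out;
}
```

With a count of 20 the output pointer comes back unchanged; with a count of 21 two bytes are written and the count drops back to 0.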

00401957 loc_401957:
00401957     xor     ebx, ebx
00401959     mov     bl, [eax]
0040195B     sar     ebx, 2
0040195E     mov     bl, [edi+ebx]
00401961     mov     [edx], bl

The next block (00401957 - 00401961) is also pretty basic, but what is really going on here? Remembering that EAX points to the string to be encrypted, we see that it clears EBX and then loads the first character into BL. It then shifts EBX right by 2. Let's work out an example. Assume the first byte is 'I', which in binary ASCII representation is 01001001. When we shift that right 2 places we get 00010010 (decimal 18), and as you can see, we lost the 2 low bits. The routine uses the resulting value as an index into the alphanumeric encoding string and stores the character found there at the location pointed to by EDX.
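The same step written as C, as a small sketch. It assumes the encoding string is the standard Base64 alphabet; ALPHABET and first_char are hypothetical names. Because EBX is cleared before the byte is loaded, the SAR acts like an unsigned shift here, which is why an unsigned char is used.

```c
#include <assert.h>

/* Assumed contents of the encoding string: the standard Base64 alphabet. */
static const char ALPHABET[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* The top six bits of the first byte select the first output character. */
static char first_char(unsigned char b0)
{
    unsigned idx = b0 >> 2;     /* sar ebx, 2 on a zero-extended byte */

    assert(idx < 64);
    return ALPHABET[idx];       /* mov bl, [edi+ebx] */
}
```

Under that assumption, first_char('I') yields index 18, which is the character 'S'.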

00401963     xor     ebx, ebx
00401965     mov     bl, [eax]
00401967     inc     edx
00401968     and     ebx, 3
0040196B     shl     ebx, 4
0040196E     cmp     ecx, 1
00401971     jle     short loc_40197C
00401973     movzx   ebp, byte ptr [eax+1]
00401977     sar     ebp, 4
0040197A     jmp     short loc_40197E
0040197C ; ------------------------------------------------
0040197C loc_40197C:
0040197C     xor     ebp, ebp
0040197E
0040197E loc_40197E:
0040197E     or      ebx, ebp
00401980     mov     bl, [edi+ebx]
00401983     mov     [edx], bl

The next block, running from 00401963 to 00401983, starts much like the previous one. It clears EBX and loads the first clear-text character into BL. It then increments the output string pointer, ANDs EBX with 3 and shifts the result left by 4. Again, let's work out an example. Decimal 3 is 00000011 in binary and ASCII 'I' is 01001001, so 01001001 AND 00000011 = 00000001, which is then shifted left four positions to give 00010000. In other words, the routine keeps the 2 Least Significant Bits (LSB) of the first byte and moves them left by 4. This is interesting because those are the very bits we lost in the earlier byte manipulation detailed above.

A check is then made to ensure that the current byte is not the last byte of the input string. We will not worry about that right now. Jumping to 00401973, we see the routine take the next byte, shift it right by 4 and OR the two values together. Once again, let's work it out by example. Assume the second byte is ASCII 'D', which is 01000100 in binary. Shifting that right by 4 places gives 00000100. After the OR against 00010000 we have 00010100, which holds the last 2 bits of the first byte followed by the first 4 bits of the second byte. As with the first byte manipulation we lose a few bits, in this case the 4 Least Significant Bits of the second byte. However, we now know the bits that went missing from the first byte.

Based on how the first and second bytes are encrypted, we can recover the first byte through the following steps. First, we obtain the index of the first and second encrypted characters within the alphanumeric encoding string. Next, we shift the first index left by 2 bits; following our example with 'I' we get 01001000. This is most of the first byte, missing only its 2 LSBs. To get those missing bits we shift the second index right by 4, which in our example results in 00000001. Finally, we OR the two results together to get the first byte decrypted in its entirety: 01001001 == 'I'.
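The two directions of this step can be sketched in C as below. The function names are hypothetical. Note that the two partial values are combined with OR: one holds the upper six bits of the byte and the other the lower two, so an AND would clear them all.

```c
#include <assert.h>

/* Encode: low 2 bits of the first byte, then high 4 bits of the second. */
static unsigned second_index(unsigned char b0, unsigned char b1)
{
    return ((b0 & 3u) << 4) | (unsigned)(b1 >> 4);
}

/* Decode: rebuild the first byte from the first two alphabet indexes. */
static unsigned char recover_first(unsigned i0, unsigned i1)
{
    assert(i0 < 64 && i1 < 64);
    return (unsigned char)((i0 << 2) | (i1 >> 4));
}
```

With the running example, second_index('I', 'D') is 00010100 (20), and recover_first(18, 20) gives back 'I'. Passing 0 as the second byte models the path through loc_40197C, where EBP is cleared because there is no next byte.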

Now that we know how the first byte is decrypted and how the second byte is encrypted, we can scan ahead and see whether the other bytes are handled the same way, and what do you know, they are. The details differ slightly, but the approach is very similar.

0040198D     mov     bl, [eax+1]
00401990     shl     ebx, 2
00401993     and     ebx, 3Ch
00401996     cmp     ecx, 2
00401999     jle     short loc_4019A4
0040199B     movzx   ebp, byte ptr [eax+2]
0040199F     sar     ebp, 6
004019A2     jmp     short loc_4019A6
004019A4 ; ------------------------------------------------
004019A4 loc_4019A4:
004019A4     xor     ebp, ebp

In the block from 0040198D to 004019A6 we see the second byte get shifted left by 2 and then AND'd with 0x3C. Again, we'll work it out: 'D' = 01000100, and shifting left by 2 gives 100010000, whose low byte is 00010000. With 0x3C = 00111100, 00010000 & 00111100 = 00010000. The routine then takes the third byte and shifts it right by 6. The third byte is ':' = 00111010, which after a shift right by 6 is 00000000. OR'ing those together we get 00010000. After these operations we know the last 4 bits of the second byte (the ones lost in the previous step), but we have lost the last 6 bits of the third byte. We can decrypt the second byte much as we decrypted the first: find the index of the second and third encrypted characters in the alphanumeric string, shift the second index left by 4 (keeping only the low 8 bits), shift the third index right by 2, and OR the results together to get the second decrypted byte.
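A C sketch of this step in both directions, with hypothetical function names. The cast to unsigned char on the decode side is what discards the bits shifted above bit 7, and the two halves are combined with OR.

```c
#include <assert.h>

/* Encode: low 4 bits of the second byte, then high 2 bits of the third. */
static unsigned third_index(unsigned char b1, unsigned char b2)
{
    return (((unsigned)b1 << 2) & 0x3Cu) | (unsigned)(b2 >> 6);
}

/* Decode: rebuild the second byte from the second and third indexes. */
static unsigned char recover_second(unsigned i1, unsigned i2)
{
    assert(i1 < 64 && i2 < 64);
    return (unsigned char)(((i1 << 4) | (i2 >> 2)) & 0xFFu);
}
```

With the running example, third_index('D', ':') is 00010000 (16), and recover_second(20, 16) gives back 'D'.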

004019B9     mov     bl, [eax+2]
004019BC     and     ebx, 3Fh
004019BF     mov     bl, [edi+ebx]

By now you should have a guess at how the third byte will be decrypted, but we will work it out anyway. In the few lines at 004019B9 - 004019BF, we see the third byte get loaded into BL and then AND'd with 0x3F. The third byte is 00111010 and 0x3F is 00111111, so 00111010 & 00111111 = 00111010. This is basically the byte in clear text, except that we have lost its first 2 bits. If you remember from the last code block, the third byte was shifted right by 6 there, leaving just those first 2 bits. So, to decrypt the third byte we shift the third index left by 6 (keeping only the low 8 bits) and then OR it with the fourth index.
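The last step of the group, sketched in C with hypothetical function names. The mask on the decode side keeps only the low 8 bits after the shift.

```c
#include <assert.h>

/* Encode: the low 6 bits of the third byte form the fourth index. */
static unsigned fourth_index(unsigned char b2)
{
    return b2 & 0x3Fu;          /* and ebx, 3Fh */
}

/* Decode: rebuild the third byte from the third and fourth indexes. */
static unsigned char recover_third(unsigned i2, unsigned i3)
{
    assert(i2 < 64 && i3 < 64);
    return (unsigned char)(((i2 << 6) | i3) & 0xFFu);
}
```

With the running example, fourth_index(':') is 00111010 (58), and recover_third(16, 58) gives back ':'.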

So now we know how to decrypt the data for the most part, but what about all those CMP ECX instructions? If you remember from before, ECX holds the string length, so the routine is checking that it will not run past the end of the string and try to grab a byte that is not there. If there are not enough bytes left to proceed as in the previous steps, it will either OR the value with 0x0 or emit 0x3D, which is '='.
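Putting the pieces together, one pass of the loop over a group of up to three input bytes might look like the sketch below. It is an illustration rather than a transcription of the dead listing: the alphabet is assumed to be standard Base64, the placement of the '=' padding is assumed to follow standard Base64 as well, and encode_group and its parameters are hypothetical names. The remaining parameter plays the role of ECX.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed contents of the encoding string: the standard Base64 alphabet. */
static const char ALPHABET[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encode up to three input bytes into exactly four output characters. */
static void encode_group(const unsigned char *in, size_t remaining, char out[4])
{
    unsigned b0, b1, b2;

    assert(in != NULL && remaining >= 1);
    b0 = in[0];
    b1 = remaining > 1 ? in[1] : 0;     /* missing byte: OR with 0 */
    b2 = remaining > 2 ? in[2] : 0;     /* missing byte: OR with 0 */

    out[0] = ALPHABET[b0 >> 2];
    out[1] = ALPHABET[((b0 & 3u) << 4) | (b1 >> 4)];
    /* Positions with no input byte behind them are filled with 0x3D ('='). */
    out[2] = remaining > 1 ? ALPHABET[((b1 << 2) & 0x3Cu) | (b2 >> 6)] : '=';
    out[3] = remaining > 2 ? ALPHABET[b2 & 0x3Fu] : '=';
}
```

Under these assumptions the three example bytes 'I', 'D', ':' encode to "SUQ6", the two bytes 'I', 'D' to "SUQ=", and the single byte 'I' to "SQ==".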

 Constructing a Decryption Routine

Using what we know, we can create a small decryption routine like the following:

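#include <stdio.h>
#include <string.h>

/* Alphabet assumed to be the A-Z, a-z, 0-9, "+/" table described above. */
static const char alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Return the position of c in the alphabet, or -1 if it is not there. */
static int lookup(char c)
{
    const char *p = (c != '\0') ? strchr(alphabet, c) : NULL;

    return p ? (int)(p - alphabet) : -1;
}

void decrypt(const char *in)
{
    int idx[4];     /* indexes of the current group of four characters */
    int n = 0;      /* how many indexes of the group are filled in */

    /* '=' only appears as padding, so it marks the end of the data. */
    for (; *in != '\0' && *in != '='; in++)
    {
        int v = lookup(*in);

        if (v < 0)  /* skip the CRLF line breaks inserted by the encoder */
            continue;

        idx[n++] = v;
        if (n == 4)
        {
            putchar(((idx[0] << 2) | (idx[1] >> 4)) & 0xFF);
            putchar(((idx[1] << 4) | (idx[2] >> 2)) & 0xFF);
            putchar(((idx[2] << 6) | idx[3]) & 0xFF);
            n = 0;
        }
    }

    /* A trailing partial group carries only one or two bytes. */
    if (n >= 2)
        putchar(((idx[0] << 2) | (idx[1] >> 4)) & 0xFF);
    if (n == 3)
        putchar(((idx[1] << 4) | (idx[2] >> 2)) & 0xFF);
}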

