David Reguera García (Dreg) <Dreg fr33project org> |
Wednesday, December 31 2008 23:28.40 CST |
Note: In a few days I will add more to this post.
A problem with "CreateRemoteThread into a not-yet-initialized process":
1) MSDN CreateRemoteThread: Only one of these events occurs in an address space at a time. This means the following restrictions hold:
- During process startup and DLL initialization routines, new threads can be created, but they do not begin execution until DLL initialization is done for the process.
- Only one thread in a process can be in a DLL initialization or detach routine at a time.
- ExitProcess returns after all threads have completed their DLL initialization or detach routines.
2) If you create a remote thread in a not-yet-initialized process, I found a problem with GUI programs, for example notepad.exe:
- The remote thread initializes GDI32.DLL and USER32.DLL (the main thread does not initialize the DLLs; it is suspended and never initializes them).
- If the remote thread exits before the main thread calls CreateWindow/Ex, the call fails with LastError ERROR_CANNOT_FIND_WND_CLASS (0x57F) <-- this is the problem.
- If the remote thread never exits, the calls to CreateWindow/Ex work fine.
I researched this bug/feature/...: ERROR_CANNOT_FIND_WND_CLASS is set from ring 0 (kernel mode) in win32k!NtUserCreateWindowEx.
The ring 0 win32k call stack at the point where ERROR_CANNOT_FIND_WND_CLASS (0x57F) is written to the user-space LastError (TEB+34h):
f818cb90 bf83e8ef win32k!UserSetLastError
f818cc6c bf834a37 win32k!xxxCreateWindowEx+0xf3
f818cd20 8054162c win32k!NtUserCreateWindowEx+0x1c1
f818cd20 7c91e4f4 nt!KiFastCallEntry+0xfc
ERROR_CANNOT_FIND_WND_CLASS is set because win32k!UserFindAtom fails:
win32k!xxxCreateWindowEx:
...
bf81b621 57 push edi
bf83de96 e81eecfcff call win32k!UserFindAtom (bf80cab9)
bf83de9b 6685c0 test ax,ax
bf83de9e 8945e4 mov dword ptr [ebp-1Ch],eax
bf83dea1 0f84d9c40c00 je win32k!xxxCreateWindowEx+0x1d8 (bf90a380)
...
bf90a380 687f050000 push 57Fh
bf90a385 e96045f3ff jmp win32k!xxxCreateWindowEx+0xee (bf83e8ea)
...
bf83e8ea e89c4efcff call win32k!UserSetLastError (bf80378b)
win32k!UserFindAtom internally calls RtlLookupAtomInAtomTable, and the lookup fails (class not found):
win32k!UserFindAtom:
bf80cab9 8bff mov edi,edi
bf80cabb 55 push ebp
bf80cabc 8bec mov ebp,esp
bf80cabe 51 push ecx
bf80cabf 8365fc00 and dword ptr [ebp-4],0
bf80cac3 8d45fc lea eax,[ebp-4]
bf80cac6 50 push eax
bf80cac7 ff7508 push dword ptr [ebp+8]
bf80caca ff35f8969abf push dword ptr [win32k!UserAtomTableHandle (bf9a96f8)]
bf80cad0 ff15acd098bf call dword ptr [win32k!_imp__RtlLookupAtomInAtomTable (bf98d0ac)]
bf80cad6 85c0 test eax,eax
bf80cad8 7c08 jl win32k!UserFindAtom+0x21 (bf80cae2)
bf80cada 668b45fc mov ax,word ptr [ebp-4]
bf80cade c9 leave
bf80cadf c20400 ret 4
bf80cae2 50 push eax
bf80cae3 e8c8950200 call win32k!SetLastNtError (bf8360b0)
bf80cae8 ebf0 jmp win32k!UserFindAtom+0x27 (bf80cada)
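For readers who prefer C to disassembly, the two listings above can be modeled roughly like this. It is only a user-mode sketch of the control flow, not win32k code: every `model_*` name, the tiny atom table, and `MODEL_STATUS_NOT_FOUND` are hypothetical stand-ins I made up for illustration. The only values taken from the post are the error code 0x57F and the order of the checks.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* User-mode model only; all model_* names are hypothetical. */
#define ERROR_CANNOT_FIND_WND_CLASS 0x57Fu   /* value shown in the listing */
#define MODEL_STATUS_NOT_FOUND      (-1)     /* any negative status takes the jl branch */
#define MODEL_MAX_ATOMS             8

typedef uint16_t atom_t;

const char *model_atom_names[MODEL_MAX_ATOMS];
int         model_atom_count;
uint32_t    model_last_error;      /* stands in for the LastError slot in the TEB */
int32_t     model_last_nt_error;   /* what SetLastNtError would be handed */

/* Adds a name to the model table (a registered class would end up here). */
atom_t model_add_atom(const char *name) {
    assert(model_atom_count < MODEL_MAX_ATOMS);
    model_atom_names[model_atom_count] = name;
    return (atom_t)(0xC000 + model_atom_count++);
}

/* Stand-in for RtlLookupAtomInAtomTable: negative status on a miss. */
int32_t model_lookup_atom(const char *name, atom_t *out) {
    for (int i = 0; i < model_atom_count; i++) {
        if (strcmp(model_atom_names[i], name) == 0) {
            *out = (atom_t)(0xC000 + i);
            return 0;
        }
    }
    return MODEL_STATUS_NOT_FOUND;
}

/* Mirrors UserFindAtom: the local starts at 0 and is returned either way. */
atom_t model_user_find_atom(const char *name) {
    atom_t atom = 0;                             /* and dword ptr [ebp-4],0 */
    int32_t status = model_lookup_atom(name, &atom);
    if (status < 0)                              /* jl UserFindAtom+0x21 */
        model_last_nt_error = status;            /* call SetLastNtError */
    return atom;                                 /* mov ax,word ptr [ebp-4] */
}

/* Mirrors the xxxCreateWindowEx fragment: atom 0 leads to push 57Fh. */
int model_create_window_by_atom(const char *class_name) {
    if (model_user_find_atom(class_name) == 0) { /* test ax,ax / je */
        model_last_error = ERROR_CANNOT_FIND_WND_CLASS;
        return 0;
    }
    return 1;
}
```

Calling `model_create_window_by_atom("Edit")` on an empty table returns 0 and leaves 0x57F in `model_last_error`; after `model_add_atom("Edit")` the same call returns 1. The class name is just an example string.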
... And the why? In a few days :-).
HAPPY NEW YEAR 2009, Dreg.
|
http://www.apihooks.com/EliCZ/bugs/USERBug.zip |
QueueUserAPC() issues:
On XP and earlier (XP-) there was a chance to queue a user APC even before the LdrInitializeThunk APC (I can reproduce it on my old PC), so you ended up in an "empty or truly_not_initialized" process. On XP and later (XP+) one of the NtAPC parameters can be pActivationContext (a local pointer), so calling QueueUserAPC() on a thread of another process is not a good idea; NtQueueApcThread should be used instead.
|
Great, thanks for your expert info.
I tested it and it worked fine on both XP SP2 and Vista (both 32-bit only), but I didn't look deep enough.
I should have put breakpoints on LdrInitializeThunk(), etc., to verify the right sequence.
Will try "NtQueueApcThread()" here. |
My intention is to clarify exactly why it fails. I know other methods that work well (such as the one used in the latest versions of Detours). What I am trying to pin down in this post is the exact problem. I want to know whether any trick makes CreateRemoteThread work reliably with GUI apps.
I've drawn several conclusions with WinDBG. I'm making a list of differences such as:
- With CreateRemoteThread:
.- Main Thread -> TEB -> Remote Win32ThreadInfo == 0.
.- RemoteThread -> TEB -> Remote Win32ThreadInfo == VALID POINTER.
and others, such as the (deeper) problem with the LPC registration...
I also debugged the initialization of user32.dll, win32k, gdi32.dll ...
Sincerely, Dreg. |
Sirmabus:
Put bp on KiUserApcDispatcher,
call CreateThread() or CreateProcess() from a lower-priority thread and call (Nt)QueueUserAPC(hNewThread) from a higher-priority thread (it may give "better" results when the threads run on different processors; I remember it helped to have a very time-consuming thread in the background, e.g. one searching files for some string).
Try it several times.
If you "succeed" (thanks to the time window between thread creation and QueueApc(LdrInitializeThunk)), your APC is queued before LdrInitializeThunk and arrives before it; you will see it on the KiUserApcDispatcher bp.
At first I solved it with WaitFor or Sleep, later with better "remote" code that checks PEB.Ldr and, if it is NULL, queues itself again and returns, so that LdrInitializeThunk is executed before my APC:
"
Scout_StandardProcessing:
test byte ptr [ecx._REX_EB.Scout_BaseFlags], AH6X_REXE_FL_POSTINIT
mov eax, ecx
je Scout_CanProcess
mov edx, fs:[30h]
cmp dword ptr [edx +12], 0
jne Scout_CanProcess
call Scout_SetPostInit
Scout_ApcQuit:
retn 12
Scout_SetPostInit:
push eax
push eax
mov ecx, [eax._REX_EB.Scout_pNtQueueApcThread]
mov edx, [ecx]
push eax
add edx, [eax._REX_EB.Scout_NtDllBase]
push eax
push ASM_hCurrentThread
call edx
retn
"
Dreg:
http://groups.google.com/group/microsoft.public.win32.programmer.kernel/browse_thread/thread/1a2ec9ccd98d3979/548ed4dbc9970994
"The particular problems observed by Prasad are due to window classes being
registered by DllMain routines in response to DLL_PROCESS_ATTACH which are
then discarded by win32k when the count of active win32k client threads falls
to zero. Code that expects the classes to exist often fails in unpredictable
ways."
And there were more such discussion threads in 2000.
Another thing I remember from the times before USERbug: Reed Mangino from NuMega pointed out that MS Outlook stores specific data in the TLS of the 1st thread that comes along. If that is the remote thread and it terminates, the data is lost.
In my opinion it was bad MS Outlook design (the 1st thread is not special).
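Coming back to the PEB.Ldr check from the Scout listing earlier in this post: the ordering idea can be modeled in portable C as below. This is a simulation, not injection code. The FIFO queue, the `model_*` names and the counters are hypothetical and exist only for illustration, and the POSTINIT flag test from the listing is left out. It assumes APCs are delivered in the order they were queued.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_APC 8

struct model_process;
typedef void (*model_apc_fn)(struct model_process *);

/* Hypothetical model of one thread's pending user APCs plus PEB.Ldr. */
struct model_process {
    void        *ldr;                   /* stands in for PEB.Ldr (PEB+0Ch on x86) */
    model_apc_fn queue[MODEL_MAX_APC];  /* FIFO of pending APCs */
    int          head, tail;
    int          payload_runs;          /* how many times the real payload ran */
    int          requeues;              /* how many times the scout deferred itself */
};

void model_queue_apc(struct model_process *p, model_apc_fn fn) {
    assert(p->tail < MODEL_MAX_APC);
    p->queue[p->tail++] = fn;
}

/* Stand-in for the loader initialization APC: afterwards PEB.Ldr is non-NULL. */
void model_ldr_initialize(struct model_process *p) {
    static int loader_data;
    p->ldr = &loader_data;
}

/* The injected APC: if the loader has not run yet, queue itself again and return. */
void model_scout_apc(struct model_process *p) {
    if (p->ldr == NULL) {               /* cmp dword ptr [edx+12], 0 */
        p->requeues++;
        model_queue_apc(p, model_scout_apc);
        return;                         /* the loader APC now runs first */
    }
    p->payload_runs++;                  /* process is initialized, safe to proceed */
}

/* Delivers pending APCs in FIFO order. */
void model_drain(struct model_process *p) {
    while (p->head < p->tail)
        p->queue[p->head++](p);
}
```

If the scout is queued before the loader APC, draining the queue runs it twice: the first pass only requeues, the second runs the payload after `model_ldr_initialize`. If the loader APC is queued first, there is no requeue at all.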
|
Very useful information.
I am thinking of adding more technical details. In the Madshi.net forums I found something similar to the MS Outlook case:
http://forum.madshi.net/viewtopic.php?t=293
"
(1) You start Calc.exe with CREATE_SUSPENDED.
(2) You call CreateRemoteThread on Calc.
(3) All dlls are called with DLL_PROCESS_ATTACH in the context of your remote thread.
(4) Delphi dlls now do "MainThreadID := GetCurrentThreadID".
(5) Your remote thread ends.
(6) Delphi dlls try to contact the main thread e.g. by doing PostThreadMessage(MainThreadID, ...).
(7) This doesn't work, cause the thread has already terminated. Or maybe there's even a new thread (of another process) which now reuses the ID that your remote thread had.
" |
More information:
When the remote thread finishes, all registered classes in the process are destroyed.
Call stack (ring 0 down to ring 3) of the remote thread as it finishes:
win32k!DestroyProcessesClasses
win32k!xxxDestroyThreadInfo+0x21d
win32k!UserThreadCallout+0x4b
win32k!W32pThreadCallout+0x3d
nt!PspExitThread+0x3f3
nt!PspTerminateThreadByPointer+0x52
nt!NtTerminateThread+0x70
nt!KiFastCallEntry+0xf8
ntdll!KiFastSystemCallRet
ntdll!ZwTerminateThread+0xc
kernel32!ExitThread+0x8b
kernel32!BaseThreadStart+0x3c
This is why DoRemoteFail.exe (EliCZ's POC) sometimes fails.
The class is registered by the DllMain of CreateClass.DLL. This DLL is loaded by the Windows loader (it is an import of UseCreateClass.exe, which is created by DoRemoteFail.exe). Sometimes the remote thread finishes after the class has been registered, and that is when all registered classes in the process are destroyed by win32k!DestroyProcessesClasses.
I coded a POC which registers the class after the remote thread has finished, and CreateWindow always works.
All classes are destroyed by win32k!DestroyProcessesClasses because:
bf8bc411 e85ffaffff call win32k!FLastGuiThread (bf8bbe75)
bf8bc416 85c0 test eax,eax
bf8bc418 7424 je win32k!xxxDestroyThreadInfo+0x22c (bf8bc43e)
bf8bc41a 8b462c mov eax,dword ptr [esi+2Ch]
bf8bc41d f6400b01 test byte ptr [eax+0Bh],1
bf8bc421 0f85acfdffff jne win32k!xxxDestroyThreadInfo+0x1fe (bf8bc1d3)
bf8bc427 ff762c push dword ptr [esi+2Ch]
bf8bc42a e87b430000 call win32k!DestroyProcessesClasses (bf8c07aa)
Neither the je nor the jne is taken in this scenario, so DestroyProcessesClasses is executed.
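A simplified way to picture the fragment above is a per-process count of GUI threads: when the last one exits, every registered class goes away. The C below is only a model of that idea under my own assumptions. The struct, the `model_*` names and the class name string are hypothetical, and the extra flag test (`test byte ptr [eax+0Bh],1`) is not modeled because I have not identified what it checks.

```c
#include <assert.h>
#include <string.h>

#define ERROR_CANNOT_FIND_WND_CLASS 0x57Fu
#define MODEL_MAX_CLASSES 8

/* Hypothetical per-process state; not a real win32k structure. */
struct model_gui_process {
    int         gui_threads;                 /* threads with win32k thread state */
    const char *classes[MODEL_MAX_CLASSES];  /* registered class names */
    int         class_count;
    unsigned    last_error;
};

/* A thread touches USER32/GDI32 for the first time. */
void model_thread_becomes_gui(struct model_gui_process *p) {
    p->gui_threads++;
}

void model_register_class(struct model_gui_process *p, const char *name) {
    assert(p->class_count < MODEL_MAX_CLASSES);
    p->classes[p->class_count++] = name;
}

/* Thread exit: the last GUI thread takes every class with it. */
void model_gui_thread_exit(struct model_gui_process *p) {
    assert(p->gui_threads > 0);
    int last = (p->gui_threads == 1);   /* stand-in for FLastGuiThread */
    p->gui_threads--;
    if (last)
        p->class_count = 0;             /* stand-in for DestroyProcessesClasses */
}

/* Fails with 0x57F when the class is no longer registered. */
int model_create_window_ex(struct model_gui_process *p, const char *name) {
    for (int i = 0; i < p->class_count; i++)
        if (strcmp(p->classes[i], name) == 0)
            return 1;
    p->last_error = ERROR_CANNOT_FIND_WND_CLASS;
    return 0;
}
```

In this model, a class registered while the remote thread is the only GUI thread is lost as soon as that thread exits, and a later `model_create_window_ex` fails with 0x57F. If a second GUI thread exists before the remote thread exits, the class survives.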
PS: also remember that when the primary thread calls RegisterClass before the remote thread finishes, the classes are destroyed by win32k!DestroyProcessesClasses (as in the previous scenario). One example of this problem: when you create a remote thread in notepad.exe (created SUSPENDED), notepad.exe sometimes fails and the window is not created.
I also looked at EliCZ's Remote2Primary POC. It creates a suspended remote thread, copies the CONTEXT of the main thread to the remote thread, and runs the remote thread (the POC also terminates the main thread). The problem is that I tried to code a POC which uses this technique but creates the remote thread without CREATE_SUSPENDED (if you cannot execute the payload before the real code, this method sometimes makes no sense). After a few hours of coding and testing, I still have no stable method for "executing a payload in a remote thread and then running that thread as the main thread without problems".
PS: Remember, if the remote thread is created with CREATE_SUSPENDED and you run the main thread before the remote one, everything works fine because the remote thread is not the first thread registered.
Sincerely, Dreg. |
|