Topic created on: December 1, 2005 19:37 CST by ryanlrussell.
I'm hoping that someone who is more used to reading ARM can help me out. I'm looking at this:
ROM:000001A4 sub_1A4 ; CODE XREF: sub_2A7C+3Cp
ROM:000001A4 ; program_flash+3Cp ...
ROM:000001A4
ROM:000001A4 var_10 = -0x10
ROM:000001A4 oldR11 = -0xC
ROM:000001A4 oldSP = -8
ROM:000001A4 oldLR = -4
ROM:000001A4
ROM:000001A4 MOV R12, SP
ROM:000001A8 STMFD SP!, {R11,R12,LR,PC}
ROM:000001AC SUB R11, R12, #4
ROM:000001B0 SUB SP, SP, #4
ROM:000001B4 STR R0, [R11,#var_10]
ROM:000001B8 MOV R0, #0
ROM:000001BC B return
ROM:000001C0
ROM:000001C0 return
ROM:000001C0 LDMDB R11, {R11,SP,PC}
ROM:000001C0 ; End of function sub_1A4
The way IDA seems to auto-analyze it and name things, the naive reading says that it takes an arg in R0, sticks it in a stack variable, and then returns 0. That doesn't make a lot of sense to me, since that would make it a NOP function. I suspect that it is taking a function address in R0, and branching to it? If so, I'm not sure why the calling function wouldn't just do a BL directly, but I've seen compilers do sillier things.
Anyway, I'm hoping someone who reads ARM in their sleep can spell it out for me, thanks.
If anyone is curious, it's for some discontinued WAP that someone is porting uClinux to. He has the kernel running and is trying to grok the flash erase/program code, so I'm trying to help reverse it.
you are correct, it does nothing.
|
I'd say it is a function returning zero:
int zero(int arg) { return 0; }
|
So, it exists to make me look bad and waste my time? That's my wife's job!
Thanks for the replies. So, I spent a bunch of time trying to figure out why I couldn't figure it out. I had to make myself this manual trace:
stack initially = 0x1000
0x1000 ????
ROM:000001A4 MOV R12, SP                    R12 = 0x1000
ROM:000001A8 STMFD SP!, {R11,R12,LR,PC}     SP = 0x0FF0
STMFD with "!" means store multiple, full descending: decrement before each store, on a full stack, and update SP when done.
If SP starts at 0x1000, decrement it first to 0x0FFC, write out the list of registers
in "descending" order (toward smaller memory addresses, highest-numbered register at the highest address), and set SP to 0x0FF0 when done.
stack now:
0x1000 ????
0x0FFC PC
0x0FF8 LR
0x0FF4 R12 (SP)
0x0FF0 R11
ROM:000001AC SUB R11, R12, #4               R11 = 0x0FFC
ROM:000001B0 SUB SP, SP, #4                 SP = 0x0FEC
stack now:
0x1000 ????
0x0FFC PC
0x0FF8 LR
0x0FF4 R12 (SP)
0x0FF0 R11
0x0FEC ????
ROM:000001B4 STR R0, [R11,#-0x10]           Shove arg0 into 0x0FEC
stack now:
0x1000 ????
0x0FFC PC
0x0FF8 LR
0x0FF4 R12 (SP)
0x0FF0 R11
0x0FEC arg0
ROM:000001B8 MOV R0, #0                     R0 = 0
ROM:000001BC B return
ROM:000001C0
ROM:000001C0 return
ROM:000001C0 LDMDB R11, {R11,SP,PC}
LDMDB means load multiple, decrement before (the same thing as LDMEA, empty ascending). Starting from R11 = 0x0FFC it reads the three words at 0x0FF0 through 0x0FF8: R11 is restored, SP is restored to 0x1000, and PC gets the saved LR. The saved PC at 0x0FFC is never read back.
So in short, you two (and IDA itself) are all perfectly correct.
I'm mostly posting this reply in case some other poor sucker in the future decides to try and figure out how the ARM stack works. Well, it works about 32 ways, if you multiply out all the combinations you could do. I had to look at about a dozen web pages and two books before I started to get it.
The docs tend to refer to "down" without making it explicit that the ARM guys tend to write their memory addresses getting smaller in the down direction, for the stack. Which is just intuitively backwards if you're used to Intel.
My brain hurts now. How dare you make me learn, ARM!
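For anyone who would rather check a trace like this mechanically, here is a minimal sketch in C that steps through the instructions of sub_1A4 against a toy stack. The `Cpu` struct, `slot()`, `run_sub_1A4()`, the 0x0F00..0x1000 stack window and every register value are made up for illustration; only the instruction sequence comes from the listing. The value stored for PC is a placeholder, since the function never reads that slot back.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: a handful of 32-bit registers and a small stack window.
   BASE/TOP are arbitrary illustration values, not from the firmware. */
enum { BASE = 0x0F00, TOP = 0x1000, WORDS = (TOP - BASE) / 4 };

typedef struct {
    uint32_t r0, r11, r12, sp, lr, pc;
    uint32_t mem[WORDS];                 /* addresses BASE .. TOP-4 */
} Cpu;

/* Returns a pointer to the word stored at a (word-aligned) stack address. */
static uint32_t *slot(Cpu *c, uint32_t addr) {
    assert(addr >= BASE && addr < TOP && addr % 4 == 0);
    return &c->mem[(addr - BASE) / 4];
}

/* Runs sub_1A4 and returns the address of the var_10 spill slot. */
static uint32_t run_sub_1A4(Cpu *c) {
    c->r12 = c->sp;                      /* MOV   R12, SP               */

    c->sp -= 16;                         /* STMFD SP!, {R11,R12,LR,PC}  */
    *slot(c, c->sp + 0)  = c->r11;       /* lowest register, lowest addr */
    *slot(c, c->sp + 4)  = c->r12;
    *slot(c, c->sp + 8)  = c->lr;
    *slot(c, c->sp + 12) = c->pc;        /* placeholder value           */

    c->r11 = c->r12 - 4;                 /* SUB   R11, R12, #4          */
    c->sp -= 4;                          /* SUB   SP, SP, #4            */

    uint32_t var_10 = c->r11 - 0x10;     /* STR   R0, [R11,#-0x10]      */
    *slot(c, var_10) = c->r0;

    c->r0 = 0;                           /* MOV   R0, #0                */

    uint32_t base = c->r11 - 12;         /* LDMDB R11, {R11,SP,PC}      */
    c->r11 = *slot(c, base + 0);
    c->sp  = *slot(c, base + 4);
    c->pc  = *slot(c, base + 8);         /* saved LR becomes the new PC */
    return var_10;
}
```

Starting with SP = 0x1000, the argument is spilled to 0x1000 - 0x14 = 0x0FEC, R0 comes back as 0, and R11, SP and PC end up holding the caller's R11, the entry SP and LR respectively.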
|
another arm stack example: showing how varargs work.
printf
00: MOV R12, SP
04: STMFD SP!, {R0-R3}
08: STMFD SP!, {R12,LR}
0c: ADD R1, SP, #0xC
10: STR R0, [SP,#8]
14: BL vprintf
18: LDMFD SP, {SP,PC}
this is what the stack looks like:
---SP.00
R3.00
R2.00
R1.00 <-- R1.10
R0.00 <-- SP.08
LR.00
R12.04 == SP.00 <-- SP.0c
<regname>.<offset> = value of <regname> before executing line <offset>
note that the store at offset 10 is pointless, that stack location already contains R0
in C it looks like this:
int printf(const char *fmt, ...) {
    va_list ap;
    va_start(ap, fmt);
    int ret = vprintf(fmt, ap);   /* R0 from vprintf is returned as-is */
    va_end(ap);
    return ret;
}
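To try the same forwarding pattern without redefining printf, here is a minimal sketch that formats into a buffer instead. `format_into` is a hypothetical helper name; `vsnprintf` is the standard C99 function, used here only because its output is easy to inspect.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper: same shape as the printf above, but it forwards
   the variable arguments to vsnprintf so the result lands in buf. */
static int format_into(char *buf, size_t size, const char *fmt, ...) {
    va_list ap;
    va_start(ap, fmt);                     /* ap now refers to the args after fmt */
    int written = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return written;                        /* characters that were (or would be) written */
}
```

For example, `format_into(buf, sizeof buf, "%d-%s", 7, "ok")` leaves "7-ok" in `buf`. On ARM, a wrapper like this would typically start the same way as the listing: spill R0-R3 next to any stack-passed arguments so that the va_list can walk them as one contiguous block.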
|
this one shows that when passing more than 4 parameters,
the stack locations are reused, and change meaning for the different calls that use more than 4 parameters
STMFD SP!, {R5-R6,LR}
SUB SP, SP, #0x10
MOV R0, #0x80
MOV R1, #3
STR R0, [SP,#0x20+var_1C] ; dwFlagsAndAttributes
STR R1, [SP,#0x20+var_20] ; dwCreationDisposition
MOV R6, #0
LDR R0, =aBag0 ; lpFileName
MOV R3, #0 ; lpSecurityAttributes
MOV R2, #0 ; dwShareMode
STR R6, [SP,#0x20+var_18] ; hTemplateFile
MOV R1, #0 ; dwDesiredAccess
BL CreateFileW
MOV R5, R0
STR R6, [SP,#0x20+var_14] ; lpOverlapped
STR R6, [SP,#0x20+var_18] ; lpBytesReturned
MOV R1, #2 ; dwIoControlCode
STR R6, [SP,#0x20+var_1C] ; nOutBufSize
MOV R3, #0 ; nInBufSize
STR R6, [SP,#0x20+var_20] ; lpOutBuf
MOV R2, #0 ; lpInBuf
MOV R0, R5 ; hDevice
BL DeviceIoControl
for convenience i usually name the stack locations used for passing parameters arg4, arg5, .. etc
willem
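The rule behind that listing can be written down as a small C sketch: the first four word-sized arguments travel in R0-R3, and argument N (counting from 0) for N >= 4 sits at [SP + 4*(N-4)] at the moment of the BL. This assumes every argument is a single 32-bit word, which appears to hold for both calls shown; `ArgLocation` and `arg_location` are names invented for this illustration.

```c
#include <assert.h>

/* Where a 0-based argument lives at the call site, assuming all
   arguments are single 32-bit words (no 64-bit values or structs). */
typedef struct {
    int in_register;   /* 1: passed in R<reg>; 0: passed on the stack  */
    int reg;           /* register number, or -1 when on the stack     */
    int sp_offset;     /* byte offset from SP, or -1 when in a register */
} ArgLocation;

static ArgLocation arg_location(int index) {
    assert(index >= 0);
    ArgLocation loc = { 0, -1, -1 };
    if (index < 4) {
        loc.in_register = 1;
        loc.reg = index;                  /* args 0..3 -> R0..R3 */
    } else {
        loc.sp_offset = (index - 4) * 4;  /* arg4 -> [SP], arg5 -> [SP,#4], ... */
    }
    return loc;
}
```

With this numbering, hTemplateFile (argument 6 of CreateFileW) and lpBytesReturned (argument 6 of DeviceIoControl) both map to [SP,#8], which is why the same var_18 slot carries two different meanings in the listing.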
|
Well, it's been some time. I have a problem in a similar context so I just post my question here. I hope it is okay.
Please, have a look at this small function. I can't get my head around what it does. Does this ring a bell for someone?
Isn't line4 completely useless as R11 is set again in line5?
Isn't the entire function useless? :)
And please bear with me if this seems trivial. I'm under pressure and my brain is aching.
; int __cdecl mysub(int, int param_R0, int param_R1, int param_R2, int param_R3)
mysub
oldR11 = -0xC
oldSP = -8
oldLR = -4
param_R0 = 4
param_R1 = 8
param_R2 = 0xC
param_R3 = 0x10
[line1] MOV R12, SP
[line2] STMFD SP!, {R0-R3}
[line3] STMFD SP!, {R11,R12,LR,PC}
[line4] SUB R11, R12, #0x14
[line5] LDMFD SP, {R11,SP,PC}
; End of function mysub
Any comments are very welcome :)
|
Yes, it's just an unoptimized nullsub. lines 1-4 are function prolog (setting up stack frame) and 5 is the epilog.
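A quick way to convince yourself is to step through the five lines against a toy stack. The C sketch below does that; `Regs`, `word_at`, `run_mysub`, the stack window below 0x1000 and all register values are made up for illustration, and the stored PC is just a placeholder.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint32_t r[4], r11, r12, sp, lr, pc; } Regs;

/* Toy stack: 64 words just below STACK_TOP (arbitrary illustration value). */
enum { STACK_TOP = 0x1000, STACK_WORDS = 64 };
static uint32_t stack_words[STACK_WORDS];

static uint32_t *word_at(uint32_t addr) {
    assert(addr % 4 == 0 && addr < STACK_TOP &&
           addr >= STACK_TOP - 4 * STACK_WORDS);
    return &stack_words[(STACK_TOP - addr) / 4 - 1];
}

static void run_mysub(Regs *c) {
    c->r12 = c->sp;                          /* line1: MOV R12, SP              */

    c->sp -= 16;                             /* line2: STMFD SP!, {R0-R3}       */
    for (int i = 0; i < 4; i++)
        *word_at(c->sp + 4 * i) = c->r[i];

    c->sp -= 16;                             /* line3: STMFD SP!, {R11,R12,LR,PC} */
    *word_at(c->sp + 0)  = c->r11;
    *word_at(c->sp + 4)  = c->r12;
    *word_at(c->sp + 8)  = c->lr;
    *word_at(c->sp + 12) = c->pc;            /* placeholder value               */

    c->r11 = c->r12 - 0x14;                  /* line4: R11 -> saved PC slot     */

    uint32_t base = c->sp;                   /* line5: LDMFD SP, {R11,SP,PC}    */
    c->r11 = *word_at(base + 0);             /* overwrites line4's result       */
    c->sp  = *word_at(base + 4);             /* entry SP, drops the R0-R3 copy too */
    c->pc  = *word_at(base + 8);             /* saved LR                        */
}
```

After the call, R11 and SP are back at their entry values and PC holds LR, so line4 has no lasting effect. R0 is never written, so the caller sees its own first argument as the "return value".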
|
This reminds me of reverse-engineering MIPS code.
Most horrible architectural fluke ever: first instruction after a branch is actually executed!
|
> AlexIonescu: This reminds me of reverse-engineering MIPS code.
>
> Most horrible architectural fluke ever: first instruction after a branch is actually executed!
The point of a processor architecture is not to please ASM coders,
but to be efficient. And the branch delay slot of MIPS, SPARC and
PA-RISC processors avoids stalling the pipeline when a branch
is taken, which makes them more efficient.
|
As an exception to the rule, and to add confusion, MIPS also has branch likely instructions, which either:
a) only execute the instruction in the delay slot if it's branching; or
b) only execute the instruction in the delay slot if it's not branching.
I can never remember which, so I have to look it up.
mspath is correct: the delay slots avoid pipeline stalls and make the CPU architecture simpler. The onus is on the compiler to make smart choices.
I'm not sure, but I've heard something about x86 cores actually implementing the x86 CISC ISA with RISC micro-ops.
Either way, from my limited understanding, despite RISC being a simpler architecture that in theory should be able to reach higher clock speeds, the x86 CISC chips have been able to surpass them, at least in the case of MIPS and PowerPC vs x86.
|
Actually, if I recall correctly, with a Branch Likely instruction the instruction in the delay slot is always executed, but the write-back only occurs if it's branching; if it's not branching, the write-back is invalidated.
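As far as the architecturally visible result goes, the branch likely instructions (BEQL and friends, MIPS II and later) behave like option (a): the delay slot instruction takes effect only when the branch is taken and is nullified otherwise, while ordinary branches always execute it. Here is a minimal C sketch of that rule. `BranchKind`, `delay_slot_executes` and `v0_after_branch` are names made up for this illustration, and the modelled pair is a hypothetical `beq`/`beql a, b, target` followed by `addiu v0, v0, 1` in the delay slot.

```c
#include <stdbool.h>

typedef enum { BRANCH_ORDINARY, BRANCH_LIKELY } BranchKind;

/* True when the delay-slot instruction has a visible effect. */
static bool delay_slot_executes(BranchKind kind, bool taken) {
    if (kind == BRANCH_ORDINARY)
        return true;      /* ordinary branch: delay slot always runs      */
    return taken;         /* branch likely: nullified when not taken      */
}

/* Models "beq/beql a, b, target" with "addiu v0, v0, 1" in the delay
   slot and returns v0 as seen by the next instruction that runs. */
static int v0_after_branch(BranchKind kind, int a, int b, int v0) {
    bool taken = (a == b);
    if (delay_slot_executes(kind, taken))
        v0 += 1;
    return v0;
}
```

So with v0 = 0 and a != b, the ordinary `beq` version leaves v0 = 1 while the `beql` version leaves v0 = 0. Whether a given core gets there by skipping the instruction or by executing it and discarding the write-back is an implementation detail that this model does not distinguish.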
|