Lab 2: ARM Instructions

Source: Internet. Published by: 西交大网络学校. Editor: 程序博客网. Time: 2024/05/19 00:14

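I. Equipment Used

One Raspberry Pi, one USB-TTL serial cable (PL2303 chip), one Ethernet cable, one 8 GB SD card, and one PC running Windows 7.

II. Connection Diagram

III. Physical Connection Photo

IV. Experimental Procedure

1. Generating ARM and Thumb instructions with the cross-compilation environment

In the Ubuntu environment, write the following C program, which computes a multiplication:

Then compile it with the cross-compilation toolchain (no extra options):

This produces an a.out file in the lab2 directory. Copy the file from the virtual machine to the Windows host, then send it to the Raspberry Pi with FileZilla Client (the Pi is already connected to the PC through HyperTerminal). On the Raspberry Pi, change to the corresponding directory, debug a.out with gdb, and enter disassemble main. The disassembly is as follows:

The generated assembly is explained below: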

0x000083b0 <+0>: push {r11} ; (str r11, [sp,#-4]!) save fp

0x000083b4 <+4>: add r11, sp, #0 ;set r11 (which is just fp) = sp

0x000083b8 <+8>: sub sp, sp, #12

0x000083bc <+12>: mov r3, #10 ;set r3=10, which is the first operand

0x000083c0 <+16>: str r3, [r11, #-12] ;store r3 to memory

0x000083c4 <+20>: mov r3, #2 ;set r3=2, which is the second operand

0x000083c8 <+24>: str r3, [r11, #-8] ;store r3 to memory

0x000083cc <+28>: ldr r2, [r11, #-12] ;get value 10

0x000083d0 <+32>: ldr r1, [r11, #-8] ;get value 2

0x000083d4 <+36>: mul r3, r1, r2 ;2*10=20

0x000083d8 <+40>: mov r0, r3 ;get final result

0x000083dc <+44>: add sp, r11, #0 ;set sp=fp

0x000083e0 <+48>: ldmfd sp!, {r11} ;get back the value to r11

0x000083e4 <+52>: bx lr ;return

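We can see that the PC differs by 4 between every two adjacent instructions, so each instruction is 4 bytes long; this is therefore the ARM instruction set.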
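The C source itself is not reproduced in this write-up. A minimal sketch that would be consistent with the ARM listing above (two locals holding 10 and 2, one mul, result returned in r0) is given below. The function name multiply_locals and the variable names are assumptions made for illustration; in the lab the body sits directly in main.

```c
/* Hypothetical reconstruction, inferred from the disassembly only. */
int multiply_locals(void)
{
    int a = 10;   /* stored at [r11, #-12] in the ARM listing */
    int b = 2;    /* stored at [r11, #-8] in the ARM listing */
    return a * b; /* mul r3, r1, r2, then mov r0, r3 */
}
```

Because no optimization option is given, the compiler keeps both locals on the stack and reloads them before multiplying, which is why the listing contains the str/ldr pairs.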

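Next, return to the Ubuntu environment and cross-compile the same test.c file again, this time adding the -mthumb option, as shown below:

Before transferring the new build, rename the a.out file already on the Raspberry Pi to arm.out:

Then, in the same way as before, transfer the newly compiled a.out from Ubuntu to the lab2 directory on the Raspberry Pi:

Debug it with gdb to obtain the following disassembly:

The generated assembly is explained below: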

0x000083b0 <+0>: push {r7, lr} ;save the value of r7 (fp) and lr

0x000083b2 <+2>: sub sp, #8 ;set sp=sp-8

0x000083b4 <+4>: add r7, sp, #0 ;set r7=sp

0x000083b6 <+6>: movs r3, #10 ;set r3=10, which is the first operand

0x000083b8 <+8>: str r3, [r7, #0] ;store r3 to memory

0x000083ba <+10>: movs r3, #2 ;set r3=2, which is the second operand

0x000083bc <+12>: str r3, [r7, #4] ;store r3 to memory

0x000083be <+14>: ldr r2, [r7, #0] ;get value 10

0x000083c0 <+16>: ldr r1, [r7, #4] ;get value 2

0x000083c2 <+18>: adds r3, r1, #0 ;set r3=r1=2

0x000083c4 <+20>: muls r3, r2 ;r3=r3*r2=2*10=20

0x000083c6 <+22>: adds r0, r3, #0 ;get final result

0x000083c8 <+24>: mov sp, r7 ;set sp=r7

0x000083ca <+26>: add sp, #8 ;restore sp

0x000083cc <+28>: pop {r7} ;restore r7(fp)

0x000083ce <+30>: pop {r1} ;get return address

0x000083d0 <+32>: bx r1 ;return

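We can see that the addresses of adjacent instructions differ by 2, so each instruction is 2 bytes long; these are indeed Thumb instructions.

For the same program, the ARM and Thumb builds differ as follows:

First, Thumb instructions are 16 bits long, half the length of ARM instructions, but more of them are needed: main takes 14 instructions in ARM and 17 in Thumb.

In addition, the Thumb code cannot use a complex instruction such as ldmfd and relies on push and pop instead. Even so, the final code generated from Thumb instructions is smaller than that generated from ARM instructions: here main occupies 17 × 2 = 34 bytes instead of 14 × 4 = 56 bytes.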

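2. Can conditionally executed instructions be generated for ARM code?

Write the following C program directly on the Raspberry Pi:

It adds 1 to the larger of the two input numbers (to the second one if they are equal) and then prints both values.

Compile it with the -O2 optimization option:

The disassembly is as follows:

The instructions at +44 and +48 are both conditionally executed, as are the ones that follow them through +60.

The generated assembly is explained below: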

0x00010348 <+0>: push {lr} ; (str lr, [sp,#-4]!) save the return address

0x0001034c <+4>: sub sp, sp, #12 ;set sp=sp-12

0x00010350 <+8>: mov r1, sp ;the value of a will be saved at the address in sp

0x00010354 <+12>: ldr r0, [pc, #60] ; 0x10398 <main+80>

0x00010358 <+16>: bl 0x10330 ;read the value of a from input; it is stored at the memory address held in r1

0x0001035c <+20>: add r1, sp, #4 ;the value of b will be saved at sp+4

0x00010360 <+24>: ldr r0, [pc, #48] ; 0x10398 <main+80>

0x00010364 <+28>: bl 0x10330 ;read the value of b from input

0x00010368 <+32>: ldm sp, {r1, r2} ;load a and b into r1 and r2

0x0001036c <+36>: ldr r0, [pc, #40] ; 0x1039c <main+84>

0x00010370 <+40>: cmp r1, r2 ;compare a and b

0x00010374 <+44>: addle r3, r2, #1 ;if a<=b, r3=b+1

0x00010378 <+48>: addgt r1, r1, #1 ;if a>b, a++

0x0001037c <+52>: strgt r1, [sp] ;if a>b, store the new value of a back to its memory address

0x00010380 <+56>: movle r2, r3 ;if a<=b, r2 takes the value b+1

0x00010384 <+60>: strle r3, [sp, #4] ;if a<=b, store the new value of b back to its memory address

0x00010388 <+64>: bl 0x1030c ;go to print; a and b are passed in r1 and r2 (copies are also at sp and sp+4)

0x0001038c <+68>: mov r0, #0 ;set return value to 0

0x00010390 <+72>: add sp, sp, #12 ;restore sp

0x00010394 <+76>: pop {pc} ; (ldr pc, [sp],#4) get the return address

0x00010398 <+80>: andeq r0, r1, r12, lsr r5

0x0001039c <+84>: andeq r0, r1, r0, asr #10
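The if/else that the compiler turned into conditionally executed instructions can be sketched in C as follows. This is a reconstruction from the listing, not the original source; the helper name bump_larger and the pointer-based signature are assumptions chosen so the logic can be checked in isolation.

```c
/* Hypothetical helper mirroring the branch-free code in the listing. */
void bump_larger(int *a, int *b)
{
    if (*a > *b) {
        *a += 1; /* addgt / strgt path */
    } else {
        *b += 1; /* addle / movle / strle path, also taken when a == b */
    }
}
```

With -O2 both arms are short enough that the compiler predicates them on the flags set by a single cmp instead of emitting branches.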

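3. Design a C scenario and observe whether register-shift addressing is generated

Write the following C code on the Raspberry Pi:

Again compile with -O2:

The resulting disassembly:

It is easy to see that the instruction at +56 is add r1, r1, r3, asr #2, which uses register-shift addressing. It first shifts r3 right arithmetically by 2 bits (dividing by 4), then adds the result to r1 and writes the sum back to r1.

The generated assembly is explained below: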

0x00010348 <+0>: push {r4, lr} ;save r4 and the return address

0x0001034c <+4>: sub sp, sp, #8 ;set sp=sp-8

0x00010350 <+8>: ldr r4, [pc, #60] ; 0x10394 <main+76>

0x00010354 <+12>: mov r1, sp ;the value of a will be saved at the address in sp

0x00010358 <+16>: mov r0, r4

0x0001035c <+20>: bl 0x10330 ;read the value of a from input

0x00010360 <+24>: add r1, sp, #4 ;the value of b will be saved at sp+4

0x00010364 <+28>: mov r0, r4

0x00010368 <+32>: bl 0x10330 ;read the value of b from input

0x0001036c <+36>: ldm sp, {r1, r3} ;load a and b into r1 and r3

0x00010370 <+40>: ldr r0, [pc, #32] ; 0x10398 <main+80>

0x00010374 <+44>: add r2, r3, #3 ;set r2=r3+3 (b+3)

0x00010378 <+48>: cmp r3, #0 ;check whether b<0

0x0001037c <+52>: movlt r3, r2 ;if b<0, use b+3 so that the shift rounds toward zero

0x00010380 <+56>: add r1, r1, r3, asr #2 ;register-shift addressing: computes a+b/4

0x00010384 <+60>: bl 0x1030c ;go to print

0x00010388 <+64>: mov r0, #0 ;set return value to 0

0x0001038c <+68>: add sp, sp, #8 ;restore sp

0x00010390 <+72>: pop {r4, pc} ;restore r4 and return

0x00010394 <+76>: andeq r0, r1, r8, lsr r5

0x00010398 <+80>: andeq r0, r1, r12, lsr r5
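The add/cmp/movlt sequence before the shifted add is how the compiler makes a shift behave like C's signed division, which rounds toward zero. A minimal sketch of the same computation is shown below. The function names are hypothetical, and the code assumes that >> on a negative int is an arithmetic shift, which is implementation-defined in C but is what GCC does on ARM.

```c
/* Sketch of what add #3 / cmp / movlt / asr #2 compute together. */
int div4_round_to_zero(int b)
{
    int biased = (b < 0) ? b + 3 : b; /* add r2, r3, #3 ; cmp r3, #0 ; movlt r3, r2 */
    return biased >> 2;               /* asr #2 (assumed arithmetic shift) */
}

/* The full expression from the listing: a + b / 4. */
int a_plus_b_over_4(int a, int b)
{
    return a + div4_round_to_zero(b);
}
```

Without the bias, -7 >> 2 would give -2, whereas -7 / 4 in C is -1; adding 3 to negative values before shifting removes that difference.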

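4. Design a C scenario and observe how a complex 32-bit number is loaded into a register

Write the following C program on the Raspberry Pi:

Compile it without any optimization option:

The disassembly is as follows:

A simple 32-bit number such as 0x10000001 is assigned to r3 directly with immediate addressing (the instruction at +12). This works because an ARM data-processing immediate is an 8-bit value rotated right by an even number of bits, and 0x10000001 is 0x11 rotated right by 4. A complex 32-bit number such as 0xabcd1234 does not fit that form, so its value is first placed in memory (at the address of +56) and then loaded with an instruction like ldr r3, [pc, #28]. To verify this, the command x/2uh 0x10420 is used to inspect the data at main+56 directly; it shows 4660 (0x1234) and 43981 (0xabcd), which confirms the analysis. (The word at +56 is really the data 0xabcd1234, but gdb mistakes it for an instruction and disassembles it.)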
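Whether a constant counts as "simple" in this sense can be checked mechanically. The sketch below tests whether a 32-bit value is an 8-bit value rotated right by an even amount, which is the form an ARM (A32) data-processing immediate takes. The function names are illustrative and not part of any toolchain.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (n in 0..31). */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    return n == 0 ? v : (v << n) | (v >> (32 - n));
}

/* Returns 1 if v can be written as an 8-bit value rotated right by an
 * even amount, 0 otherwise. */
int arm_imm_encodable(uint32_t v)
{
    for (unsigned rot = 0; rot < 32; rot += 2) {
        /* Undo a rotate-right by rot; if only 8 bits remain, v fits. */
        if (rotl32(v, rot) <= 0xffu)
            return 1;
    }
    return 0;
}
```

Under this test 0x10000001 is encodable and 0xabcd1234 is not, which matches the mov and the pc-relative ldr seen in the disassembly.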

The generated assembly is explained below:

0x000103e8 <+0>: push {r11} ; (str r11, [sp,#-4]!)

0x000103ec <+4>: add r11, sp, #0 ;set r11=sp

0x000103f0 <+8>: sub sp, sp, #12 ;set sp=sp-12

0x000103f4 <+12>: mov r3, #268435457 ; 0x10000001: immediate addressing for a simple 32-bit integer

0x000103f8 <+16>: str r3, [r11, #-8] ;store value of a

0x000103fc <+20>: ldr r3, [pc, #28] ; 0x10420 <main+56>: a complex 32-bit integer has to be loaded from memory (base-plus-offset addressing relative to pc)

0x00010400 <+24>: str r3, [r11, #-12] ;store value of b

0x00010404 <+28>: ldr r2, [r11, #-8] ;load a to r2

0x00010408 <+32>: ldr r3, [r11, #-12] ;load b to r3

0x0001040c <+36>: add r3, r2, r3 ;r3=a+b

0x00010410 <+40>: mov r0, r3 ;set return value to r3 (a+b)

0x00010414 <+44>: sub sp, r11, #0

0x00010418 <+48>: pop {r11} ; (ldr r11, [sp],#4)

0x0001041c <+52>: bx lr

0x00010420 <+56>: blge 0xff354cf8 ;meaningless disassembly: this word is actually data, not an instruction

End of assembler dump.

(gdb) x/2uh 0x10420 ;to see the real value at 0x10420

0x10420 <main+56>: 4660 43981 ;43981=0xabcd, 4660=0x1234; combined, they form the value of b
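The two numbers in this check can be reproduced by hand. In ARM state the pc reads as the address of the current instruction plus 8, so ldr r3, [pc, #28] at 0x103fc refers to 0x103fc + 8 + 28 = 0x10420, and on a little-endian target the low halfword of the constant is printed first. A small sketch of both calculations, with hypothetical helper names:

```c
#include <stdint.h>

/* Address referenced by a pc-relative load in ARM state (pc = insn + 8). */
uint32_t pc_relative_target(uint32_t insn_addr, uint32_t offset)
{
    return insn_addr + 8u + offset;
}

/* Halfwords in the order gdb's x/2uh prints them on a little-endian target. */
uint16_t low_halfword(uint32_t word)
{
    return (uint16_t)(word & 0xffffu);
}

uint16_t high_halfword(uint32_t word)
{
    return (uint16_t)(word >> 16);
}
```

For the word 0xabcd1234 these give 4660 (0x1234) followed by 43981 (0xabcd), the same order as in the gdb output.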

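5. Write a C program with nested function calls, then observe and analyze it

In the C program written for this step, main calls f2, and f2 then calls f1.

The disassembly and the corresponding explanation are as follows:

Dump of assembler code for function f1: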

0x00010470 <+0>: push {r11} ; (str r11, [sp,#-4]!) save value of caller's fp

0x00010474 <+4>: add r11, sp, #0 ;set r11(fp)=sp

0x00010478 <+8>: sub sp, sp, #20 ;set sp=sp-20

0x0001047c <+12>: str r0, [r11, #-16] ;save argument a; the argument is passed in directly through r0

0x00010480 <+16>: ldr r3, [r11, #-16] ;r3=a

0x00010484 <+20>: ldr r2, [r11, #-16] ;r2=a

0x00010488 <+24>: mul r3, r2, r3 ;r3=r2*r3=a*a

0x0001048c <+28>: add r3, r3, #2 ;r3=a*a+2

0x00010490 <+32>: str r3, [r11, #-8] ;save r3

0x00010494 <+36>: ldr r3, [r11, #-8] ;load r3 back

0x00010498 <+40>: mov r0, r3 ;set r0=r3=a*a+2, which is the return value of f1()

0x0001049c <+44>: sub sp, r11, #0 ;set sp=r11, restoring the caller's sp

0x000104a0 <+48>: pop {r11} ; (ldr r11, [sp],#4) restore value of caller's fp

0x000104a4 <+52>: bx lr ;the return address is in lr, where the caller's bl placed it

End of assembler dump.

Dump of assembler code for function f2:

0x000104a8 <+0>: push {r11, lr} ;save the caller's fp and lr

0x000104ac <+4>: add r11, sp, #4 ;set r11(fp)=sp+4

0x000104b0 <+8>: sub sp, sp, #16 ;set sp=sp-16

0x000104b4 <+12>: str r0, [r11, #-16] ;save argument c; the argument is passed in directly through r0

0x000104b8 <+16>: ldr r0, [r11, #-16] ;get c back

0x000104bc <+20>: bl 0x10470 <f1> ;call f1(); before jumping, bl copies the address of the next instruction (the subroutine's return address) into lr

0x000104c0 <+24>: mov r2, r0 ;set r2=r0, which is the return value of f1()

0x000104c4 <+28>: mov r3, r2 ;set r3=r2=f1(c)

0x000104c8 <+32>: lsl r3, r3, #2 ;r3=4*f1(c)

0x000104cc <+36>: add r3, r3, r2 ;r3=r3+r2=4*f1(c)+f1(c)=5*f1(c)

0x000104d0 <+40>: lsl r3, r3, #1 ;r3=2*r3=10*f1(c)

0x000104d4 <+44>: str r3, [r11, #-8] ;save r3

0x000104d8 <+48>: ldr r3, [r11, #-8] ;get r3 back

0x000104dc <+52>: mov r0, r3 ;set r0=r3, which is the return value of f2()

0x000104e0 <+56>: sub sp, r11, #4 ;set sp=r11(fp)-4, back to where fp and lr were pushed

0x000104e4 <+60>: pop {r11, pc} ;restore the caller's fp and load the saved lr into pc to return

End of assembler dump.

Dump of assembler code for function main:

0x000104e8 <+0>: push {r11, lr} ;save the caller's fp and lr

0x000104ec <+4>: add r11, sp, #4 ;set r11(fp)=sp+4

0x000104f0 <+8>: sub sp, sp, #8 ;set sp=sp-8

0x000104f4 <+12>: sub r3, r11, #12 ;set r3=r11(fp)-12

0x000104f8 <+16>: ldr r0, [pc, #48] ; 0x10530 <main+72>

0x000104fc <+20>: mov r1, r3

0x00010500 <+24>: bl 0x10330 ;go to scanf()

0x00010504 <+28>: ldr r3, [r11, #-12] ;set r3=input value

0x00010508 <+32>: mov r0, r3 ;set r0=r3=a, which is the input argument

0x0001050c <+36>: bl 0x104a8 <f2> ;call f2(); before jumping, bl copies the address of the next instruction (the subroutine's return address) into lr

0x00010510 <+40>: str r0, [r11, #-8] ;r0 now holds the value to be printed

0x00010514 <+44>: ldr r0, [pc, #24] ; 0x10534 <main+76>

0x00010518 <+48>: ldr r1, [r11, #-8]

0x0001051c <+52>: bl 0x1030c ;go to print the value in r1

0x00010520 <+56>: mov r3, #0 ;set r3=0

0x00010524 <+60>: mov r0, r3 ;set r0=r3=0,which is the return value

0x00010528 <+64>: sub sp, r11, #4 ;set sp=r11-4, back to where fp and lr were pushed

0x0001052c <+68>: pop {r11, pc} ;restore the caller's fp and load the saved lr into pc to return

0x00010530 <+72>: andeq r0, r1, r12, lsr #11

0x00010534 <+76>: ; <UNDEFINED> instruction: 0x000105b0

End of assembler dump.
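For orientation, the two helper functions can be reconstructed from the disassembly roughly as below. This is an inference from the listings (a*a+2 in f1, a multiplication by 10 in f2, one local holding the result in each), not the author's source; the local variable name r is an assumption.

```c
/* Hypothetical reconstruction from the unoptimized disassembly. */
int f1(int a)
{
    int r = a * a + 2; /* mul, then add #2; the local lives at [r11, #-8] */
    return r;
}

int f2(int c)
{
    int r = 10 * f1(c); /* bl f1, then (x << 2) + x, then << 1 */
    return r;
}
```

In the lab, main reads an integer with scanf, passes it to f2, and prints the result; f1 is a leaf function, which is why it alone does not push lr.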

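a. Where is the return address at call time?

In the lr register.

b. Where are the arguments passed in?

In the r0 register.

c. How is stack space for local variables allocated?

Taking f1 as an example, the function sets r11 (fp) from sp and then subtracts from sp (sub sp, sp, #20) to open up stack space downward; the locals are accessed at negative offsets from r11.

d. Are registers saved by the caller or the callee? Are all of them saved, or only some?

In both cases only some are saved. The return address is placed in lr by the caller (through the bl instruction); r11 (fp), sp, and lr are saved by the callee.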

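6. Find a C expression that compiles to the MLA instruction

Write the following C program:

Compile it with the -O2 optimization option:

The mla instruction appears at +44.

The generated assembly instructions are explained below: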

0x00010348 <+0>: push {lr} ;(str lr, [sp,#-4]!)

0x0001034c <+4>: sub sp, sp, #20

0x00010350 <+8>: add r1, sp, #4

0x00010354 <+12>: add r2, sp, #8

0x00010358 <+16>: add r3, sp, #12

0x0001035c <+20>: ldr r0, [pc, #36] ; 0x10388 <main+64>

0x00010360 <+24>: bl 0x10330 ;call scanf()

0x00010364 <+28>: ldr r2, [sp, #4] ;r2=a

0x00010368 <+32>: ldr r1, [sp, #8] ;r1=b

0x0001036c <+36>: ldr r3, [sp, #12] ;r3=c

0x00010370 <+40>: ldr r0, [pc, #20] ; 0x1038c <main+68>

0x00010374 <+44>: mla r1, r1, r2, r3 ;r1=a*b+c

0x00010378 <+48>: bl 0x1030c ;call printf()

0x0001037c <+52>: mov r0, #0 ;set return value to 0

0x00010380 <+56>: add sp, sp, #20 ;restore the caller's sp

0x00010384 <+60>: pop {pc} ; (ldr pc, [sp],#4)

0x00010388 <+64>: andeq r0, r1, r12, lsr #10

0x0001038c <+68>: andeq r0, r1, r8, lsr r5
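The expression shape that leads to mla here is a multiplication whose result is immediately added to a third value. A minimal sketch, assuming three int inputs as in the listing (the function name mul_add is hypothetical, and whether a given compiler version emits mla for it is not guaranteed):

```c
/* Multiply-accumulate: the pattern the -O2 build fused into one mla. */
int mul_add(int a, int b, int c)
{
    return a * b + c; /* mla r1, r1, r2, r3 in the listing above */
}
```

In the lab program the three values come from a single scanf call and the result is passed straight to printf as its second argument.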

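7. Find a C expression that compiles to the BIC instruction

Write the following C program:

Compile it with the -O2 optimization option:

The BIC instruction appears at +32.

The generated assembly is explained below: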

0x00010348 <+0>: push {r4, lr} ;save r4 and lr

0x0001034c <+4>: sub sp, sp, #8

0x00010350 <+8>: ldr r4, [pc, #32] ; 0x10378 <main+48>: load the word stored there (0x00010518, presumably the format-string address) into r4

0x00010354 <+12>: add r1, sp, #4

0x00010358 <+16>: mov r0, r4 ;set r0=r4, the first argument of scanf

0x0001035c <+20>: bl 0x10330 ;go to scanf to get the value of a

0x00010360 <+24>: ldr r1, [sp, #4]

0x00010364 <+28>: mov r0, r4

0x00010368 <+32>: bic r1, r1, #-268435456 ; 0xf0000000: the BIC instruction, r1 = a & ~0xf0000000

0x0001036c <+36>: bl 0x1030c ;go to print

0x00010370 <+40>: add sp, sp, #8

0x00010374 <+44>: pop {r4, pc}

0x00010378 <+48>: andeq r0, r1, r8, lsl r5 ;this is data (the word 0x00010518), not an instruction; 0xf0000000 itself is encoded as an immediate in the bic at +32
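BIC computes "operand AND NOT mask", so the C pattern that leads to it is an AND with the complement of a constant. A minimal sketch follows; the function name and the use of uint32_t are assumptions, since the original program is not reproduced, and only the mask 0xf0000000 is taken from the listing.

```c
#include <stdint.h>

/* Clear the top four bits of a: the operation bic r1, r1, #0xf0000000 performs. */
uint32_t clear_top_nibble(uint32_t a)
{
    return a & ~UINT32_C(0xf0000000);
}
```

The mask 0xf0000000 is 0x0f rotated right by 4, so it fits in the instruction as an immediate and no literal-pool load is needed for it.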

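8. Write an assembly function that takes an integer and a pointer as input, where the pointer should point to a string. The assembly function calls the C printf() function to output the first n characters of the string, n being the integer. Call this assembly function from a main() function written in C, passing it the arguments, to produce the output.

In the Ubuntu cross-compilation environment, write the following main.c and assemble.s files:

main.c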

#include <stdio.h>

extern void asm_print(int N, char *str); // assembly function

int main()
{
    int N;
    char str[30];

    printf("start of main\n");
    printf("Please input N:");
    scanf("%d", &N); // read an integer N
    printf("Please input string:");
    scanf("%29s", str); // read a string (at most 29 characters, so str cannot overflow)
    asm_print(N, str); // call the assembly function to print the first N characters of the string
    printf("end of main\n");
    return 0;
}

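assemble.s

.text
.global asm_print
.extern printf @ the C function printf will be called

asm_print:
    mov r3, r1 @ fourth argument of printf: start address of the string to print
    mov r1, r0 @ second argument of printf: N passed in from main.c (field width)
    mov r2, r0 @ third argument of printf: the same N (precision)
    adr r0, form @ first argument of printf: the format string
    stmfd sp!, {lr} @ save lr before calling the C function
    bl printf @ call the C function printf
    ldmfd sp!, {pc} @ pop the saved lr into pc, returning to the C caller

over:
    mov pc, lr @ would return to the C caller via lr; never reached, because the ldmfd above has already returned

form:
    .asciz "the first N chars:%-*.*s\n"

.end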
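What the assembly routine asks printf to do can also be expressed in plain C, which makes the role of the two N arguments easier to see: the first '*' takes N as the field width and the second takes N as the precision, and for %s the precision caps the number of characters printed. The sketch below uses snprintf so the result can be inspected; the function name first_n is hypothetical.

```c
#include <stdio.h>

/* Writes the first n characters of s into out, left-justified in a field
 * of width n. Returns whatever snprintf returns. */
int first_n(char *out, size_t cap, int n, const char *s)
{
    /* width = n and precision = n, both supplied through '*' */
    return snprintf(out, cap, "%-*.*s", n, n, s);
}
```

If the string is shorter than n, the '-' flag together with the width pads the output with trailing spaces up to n characters.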

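Compile the two files together in the cross-compilation environment:

Copy the generated main file from the virtual machine to the Windows host, then upload it to the Raspberry Pi with FileZilla Client:

On the Raspberry Pi, give the file execute permission and run it:

Enter the integer 5 and the string "abcdefghijklmn".

The first 5 characters, "abcde", are indeed displayed, as shown below: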

1 0