Compiling VFP and NEON Floating-Point Code on Linux
Source: Internet  Editor: 程序博客网  Date: 2024/06/12 18:52
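NEON: a SIMD (Single Instruction, Multiple Data) instruction set that provides standardized acceleration for multimedia and signal-processing code.
VFP (Vector Floating Point): the vector floating-point unit. ARM11 (S3C6410) supports VFPv2; Cortex-A8 (S5PV210) supports VFPv3.
NEON and the VFPv3 floating-point coprocessor share one register bank, so for the scalar floating-point code tested below the generated assembly instructions are the same with either unit selected.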
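Compiler options:
-mfpu=name (neon or vfpvN): selects the FPU unit.
-mfloat-abi=name (soft, hard, softfp): selects software floating point (soft), hardware floating point (hard), or hardware floating-point instructions with the soft-float calling convention (softfp).
If only -mfpu is specified, the compiler still defaults to software floating point and emits no hardware floating-point instructions.
If only -mfloat-abi=hard or -mfloat-abi=softfp is specified, the compiler emits hardware floating-point instructions.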
Test C file:
int main(void)
{
    float f1, f2, f3;

    f1 = 1.2;
    f2 = 1.3;
    f3 = f1 / f2;
    return 0;
}
1. arm-eabi-gcc -S hello.c -mfpu=neon
    .arch armv5te
    .fpu softvfp
    .eabi_attribute 20, 1
    .eabi_attribute 21, 1
    .eabi_attribute 23, 3
    .eabi_attribute 24, 1
    .eabi_attribute 25, 1
    .eabi_attribute 26, 2
    .eabi_attribute 30, 6
    .eabi_attribute 18, 4
    .file "hello.c"
    .global __aeabi_fdiv
    .text
    .align 2
    .global main
    .type main, %function
main:
    .fnstart
.LFB0:
    @ args = 0, pretend = 0, frame = 16
    @ frame_needed = 1, uses_anonymous_args = 0
    stmfd sp!, {fp, lr}
    .save {fp, lr}
.LCFI0:
    .setfp fp, sp, #4
    add fp, sp, #4
.LCFI1:
    .pad #16
    sub sp, sp, #16
.LCFI2:
    ldr r3, .L3 @ float
    str r3, [fp, #-16] @ float
    ldr r3, .L3+4 @ float
    str r3, [fp, #-12] @ float
    ldr r0, [fp, #-16] @ float
    ldr r1, [fp, #-12] @ float
    bl __aeabi_fdiv
    mov r3, r0
    str r3, [fp, #-8] @ float
    mov r3, #0
    mov r0, r3
    sub sp, fp, #4
    ldmfd sp!, {fp, pc}
.L4:
    .align 2
.L3:
    .word 1067030938
    .word 1067869798
.LFE0:
    .fnend
    .size main, .-main
    .ident "GCC: (Sourcery G++ Lite 2009q3-67) 4.4.1"
    .section .note.GNU-stack,"",%progbits
2. arm-eabi-gcc -S hello.c -mfpu=vfp
    .arch armv5te
    .fpu softvfp
    .eabi_attribute 20, 1
    .eabi_attribute 21, 1
    .eabi_attribute 23, 3
    .eabi_attribute 24, 1
    .eabi_attribute 25, 1
    .eabi_attribute 26, 2
    .eabi_attribute 30, 6
    .eabi_attribute 18, 4
    .file "hello.c"
    .global __aeabi_fdiv
    .text
    .align 2
    .global main
    .type main, %function
main:
    .fnstart
.LFB0:
    @ args = 0, pretend = 0, frame = 16
    @ frame_needed = 1, uses_anonymous_args = 0
    stmfd sp!, {fp, lr}
    .save {fp, lr}
.LCFI0:
    .setfp fp, sp, #4
    add fp, sp, #4
.LCFI1:
    .pad #16
    sub sp, sp, #16
.LCFI2:
    ldr r3, .L3 @ float
    str r3, [fp, #-16] @ float
    ldr r3, .L3+4 @ float
    str r3, [fp, #-12] @ float
    ldr r0, [fp, #-16] @ float
    ldr r1, [fp, #-12] @ float
    bl __aeabi_fdiv
    mov r3, r0
    str r3, [fp, #-8] @ float
    mov r3, #0
    mov r0, r3
    sub sp, fp, #4
    ldmfd sp!, {fp, pc}
.L4:
    .align 2
.L3:
    .word 1067030938
    .word 1067869798
.LFE0:
    .fnend
    .size main, .-main
    .ident "GCC: (Sourcery G++ Lite 2009q3-67) 4.4.1"
    .section .note.GNU-stack,"",%progbits
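As the two examples above show, both outputs declare .fpu softvfp, and the division is performed by calling the software routine __aeabi_fdiv.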
3. arm-eabi-gcc -S hello.c -mfpu=neon -mfloat-abi=hard
    .arch armv5te
    .eabi_attribute 27, 3
    .eabi_attribute 28, 1
    .fpu neon
    .eabi_attribute 20, 1
    .eabi_attribute 21, 1
    .eabi_attribute 23, 3
    .eabi_attribute 24, 1
    .eabi_attribute 25, 1
    .eabi_attribute 26, 2
    .eabi_attribute 30, 6
    .eabi_attribute 18, 4
    .file "hello.c"
    .text
    .align 2
    .global main
    .type main, %function
main:
    .fnstart
.LFB0:
    @ args = 0, pretend = 0, frame = 16
    @ frame_needed = 1, uses_anonymous_args = 0
    @ link register save eliminated.
    str fp, [sp, #-4]!
    .save {fp}
.LCFI0:
    .setfp fp, sp, #0
    add fp, sp, #0
.LCFI1:
    .pad #20
    sub sp, sp, #20
.LCFI2:
    flds s15, .L3
    fsts s15, [fp, #-16]
    flds s15, .L3+4
    fsts s15, [fp, #-12]
    flds s14, [fp, #-16]
    flds s15, [fp, #-12]
    fdivs s15, s14, s15
    fsts s15, [fp, #-8]
    mov r3, #0
    mov r0, r3
    add sp, fp, #0
    ldmfd sp!, {fp}
    bx lr
.L4:
    .align 2
.L3:
    .word 1067030938
    .word 1067869798
.LFE0:
    .fnend
    .size main, .-main
    .ident "GCC: (Sourcery G++ Lite 2009q3-67) 4.4.1"
    .section .note.GNU-stack,"",%progbits
4. arm-eabi-gcc -S hello.c -mfpu=neon -mfloat-abi=softfp
    .arch armv5te
    .eabi_attribute 27, 3
    .fpu neon
    .eabi_attribute 20, 1
    .eabi_attribute 21, 1
    .eabi_attribute 23, 3
    .eabi_attribute 24, 1
    .eabi_attribute 25, 1
    .eabi_attribute 26, 2
    .eabi_attribute 30, 6
    .eabi_attribute 18, 4
    .file "hello.c"
    .text
    .align 2
    .global main
    .type main, %function
main:
    .fnstart
.LFB0:
    @ args = 0, pretend = 0, frame = 16
    @ frame_needed = 1, uses_anonymous_args = 0
    @ link register save eliminated.
    str fp, [sp, #-4]!
    .save {fp}
.LCFI0:
    .setfp fp, sp, #0
    add fp, sp, #0
.LCFI1:
    .pad #20
    sub sp, sp, #20
.LCFI2:
    flds s15, .L3
    fsts s15, [fp, #-16]
    flds s15, .L3+4
    fsts s15, [fp, #-12]
    flds s14, [fp, #-16]
    flds s15, [fp, #-12]
    fdivs s15, s14, s15
    fsts s15, [fp, #-8]
    mov r3, #0
    mov r0, r3
    add sp, fp, #0
    ldmfd sp!, {fp}
    bx lr
.L4:
    .align 2
.L3:
    .word 1067030938
    .word 1067869798
.LFE0:
    .fnend
    .size main, .-main
    .ident "GCC: (Sourcery G++ Lite 2009q3-67) 4.4.1"
    .section .note.GNU-stack,"",%progbits
5. arm-eabi-gcc -S hello.c -mfpu=vfpv3 -mfloat-abi=softfp
    .arch armv5te
    .eabi_attribute 27, 3
    .fpu vfpv3
    .eabi_attribute 20, 1
    .eabi_attribute 21, 1
    .eabi_attribute 23, 3
    .eabi_attribute 24, 1
    .eabi_attribute 25, 1
    .eabi_attribute 26, 2
    .eabi_attribute 30, 6
    .eabi_attribute 18, 4
    .file "hello.c"
    .text
    .align 2
    .global main
    .type main, %function
main:
    .fnstart
.LFB0:
    @ args = 0, pretend = 0, frame = 16
    @ frame_needed = 1, uses_anonymous_args = 0
    @ link register save eliminated.
    str fp, [sp, #-4]!
    .save {fp}
.LCFI0:
    .setfp fp, sp, #0
    add fp, sp, #0
.LCFI1:
    .pad #20
    sub sp, sp, #20
.LCFI2:
    flds s15, .L3
    fsts s15, [fp, #-16]
    flds s15, .L3+4
    fsts s15, [fp, #-12]
    flds s14, [fp, #-16]
    flds s15, [fp, #-12]
    fdivs s15, s14, s15
    fsts s15, [fp, #-8]
    mov r3, #0
    mov r0, r3
    add sp, fp, #0
    ldmfd sp!, {fp}
    bx lr
.L4:
    .align 2
.L3:
    .word 1067030938
    .word 1067869798
.LFE0:
    .fnend
    .size main, .-main
    .ident "GCC: (Sourcery G++ Lite 2009q3-67) 4.4.1"
    .section .note.GNU-stack,"",%progbits
6. arm-eabi-gcc -S hello.c -mfpu=vfpv3 -mfloat-abi=hard
    .arch armv5te
    .eabi_attribute 27, 3
    .eabi_attribute 28, 1
    .fpu vfpv3
    .eabi_attribute 20, 1
    .eabi_attribute 21, 1
    .eabi_attribute 23, 3
    .eabi_attribute 24, 1
    .eabi_attribute 25, 1
    .eabi_attribute 26, 2
    .eabi_attribute 30, 6
    .eabi_attribute 18, 4
    .file "hello.c"
    .text
    .align 2
    .global main
    .type main, %function
main:
    .fnstart
.LFB0:
    @ args = 0, pretend = 0, frame = 16
    @ frame_needed = 1, uses_anonymous_args = 0
    @ link register save eliminated.
    str fp, [sp, #-4]!
    .save {fp}
.LCFI0:
    .setfp fp, sp, #0
    add fp, sp, #0
.LCFI1:
    .pad #20
    sub sp, sp, #20
.LCFI2:
    flds s15, .L3
    fsts s15, [fp, #-16]
    flds s15, .L3+4
    fsts s15, [fp, #-12]
    flds s14, [fp, #-16]
    flds s15, [fp, #-12]
    fdivs s15, s14, s15
    fsts s15, [fp, #-8]
    mov r3, #0
    mov r0, r3
    add sp, fp, #0
    ldmfd sp!, {fp}
    bx lr
.L4:
    .align 2
.L3:
    .word 1067030938
    .word 1067869798
.LFE0:
    .fnend
    .size main, .-main
    .ident "GCC: (Sourcery G++ Lite 2009q3-67) 4.4.1"
    .section .note.GNU-stack,"",%progbits
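As the listings above show, softfp and hard produce the same instructions for this test, and both use hardware floating point. The choice between neon and vfp shows up only in the directive: .fpu vfpv3 versus .fpu neon. The hard outputs additionally carry .eabi_attribute 28, 1, which records that the VFP-register calling convention is in use.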
7. arm-eabi-gcc -S hello.c -mfloat-abi=hard
    .arch armv5te
    .eabi_attribute 27, 3
    .eabi_attribute 28, 1
    .fpu vfp
    .eabi_attribute 20, 1
    .eabi_attribute 21, 1
    .eabi_attribute 23, 3
    .eabi_attribute 24, 1
    .eabi_attribute 25, 1
    .eabi_attribute 26, 2
    .eabi_attribute 30, 6
    .eabi_attribute 18, 4
    .file "hello.c"
    .text
    .align 2
    .global main
    .type main, %function
main:
    .fnstart
.LFB0:
    @ args = 0, pretend = 0, frame = 16
    @ frame_needed = 1, uses_anonymous_args = 0
    @ link register save eliminated.
    str fp, [sp, #-4]!
    .save {fp}
.LCFI0:
    .setfp fp, sp, #0
    add fp, sp, #0
.LCFI1:
    .pad #20
    sub sp, sp, #20
.LCFI2:
    flds s15, .L3
    fsts s15, [fp, #-16]
    flds s15, .L3+4
    fsts s15, [fp, #-12]
    flds s14, [fp, #-16]
    flds s15, [fp, #-12]
    fdivs s15, s14, s15
    fsts s15, [fp, #-8]
    mov r3, #0
    mov r0, r3
    add sp, fp, #0
    ldmfd sp!, {fp}
    bx lr
.L4:
    .align 2
.L3:
    .word 1067030938
    .word 1067869798
.LFE0:
    .fnend
    .size main, .-main
    .ident "GCC: (Sourcery G++ Lite 2009q3-67) 4.4.1"
    .section .note.GNU-stack,"",%progbits
When -mfloat-abi=hard is given on its own, the compiler defaults to .fpu vfp and uses hardware floating point.