VFP and NEON Floating-Point Compilation on Linux

Source: Internet  Editor: 程序博客网  Time: 2024/06/12 18:52

NEON: a SIMD (Single Instruction, Multiple Data) instruction set that provides standardized acceleration for multimedia and signal-processing programs.

VFP (Vector Floating Point): the vector floating-point unit. ARM11 (S3C6410) supports VFPv2; Cortex-A8 (S5PV210) supports VFPv3.

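NEON and the VFPv3 floating-point coprocessor share one register bank, so for scalar floating-point code the generated assembly instructions are the same.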

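Compile options:

-mfpu=name (neon or vfpvX): selects the FPU unit.

-mfloat-abi=name (soft, hard, or softfp): selects software floating point, hardware floating point, or hardware floating point with the soft-float-compatible calling interface.

If only -mfpu is given, the compiler does not select the hardware floating-point instruction set by default.

If only -mfloat-abi=hard or -mfloat-abi=softfp is given, the compiler uses the hardware floating-point instruction set.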
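One way to check from inside a program which combination the compiler actually applied is to look at its predefined macros. The sketch below is only an illustration and is not part of the original test. It assumes a GCC-compatible compiler that defines `__ARM_PCS_VFP` (hard-float calling convention), `__SOFTFP__` (software floating point), `__VFP_FP__`, and `__ARM_NEON__` or `__ARM_NEON`; older toolchains, possibly including the GCC 4.4.1 used in this article, may not define all of them. The function names `float_abi_name`, `fpu_name`, and `print_float_config` are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: names the float ABI implied by predefined macros.
 * Assumes a GCC-compatible compiler; the macro set may differ on old toolchains. */
const char *float_abi_name(void)
{
#if defined(__ARM_PCS_VFP)
    return "hard";      /* float arguments are passed in VFP registers */
#elif defined(__SOFTFP__)
    return "soft";      /* floating point is done by library calls */
#elif defined(__arm__)
    return "softfp";    /* FPU instructions, soft-float calling convention */
#else
    return "not-arm";   /* not compiled for a 32-bit ARM target */
#endif
}

/* Hypothetical helper: names the FPU family the compiler is generating code for. */
const char *fpu_name(void)
{
#if defined(__ARM_NEON__) || defined(__ARM_NEON)
    return "neon";
#elif defined(__VFP_FP__) && !defined(__SOFTFP__)
    return "vfp";
#else
    return "none";
#endif
}

/* Prints one summary line, e.g. for a quick check on the target board. */
void print_float_config(void)
{
    printf("float-abi: %s, fpu: %s\n", float_abi_name(), fpu_name());
}
```

Calling `print_float_config()` from a program built with each of the option combinations in this article would be one way to confirm what the flags did without reading the generated assembly.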

Test C file (hello.c):

int main(void)
{
    float f1, f2, f3;

    f1 = 1.2;
    f2 = 1.3;
    f3 = f1 / f2;    /* the single-precision division under test */

    return 0;
}

1. arm-eabi-gcc -S hello.c -mfpu=neon

    .arch armv5te
    .fpu softvfp
    .eabi_attribute 20, 1
    .eabi_attribute 21, 1
    .eabi_attribute 23, 3
    .eabi_attribute 24, 1
    .eabi_attribute 25, 1
    .eabi_attribute 26, 2
    .eabi_attribute 30, 6
    .eabi_attribute 18, 4
    .file   "hello.c"
    .global __aeabi_fdiv
    .text
    .align  2
    .global main
    .type   main, %function
main:
    .fnstart
.LFB0:
    @ args = 0, pretend = 0, frame = 16
    @ frame_needed = 1, uses_anonymous_args = 0
    stmfd   sp!, {fp, lr}
    .save {fp, lr}
.LCFI0:
    .setfp fp, sp, #4
    add     fp, sp, #4
.LCFI1:
    .pad #16
    sub     sp, sp, #16
.LCFI2:
    ldr     r3, .L3         @ float
    str     r3, [fp, #-16]  @ float
    ldr     r3, .L3+4       @ float
    str     r3, [fp, #-12]  @ float
    ldr     r0, [fp, #-16]  @ float
    ldr     r1, [fp, #-12]  @ float
    bl      __aeabi_fdiv
    mov     r3, r0
    str     r3, [fp, #-8]   @ float
    mov     r3, #0
    mov     r0, r3
    sub     sp, fp, #4
    ldmfd   sp!, {fp, pc}
.L4:
    .align  2
.L3:
    .word   1067030938
    .word   1067869798
.LFE0:
    .fnend
    .size   main, .-main
    .ident  "GCC: (Sourcery G++ Lite 2009q3-67) 4.4.1"
    .section        .note.GNU-stack,"",%progbits

2. arm-eabi-gcc -S hello.c -mfpu=vfp

    .arch armv5te
    .fpu softvfp
    .eabi_attribute 20, 1
    .eabi_attribute 21, 1
    .eabi_attribute 23, 3
    .eabi_attribute 24, 1
    .eabi_attribute 25, 1
    .eabi_attribute 26, 2
    .eabi_attribute 30, 6
    .eabi_attribute 18, 4
    .file   "hello.c"
    .global __aeabi_fdiv
    .text
    .align  2
    .global main
    .type   main, %function
main:
    .fnstart
.LFB0:
    @ args = 0, pretend = 0, frame = 16
    @ frame_needed = 1, uses_anonymous_args = 0
    stmfd   sp!, {fp, lr}
    .save {fp, lr}
.LCFI0:
    .setfp fp, sp, #4
    add     fp, sp, #4
.LCFI1:
    .pad #16
    sub     sp, sp, #16
.LCFI2:
    ldr     r3, .L3         @ float
    str     r3, [fp, #-16]  @ float
    ldr     r3, .L3+4       @ float
    str     r3, [fp, #-12]  @ float
    ldr     r0, [fp, #-16]  @ float
    ldr     r1, [fp, #-12]  @ float
    bl      __aeabi_fdiv
    mov     r3, r0
    str     r3, [fp, #-8]   @ float
    mov     r3, #0
    mov     r0, r3
    sub     sp, fp, #4
    ldmfd   sp!, {fp, pc}
.L4:
    .align  2
.L3:
    .word   1067030938
    .word   1067869798
.LFE0:
    .fnend
    .size   main, .-main
    .ident  "GCC: (Sourcery G++ Lite 2009q3-67) 4.4.1"
    .section        .note.GNU-stack,"",%progbits

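As the two examples above show, both end up with .fpu softvfp, and the division is carried out by a call to the software routine __aeabi_fdiv.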
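The two .word values at label .L3 in the listings are the test's constants 1.2 and 1.3 stored as raw single-precision bit patterns. The sketch below shows one way to reproduce them on any machine; it assumes `float` is a 32-bit IEEE 754 type, and the helper name `float_bits` is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the bit pattern of a float as an integer.
 * Assumes float is a 32-bit IEEE 754 single-precision value. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);  /* copy the bytes; avoids pointer-cast aliasing issues */
    return bits;
}
```

Under that assumption, `float_bits(1.2f)` yields 1067030938 (0x3F99999A) and `float_bits(1.3f)` yields 1067869798 (0x3FA66666), which are the two words emitted after .L3.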


3. arm-eabi-gcc -S hello.c -mfpu=neon -mfloat-abi=hard

    .arch armv5te
    .eabi_attribute 27, 3
    .eabi_attribute 28, 1
    .fpu neon
    .eabi_attribute 20, 1
    .eabi_attribute 21, 1
    .eabi_attribute 23, 3
    .eabi_attribute 24, 1
    .eabi_attribute 25, 1
    .eabi_attribute 26, 2
    .eabi_attribute 30, 6
    .eabi_attribute 18, 4
    .file   "hello.c"
    .text
    .align  2
    .global main
    .type   main, %function
main:
    .fnstart
.LFB0:
    @ args = 0, pretend = 0, frame = 16
    @ frame_needed = 1, uses_anonymous_args = 0
    @ link register save eliminated.
    str     fp, [sp, #-4]!
    .save {fp}
.LCFI0:
    .setfp fp, sp, #0
    add     fp, sp, #0
.LCFI1:
    .pad #20
    sub     sp, sp, #20
.LCFI2:
    flds    s15, .L3
    fsts    s15, [fp, #-16]
    flds    s15, .L3+4
    fsts    s15, [fp, #-12]
    flds    s14, [fp, #-16]
    flds    s15, [fp, #-12]
    fdivs   s15, s14, s15
    fsts    s15, [fp, #-8]
    mov     r3, #0
    mov     r0, r3
    add     sp, fp, #0
    ldmfd   sp!, {fp}
    bx      lr
.L4:
    .align  2
.L3:
    .word   1067030938
    .word   1067869798
.LFE0:
    .fnend
    .size   main, .-main
    .ident  "GCC: (Sourcery G++ Lite 2009q3-67) 4.4.1"
    .section        .note.GNU-stack,"",%progbits

4. arm-eabi-gcc -S hello.c -mfpu=neon -mfloat-abi=softfp

    .arch armv5te
    .eabi_attribute 27, 3
    .fpu neon
    .eabi_attribute 20, 1
    .eabi_attribute 21, 1
    .eabi_attribute 23, 3
    .eabi_attribute 24, 1
    .eabi_attribute 25, 1
    .eabi_attribute 26, 2
    .eabi_attribute 30, 6
    .eabi_attribute 18, 4
    .file   "hello.c"
    .text
    .align  2
    .global main
    .type   main, %function
main:
    .fnstart
.LFB0:
    @ args = 0, pretend = 0, frame = 16
    @ frame_needed = 1, uses_anonymous_args = 0
    @ link register save eliminated.
    str     fp, [sp, #-4]!
    .save {fp}
.LCFI0:
    .setfp fp, sp, #0
    add     fp, sp, #0
.LCFI1:
    .pad #20
    sub     sp, sp, #20
.LCFI2:
    flds    s15, .L3
    fsts    s15, [fp, #-16]
    flds    s15, .L3+4
    fsts    s15, [fp, #-12]
    flds    s14, [fp, #-16]
    flds    s15, [fp, #-12]
    fdivs   s15, s14, s15
    fsts    s15, [fp, #-8]
    mov     r3, #0
    mov     r0, r3
    add     sp, fp, #0
    ldmfd   sp!, {fp}
    bx      lr
.L4:
    .align  2
.L3:
    .word   1067030938
    .word   1067869798
.LFE0:
    .fnend
    .size   main, .-main
    .ident  "GCC: (Sourcery G++ Lite 2009q3-67) 4.4.1"
    .section        .note.GNU-stack,"",%progbits

5. arm-eabi-gcc -S hello.c -mfpu=vfpv3 -mfloat-abi=softfp

    .arch armv5te
    .eabi_attribute 27, 3
    .fpu vfpv3
    .eabi_attribute 20, 1
    .eabi_attribute 21, 1
    .eabi_attribute 23, 3
    .eabi_attribute 24, 1
    .eabi_attribute 25, 1
    .eabi_attribute 26, 2
    .eabi_attribute 30, 6
    .eabi_attribute 18, 4
    .file   "hello.c"
    .text
    .align  2
    .global main
    .type   main, %function
main:
    .fnstart
.LFB0:
    @ args = 0, pretend = 0, frame = 16
    @ frame_needed = 1, uses_anonymous_args = 0
    @ link register save eliminated.
    str     fp, [sp, #-4]!
    .save {fp}
.LCFI0:
    .setfp fp, sp, #0
    add     fp, sp, #0
.LCFI1:
    .pad #20
    sub     sp, sp, #20
.LCFI2:
    flds    s15, .L3
    fsts    s15, [fp, #-16]
    flds    s15, .L3+4
    fsts    s15, [fp, #-12]
    flds    s14, [fp, #-16]
    flds    s15, [fp, #-12]
    fdivs   s15, s14, s15
    fsts    s15, [fp, #-8]
    mov     r3, #0
    mov     r0, r3
    add     sp, fp, #0
    ldmfd   sp!, {fp}
    bx      lr
.L4:
    .align  2
.L3:
    .word   1067030938
    .word   1067869798
.LFE0:
    .fnend
    .size   main, .-main
    .ident  "GCC: (Sourcery G++ Lite 2009q3-67) 4.4.1"
    .section        .note.GNU-stack,"",%progbits

6. arm-eabi-gcc -S hello.c -mfpu=vfpv3 -mfloat-abi=hard

    .arch armv5te
    .eabi_attribute 27, 3
    .eabi_attribute 28, 1
    .fpu vfpv3
    .eabi_attribute 20, 1
    .eabi_attribute 21, 1
    .eabi_attribute 23, 3
    .eabi_attribute 24, 1
    .eabi_attribute 25, 1
    .eabi_attribute 26, 2
    .eabi_attribute 30, 6
    .eabi_attribute 18, 4
    .file   "hello.c"
    .text
    .align  2
    .global main
    .type   main, %function
main:
    .fnstart
.LFB0:
    @ args = 0, pretend = 0, frame = 16
    @ frame_needed = 1, uses_anonymous_args = 0
    @ link register save eliminated.
    str     fp, [sp, #-4]!
    .save {fp}
.LCFI0:
    .setfp fp, sp, #0
    add     fp, sp, #0
.LCFI1:
    .pad #20
    sub     sp, sp, #20
.LCFI2:
    flds    s15, .L3
    fsts    s15, [fp, #-16]
    flds    s15, .L3+4
    fsts    s15, [fp, #-12]
    flds    s14, [fp, #-16]
    flds    s15, [fp, #-12]
    fdivs   s15, s14, s15
    fsts    s15, [fp, #-8]
    mov     r3, #0
    mov     r0, r3
    add     sp, fp, #0
    ldmfd   sp!, {fp}
    bx      lr
.L4:
    .align  2
.L3:
    .word   1067030938
    .word   1067869798
.LFE0:
    .fnend
    .size   main, .-main
    .ident  "GCC: (Sourcery G++ Lite 2009q3-67) 4.4.1"
    .section        .note.GNU-stack,"",%progbits

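As shown above, softfp and hard produce the same instructions for this test, and both use hardware floating point. The difference between neon and vfp shows up only in the .fpu vfpv3 and .fpu neon lines. The hard listings additionally carry .eabi_attribute 28, 1.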
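The identical instructions are partly a consequence of the test itself: main neither takes nor passes floating-point arguments, so the calling convention never comes into play. A function with float parameters is where softfp and hard would be expected to differ. Under softfp the arguments arrive in the core registers r0 and r1 and the result is returned in r0, while under hard the arguments arrive in s0 and s1 and the result is returned in s0. The sketch below is a hypothetical extension of the test file, not something from the original article; the names `divide` and `ratio_of_test_values` are made up for illustration.

```c
/* Hypothetical example: float parameters and a float return value
 * cross a call boundary, so the float ABI decides which registers carry them. */
float divide(float a, float b)
{
    return a / b;
}

/* Caller using the same constants as the article's test file. */
float ratio_of_test_values(void)
{
    float f1 = 1.2f;
    float f2 = 1.3f;
    return divide(f1, f2);
}
```

Compiling this file with -S once with -mfpu=vfpv3 -mfloat-abi=softfp and once with -mfpu=vfpv3 -mfloat-abi=hard, then comparing the two outputs around the call to divide, should make the difference between the two ABIs visible.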


7. arm-eabi-gcc -S hello.c -mfloat-abi=hard

    .arch armv5te
    .eabi_attribute 27, 3
    .eabi_attribute 28, 1
    .fpu vfp
    .eabi_attribute 20, 1
    .eabi_attribute 21, 1
    .eabi_attribute 23, 3
    .eabi_attribute 24, 1
    .eabi_attribute 25, 1
    .eabi_attribute 26, 2
    .eabi_attribute 30, 6
    .eabi_attribute 18, 4
    .file   "hello.c"
    .text
    .align  2
    .global main
    .type   main, %function
main:
    .fnstart
.LFB0:
    @ args = 0, pretend = 0, frame = 16
    @ frame_needed = 1, uses_anonymous_args = 0
    @ link register save eliminated.
    str     fp, [sp, #-4]!
    .save {fp}
.LCFI0:
    .setfp fp, sp, #0
    add     fp, sp, #0
.LCFI1:
    .pad #20
    sub     sp, sp, #20
.LCFI2:
    flds    s15, .L3
    fsts    s15, [fp, #-16]
    flds    s15, .L3+4
    fsts    s15, [fp, #-12]
    flds    s14, [fp, #-16]
    flds    s15, [fp, #-12]
    fdivs   s15, s14, s15
    fsts    s15, [fp, #-8]
    mov     r3, #0
    mov     r0, r3
    add     sp, fp, #0
    ldmfd   sp!, {fp}
    bx      lr
.L4:
    .align  2
.L3:
    .word   1067030938
    .word   1067869798
.LFE0:
    .fnend
    .size   main, .-main
    .ident  "GCC: (Sourcery G++ Lite 2009q3-67) 4.4.1"
    .section        .note.GNU-stack,"",%progbits

When only -mfloat-abi=hard is given, the compiler defaults to .fpu vfp hardware floating point.

