An In-Depth Look at How the Stack Changes During a GCC Function Call

Source: Internet. Editor: 程序博客网. Time: 2024/06/05 11:30

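When you disassemble a program to study the AT&T-syntax assembly that gcc generates, you will often find the following two instructions at the very start of a function: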

push %ebp

movl %esp, %ebp

At the end of the function you will see:

leave

ret

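I (tenfy) searched online for explanations. Below I reproduce the English answers that turned up and add my own understanding and notes. To make the way the stack is built up during a function call easier to follow, refer to the simplified diagram below: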

The question and its answers follow:

Can anybody explain to me what effect these two instructions have in the assembly code that gcc generates for x86 machines?

push %ebp
movl %esp, %ebp

Answer 1:

%ebp is the "base pointer" for your stack frame: it holds the base address of the current frame, and the compiled code uses it to access local variables and parameters on the stack. (tenfy's note: in AT&T syntax a register name takes a % prefix; inside the template of GCC extended inline assembly in C/C++, where operands are used, it must be written with a %% prefix.) Here is some typical function prologue code generated by GCC (g++ to be precise). First, the C++ source:

// junk.c++
int addtwo(int a)
{
    int x = 2;

    return a + x;
}

This generates the following assembly:

.file   "junk.c++"
.text
.globl _Z6addtwoi
.type   _Z6addtwoi, @function
_Z6addtwoi:
.LFB2:
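    pushl   %ebp                    # tenfy: push the caller's frame pointer; the arguments and the return address are already on the stack by this point
.LCFI0:
    movl    %esp, %ebp              # tenfy: copy the current stack pointer into %ebp so it becomes the new frame pointer; %ebp and %esp now hold the same address
.LCFI1:
    subl    $16, %esp               # tenfy: subtract 16 from %esp, moving it 16 bytes toward lower addresses to reserve space for local variables
.LCFI2:
    movl    $2, -4(%ebp)            # tenfy: store the immediate 2 at %ebp-4, four bytes below the frame base; this is int x = 2
    movl    -4(%ebp), %edx          # tenfy: load x into %edx as well
    movl    8(%ebp), %eax           # tenfy: 8(%ebp) is 8 bytes above %ebp, the stack slot of the first argument; load a into %eax
    addl    %edx, %eax              # tenfy: add %edx to %eax and keep the sum in %eax; note that %eax is also where gcc places the return value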
    leave
    ret
.LFE2:
    .size   _Z6addtwoi, .-_Z6addtwoi
    .ident "GCC: (Ubuntu 4.3.3-5ubuntu4) 4.3.3"
    .section    .note.GNU-stack,"",@progbits

Now to explain the prologue code (everything before .LCFI2:):

1. pushl %ebp saves the calling function's frame pointer on the stack.
2. movl %esp, %ebp takes the current stack pointer and uses it as the frame pointer of the called function.
3. subl $16, %esp leaves room for local variables.

Now your function is ready for business. Any reference with a negative offset from the %ebp register is one of your local variables (x in this example). Any reference with an offset of 8 or more above %ebp is a parameter passed in; 4(%ebp) holds the return address and 0(%ebp) holds the caller's saved %ebp.
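
To make these offsets concrete, the unoptimized addtwo listing can be replayed on a toy model of the stack. This is only a sketch under stated assumptions: ToyCpu, run_addtwo, the starting register values 0x1000 and 0x2000, and the return address 0xdeadbeef are hypothetical names and numbers picked for illustration, not anything gcc emits.

```cpp
#include <cstdint>
#include <map>

// Toy model of a 32-bit x86 stack (an illustration, not real hardware).
struct ToyCpu {
    std::uint32_t esp = 0x1000;  // arbitrary starting stack pointer
    std::uint32_t ebp = 0x2000;  // stands in for the caller's frame pointer
    std::uint32_t eax = 0;
    std::uint32_t edx = 0;
    std::map<std::uint32_t, std::uint32_t> mem;  // address -> 32-bit word

    // push: subtract 4 from %esp, then store at the new top of stack.
    void pushl(std::uint32_t value) { esp -= 4; mem[esp] = value; }
};

// Replays the addtwo listing step by step and returns the final %eax.
std::uint32_t run_addtwo(ToyCpu& cpu, std::uint32_t a) {
    // Caller side: push the argument, then `call` pushes the return address.
    cpu.pushl(a);
    cpu.pushl(0xdeadbeef);           // hypothetical return address

    // Callee prologue.
    cpu.pushl(cpu.ebp);              // pushl %ebp
    cpu.ebp = cpu.esp;               // movl  %esp, %ebp
    cpu.esp -= 16;                   // subl  $16, %esp

    // Function body.
    cpu.mem[cpu.ebp - 4] = 2;        // movl $2, -4(%ebp)    (x = 2)
    cpu.edx = cpu.mem[cpu.ebp - 4];  // movl -4(%ebp), %edx
    cpu.eax = cpu.mem[cpu.ebp + 8];  // movl 8(%ebp), %eax   (a)
    cpu.eax += cpu.edx;              // addl %edx, %eax
    return cpu.eax;
}
```

Calling run_addtwo(cpu, 40) on a fresh ToyCpu returns 42. Afterwards cpu.mem[cpu.ebp + 8] holds the argument, cpu.mem[cpu.ebp + 4] the return address, cpu.mem[cpu.ebp] the saved caller %ebp, and cpu.mem[cpu.ebp - 4] the local x.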

The final point of interest is the leave instruction, an x86 instruction that restores the calling function's stack frame. It is equivalent to the sequence movl %ebp, %esp followed by popl %ebp. With optimization enabled, the compiler may emit that explicit sequence instead, or omit the frame pointer altogether. For illustrative purposes, however, I compiled with no optimizations at all.
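
The equivalence between leave and the two-instruction sequence can be sketched on a toy model of the stack. Everything named here (ToyStack, enter_frame, the starting values 0x1000 and 0x2000) is a hypothetical illustration and not part of gcc's output.

```cpp
#include <cstdint>
#include <map>

// Toy model of %esp, %ebp and stack memory (an illustration only).
struct ToyStack {
    std::uint32_t esp = 0x1000;  // arbitrary starting stack pointer
    std::uint32_t ebp = 0x2000;  // stands in for the caller's frame pointer
    std::map<std::uint32_t, std::uint32_t> mem;  // address -> 32-bit word

    void pushl(std::uint32_t v) { esp -= 4; mem[esp] = v; }

    std::uint32_t popl() {
        std::uint32_t v = mem[esp];
        esp += 4;
        return v;
    }

    // Prologue: pushl %ebp; movl %esp, %ebp; subl $locals, %esp
    void enter_frame(std::uint32_t locals) {
        pushl(ebp);
        ebp = esp;
        esp -= locals;
    }

    // leave: movl %ebp, %esp; popl %ebp
    void leave() {
        esp = ebp;     // discard the locals in one step
        ebp = popl();  // restore the caller's frame pointer
    }

    // ret: pop the return address (the jump itself is not modelled).
    std::uint32_t ret() { return popl(); }
};
```

In this model, if the return address is pushed and enter_frame(16) is followed by leave(), %ebp is back at the caller's value and %esp points at the return address again, which is exactly what ret needs.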

Answer 2:

It's typical code that you see at the beginning of a function.

It saves the contents of the EBP register on the stack, and then copies the current stack pointer into EBP.

The stack is used during a function call to store arguments and local variables. Inside the function, however, the stack pointer may change as further values are pushed onto the stack.

If you save the original value of the stack pointer in EBP, you can refer to the stored arguments through EBP at fixed offsets while remaining free to use (push values onto) the stack.
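
A small sketch of why the saved copy matters: the distance from %esp to an argument changes with every push, while the distance from %ebp does not. ToyRegs, push_slot, set_frame, offset_from and the starting value 0x1000 are hypothetical names and numbers used only for this illustration.

```cpp
#include <cstdint>

// Toy registers only; memory is not modelled because only addresses matter here.
struct ToyRegs {
    std::uint32_t esp = 0x1000;  // arbitrary starting stack pointer
    std::uint32_t ebp = 0;

    // Any 4-byte push moves %esp down by 4.
    void push_slot() { esp -= 4; }

    // pushl %ebp; movl %esp, %ebp
    void set_frame() {
        push_slot();
        ebp = esp;
    }
};

// Distance in bytes from a register value up to a fixed stack address.
std::uint32_t offset_from(std::uint32_t reg, std::uint32_t addr) {
    return addr - reg;
}
```

Suppose the argument is pushed (record r.esp as its address), the return address is pushed, and set_frame() runs. At that point the argument is 8 bytes above both registers. After two more calls to push_slot(), offset_from(r.esp, arg) has grown to 16, while offset_from(r.ebp, arg) is still 8.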

At the end of the function you will probably see these instructions:

pop %ebp    # restore original value
ret         # return

Answer 3:

push %ebp

This pushes the 32-bit (extended) base pointer register onto the stack: four is subtracted from the stack pointer (%esp), and then the value of %ebp is copied to the location that the stack pointer now points to.
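
The two steps of push, decrement and then store, can be mimicked with an ordinary pointer into an array. This is a sketch only; push32 is a hypothetical helper name, and the array stands in for stack memory.

```cpp
#include <cstdint>

// Mimics `push`: move the stack pointer down one 32-bit slot, then store.
// Returns the new stack pointer. The caller must ensure there is room below sp.
std::uint32_t* push32(std::uint32_t* sp, std::uint32_t value) {
    --sp;         // like subtracting 4 from %esp (one uint32_t is 4 bytes)
    *sp = value;  // like copying the register to the address %esp now holds
    return sp;
}
```

Starting from a pointer one past the end of a uint32_t array, a single push32 call returns a pointer 4 bytes lower that points at the stored value.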

movl %esp, %ebp

This copies the stack pointer register (%esp) into the base pointer register (%ebp).

The purpose of copying the stack pointer to the base pointer is to create a stack frame, i.e. an area on the stack where a subroutine can store local data. The code in the subroutine then uses the base pointer to reference that data.

It's part of what is known as the function prologue.

It saves the current base pointer, which will be restored when the function ends, and sets the new ebp to the beginning of the new frame.