Erlang HiPE/x86 Tailcall Optimisation


Excerpted from: "The HiPE/x86 Erlang Compiler: System Description and Performance Evaluation"

Tailcall optimisation is implemented by having the caller overwrite its incoming
parameter area on the stack with the new parameters for the callee, deallocate the remaining
portion of its stack frame, and invoke the callee with a jmp instruction. When
a function returns, it is also responsible for removing its parameters from the stack; to
this end it uses the x86 ret $n instruction, which pops the return address, adds n to
%esp, and then jumps to the return address.

Parameter passing is straightforward: the first 3 parameters are passed in registers,
and the remaining ones are pushed on the stack in left-to-right order. A function puts its
return value in the %eax register.
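
The two mechanisms described so far, the register/stack split and the effect of ret $n, can be modelled in a few lines of Python. This is only an illustrative sketch, not HiPE code: the names split_params and ret_n are made up for this example, the stack is modelled as a list whose last element is the top of the stack, and a 4-byte word size is assumed for 32-bit x86.

```python
NUM_ARG_REGS = 3   # from the text: the first 3 parameters are passed in registers
WORD_SIZE = 4      # assumption: 32-bit x86, one stack slot per parameter


def split_params(params):
    """Return (register_params, stacked_params) for one call (hypothetical helper)."""
    params = list(params)
    return params[:NUM_ARG_REGS], params[NUM_ARG_REGS:]


def ret_n(stack, n_bytes):
    """Model `ret $n`: pop the return address, then add n to %esp.

    The list's last element plays the role of the top of the stack.
    """
    return_address = stack.pop()           # pop the return address
    for _ in range(n_bytes // WORD_SIZE):  # adding n to %esp discards the stacked parameters
        stack.pop()
    return return_address


# A call with five parameters: three go in registers, two on the stack.
regs, stacked = split_params(["a", "b", "c", "d", "e"])
stack = ["caller_frame"] + stacked + ["return_address"]
target = ret_n(stack, WORD_SIZE * len(stacked))
print(regs, stacked, target, stack)
```

After ret_n, only the caller's own frame is left on the modelled stack, which is the state the caller expects after the call.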

Illustrating calls and tailcalls on x86. To illustrate how recursive calls and tailcalls are
implemented by HiPE/x86, assume that f recursively calls g, g tailcalls h, and h finally
returns to f. The figure above shows the stack layout changes in this process. State (a) shows the
stack before f calls g. f evaluates the parameters to g, pushes them on the stack, and
executes a call instruction which pushes a return address and jumps to g, leading to
state (b). g allocates its frame (the dashed portion), evaluates the parameters to h, and
shuffles the stack to overwrite the argument area and possibly parts of its frame with
the new parameters, leading to state (c). Then g completes its tailcall to h by dropping
its frame and jumping to h, leading to state (d). In state (d), the stack is exactly as if f
had called h directly. Eventually h drops its frame, and executes a ret $n instruction,
which pops the return address and stacked parameters and returns to f, leading to state
(e). Register parameters are omitted from this description.
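
The walk through states (a) to (e) can be replayed with a small Python model. This is a sketch under simplifying assumptions, not the compiler's actual code: the stack is a list whose last element is the top of the stack, the helper names (call, tailcall, ret) and slot labels are invented for illustration, and tailcall models only the net effect of the shuffle (states (c) and (d) together) rather than the in-place overwriting.

```python
def call(stack, stacked_params, return_address):
    """Caller pushes the stacked parameters, then `call` pushes the return address."""
    stack.extend(stacked_params)
    stack.append(return_address)


def tailcall(stack, old_nparams, frame_size, new_params):
    """Net effect of a tailcall: replace the parameter area, keep the return
    address on top, and drop the tailcaller's own frame."""
    del stack[len(stack) - frame_size:]    # drop the tailcaller's frame
    return_address = stack.pop()           # the return address must survive
    del stack[len(stack) - old_nparams:]   # old incoming parameter area
    stack.extend(new_params)               # new parameters for the callee
    stack.append(return_address)


def ret(stack, nparams):
    """Model `ret $n`: pop the return address and the stacked parameters."""
    return_address = stack.pop()
    del stack[len(stack) - nparams:]
    return return_address


stack = ["f_frame"]                                   # state (a)
call(stack, ["g_p4", "g_p5"], "ret_to_f")             # state (b)
stack.extend(["g_local1", "g_local2"])                # g allocates its frame
tailcall(stack, old_nparams=2, frame_size=2,
         new_params=["h_p4"])                         # states (c) and (d)

direct = ["f_frame"]
call(direct, ["h_p4"], "ret_to_f")                    # f calling h directly
print(stack == direct)

print(ret(stack, nparams=1), stack)                   # state (e)
```

The comparison with direct checks the claim in the text that in state (d) the stack is exactly as if f had called h directly.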

The figure above also illustrates why it is the callee that must deallocate the stacked parameters.
In the presence of tailcalls, the caller (f above) does not know which function
finally returns to it, and it does not know how many parameters there currently are on
the stack. Therefore, the caller cannot deallocate the stacked parameters, but the returning
function can, since it knows how many parameters it takes. We point this out because
this is the opposite of the calling convention normally used by C and Unix on x86.
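
The argument can be checked with simple arithmetic on the stack depth. The following Python sketch is illustrative only; the function depth_after_return and its parameters are hypothetical names, and depths are counted in stack slots rather than bytes.

```python
def depth_after_return(base_depth, returner_stacked, caller_pushed, callee_pops):
    """Stack depth (in slots) that f sees once control is back in f.

    base_depth       -- depth of the stack before f pushed any parameters
    returner_stacked -- stacked parameters of the function that finally returns
    caller_pushed    -- stacked parameters f originally pushed for its callee
    callee_pops      -- True for `ret $n` (callee cleans up),
                        False for a C-style convention (caller cleans up)
    """
    # Just before the return: the returner's parameters plus the return address.
    depth = base_depth + returner_stacked + 1
    depth -= 1                          # the return pops the return address
    if callee_pops:
        depth -= returner_stacked       # returner removes exactly what is there
    else:
        depth -= caller_pushed          # caller removes what it believes is there
    return depth


# f pushed 2 stacked parameters for g, but g tailcalled h, which takes only 1.
print(depth_after_return(10, returner_stacked=1, caller_pushed=2, callee_pops=True))
print(depth_after_return(10, returner_stacked=1, caller_pushed=2, callee_pops=False))
```

With callee cleanup the depth returns to the base value; with caller cleanup f would remove one slot too many, because it cannot know that h, not g, is the function that returned.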

A disadvantage of this calling convention is that the stack shuffle step during tailcalls
must also consider the return address as a stacked parameter that will have to be
moved to a different stack slot if the caller and callee (g and h above) have different
numbers of stacked parameters. Passing parameters in registers reduces the number of
calls with stacked parameters, alleviating this problem.
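
How far the return address has to move can be worked out from the parameter counts alone. The sketch below is an illustration under stated assumptions, not taken from HiPE: it assumes 4-byte stack slots, a stack that grows towards lower addresses, and 3 register parameters as described above; all function names are hypothetical.

```python
WORD_SIZE = 4      # assumption: 32-bit x86 stack slots
NUM_ARG_REGS = 3   # from the text: the first 3 parameters are passed in registers


def stacked_count(arity):
    """Number of parameters that end up on the stack for a given arity."""
    return max(0, arity - NUM_ARG_REGS)


def return_address_slot(esp_before_push, n_stacked):
    """Address of the return address slot: below the n stacked parameters."""
    return esp_before_push - WORD_SIZE * (n_stacked + 1)


def return_address_shift(caller_arity, callee_arity):
    """Bytes the return address must move during a tailcall.

    Zero means the return address can stay where it is; a positive value
    means it moves to a higher address (the callee has fewer stacked
    parameters than the tailcalling function).
    """
    return WORD_SIZE * (stacked_count(caller_arity) - stacked_count(callee_arity))


print(return_address_shift(5, 4))   # 2 stacked vs 1 stacked parameter
print(return_address_shift(2, 3))   # both fit in registers
print(return_address_slot(0x1000, 1) - return_address_slot(0x1000, 2))
```

When both functions take at most three parameters, nothing is stacked and the shift is zero, which is why passing parameters in registers reduces how often this extra move is needed.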