Using Pin

Source: Internet. Editor: 程序博客网. Time: 2024/06/05 03:47

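The best way to think about Pin is as a just-in-time (JIT) compiler. The input to this compiler is not bytecode but a regular executable. Pin intercepts the execution of the first instruction of the executable and generates new code for the straight-line code sequence starting at that instruction. It then transfers control to the generated sequence. The generated sequence is almost identical to the original one, but Pin ensures that it regains control whenever a branch exits the sequence. After regaining control, Pin generates more code for the branch target and continues execution. Pin keeps all generated code in memory, so the code can be reused and one sequence can branch directly to another.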

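In JIT mode, the only code that ever executes is the generated code; the original code is used only for reference. While generating code, Pin gives the user the opportunity to inject their own code (instrumentation).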

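Pin instruments every instruction that is actually executed, regardless of which section it resides in. Although there are some exceptions for conditional branches, in general an instruction that is never executed is never instrumented.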
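The generate-on-first-use behavior described above can be modeled with a small sketch. This is not Pin code: `Block`, `ToyJit`, and `sample_program` are hypothetical names invented for illustration, and the model returns to a dispatcher after every block, whereas Pin can branch directly from one generated sequence to another. It only shows that code is generated once per reached block, reused afterwards, and never generated for code that does not run.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical toy "original code": a straight-line block ending in a branch.
struct Block {
    int add;       // effect of the straight-line part: acc += add
    int taken;     // successor block when acc < limit
    int fallthru;  // successor otherwise (-1 means program exit)
};

// Toy model of a JIT with a code cache (an illustration, not Pin's implementation).
struct ToyJit {
    std::vector<Block> program;      // original code, used only as reference
    int limit;
    std::map<int, Block> code_cache; // generated code, keyed by original block id
    int compiles = 0;                // how many blocks were generated
    int executions = 0;              // how many blocks were executed

    ToyJit(const std::vector<Block>& p, int lim) : program(p), limit(lim) {}

    // Generate code the first time control reaches a block; reuse it afterwards.
    const Block& lookup_or_compile(int id) {
        auto it = code_cache.find(id);
        if (it == code_cache.end()) {
            ++compiles;
            it = code_cache.emplace(id, program.at(static_cast<std::size_t>(id))).first;
        }
        return it->second;
    }

    int run() {
        int acc = 0;
        int pc = 0;
        while (pc != -1) {
            const Block& b = lookup_or_compile(pc); // the JIT regains control at each branch
            acc += b.add;
            ++executions;
            pc = (acc < limit) ? b.taken : b.fallthru;
        }
        return acc;
    }
};

// Sample program: entry, a loop body, an exit block, and one unreachable block.
std::vector<Block> sample_program() {
    return {
        {0, 1, 1},      // block 0: entry, always continues to block 1
        {1, 1, 2},      // block 1: loop body, repeats while acc < limit
        {0, -1, -1},    // block 2: exit
        {100, -1, -1},  // block 3: dead code, never reached
    };
}
```

Running `ToyJit` on `sample_program()` with a limit of 5 executes the loop body several times but generates code for it only once, and block 3 never enters the code cache because control never reaches it.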

Pintools

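Conceptually, instrumentation consists of two components:

1. A mechanism that decides where to insert code and what code to insert.

2. The code to execute at the insertion points.

These two components are called instrumentation and analysis code. Both live in a single executable, the Pintool. A Pintool can be thought of as a plugin that modifies the code generation process inside Pin.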

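The Pintool registers instrumentation callback routines with Pin, and Pin calls them whenever new code needs to be generated. These callback routines are the instrumentation component. They inspect the code about to be generated, examine its static properties, and decide whether and where to inject calls to analysis functions.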

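Analysis functions gather data about the application. Pin makes sure the integer register state is saved and restored as necessary, and it allows arguments to be passed to the analysis functions. Note that floating-point registers are not saved and restored, so analysis routines that use them need additional support. See Floating Point Support in Analysis Routines for more information.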

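A Pintool can also register notification callback routines, which are called when events such as thread creation or forking occur. These callbacks are generally used to gather data or to perform tool initialization and final cleanup.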
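The split between instrumentation callbacks, analysis functions, and notification callbacks can be sketched with a self-contained model. Everything here (`ToyPin`, `Ins`, `MemReadCounter`, and their members) is hypothetical and stands in for the real machinery; in an actual Pintool the corresponding registration calls would be along the lines of `INS_AddInstrumentFunction`, `INS_InsertCall`, and `PIN_AddFiniFunction`.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical static description of one instruction.
struct Ins {
    std::string mnemonic;
    bool is_memory_read;
};

using AnalysisFn = std::function<void()>;
// An instrumentation callback inspects an instruction and may attach analysis calls.
using InstrumentFn = std::function<void(const Ins&, std::vector<AnalysisFn>&)>;

// Toy stand-in for the framework side (not the Pin API).
struct ToyPin {
    std::vector<InstrumentFn> instrument_callbacks;
    std::vector<std::function<void()>> fini_callbacks;

    void add_instrument_function(InstrumentFn f) { instrument_callbacks.push_back(f); }
    void add_fini_function(std::function<void()> f) { fini_callbacks.push_back(f); }

    void run(const std::vector<Ins>& code, int iterations) {
        // Code generation: instrumentation callbacks run once per instruction.
        std::vector<std::vector<AnalysisFn>> attached(code.size());
        for (std::size_t i = 0; i < code.size(); ++i)
            for (auto& cb : instrument_callbacks) cb(code[i], attached[i]);

        // Execution: attached analysis calls run every time the instruction runs.
        for (int it = 0; it < iterations; ++it)
            for (std::size_t i = 0; i < code.size(); ++i)
                for (auto& analysis : attached[i]) analysis();

        // Notification: tell the tool that the program has finished.
        for (auto& f : fini_callbacks) f();
    }
};

// Toy tool: counts executed memory reads.
struct MemReadCounter {
    long instrumented = 0;  // times the instrumentation callback ran
    long reads = 0;         // times the analysis function ran
    bool finished = false;  // set by the notification callback

    void attach(ToyPin& pin) {
        pin.add_instrument_function([this](const Ins& ins, std::vector<AnalysisFn>& calls) {
            ++instrumented;  // static inspection happens here
            if (ins.is_memory_read) calls.push_back([this] { ++reads; });
        });
        pin.add_fini_function([this] { finished = true; });
    }
};
```

With a three-instruction body containing two memory reads and ten iterations, the instrumentation callback in this model runs three times in total, while the analysis function runs twenty times.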

Observations

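Because a Pintool works like a plugin, it must run in the same address space as Pin and the executable being instrumented. The Pintool therefore has access to all of the executable's data. It also shares file descriptors and other process information with the executable.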

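Pin and the Pintool control a program starting from its very first instruction. For executables compiled with shared libraries, this means that the execution of the dynamic loader and of all shared libraries is visible to the Pintool.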

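When writing tools, it is more important to tune the analysis code than the instrumentation code, because instrumentation is executed only once while analysis code is called many times.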

Instrumentation Granularity

As described above, Pin's instrumentation is "just in time" (JIT). Instrumentation occurs immediately before a code sequence is executed for the first time. We call this mode of operation trace instrumentation.

Trace instrumentation lets the Pintool inspect and instrument an executable one trace at a time. Traces usually begin at the target of a taken branch and end with an unconditional branch, including calls and returns. Pin guarantees that a trace is only entered at the top, but it may contain multiple exits. If a branch joins the middle of a trace, Pin constructs a new trace that begins with the branch target. Pin breaks the trace into basic blocks, BBLs. A BBL is a single entrance, single exit sequence of instructions. Branches to the middle of a BBL begin a new trace and hence a new BBL. It is often possible to insert a single analysis call for a BBL, instead of one analysis call for every instruction. Reducing the number of analysis calls makes instrumentation more efficient. Trace instrumentation utilizes the TRACE_AddInstrumentFunction API call.
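The saving from one analysis call per BBL can be illustrated with a small self-contained model. The types and functions below (`Bbl`, `Trace`, `count_per_instruction`, `count_per_bbl`) are hypothetical and not part of Pin, and the model assumes every execution runs the whole trace without leaving through an early exit. In a real Pintool the per-BBL variant would typically pass the block's instruction count (for example from `BBL_NumIns`) as an argument to a call inserted with `BBL_InsertCall`.

```cpp
#include <vector>

// Hypothetical: a BBL reduced to its instruction count.
struct Bbl {
    int num_ins;
};
using Trace = std::vector<Bbl>;

struct CountResult {
    long instructions;    // instructions counted by the tool
    long analysis_calls;  // analysis calls needed to count them
};

// One analysis call before every instruction, each adding 1.
CountResult count_per_instruction(const Trace& trace, int executions) {
    CountResult r{0, 0};
    for (int e = 0; e < executions; ++e)
        for (const Bbl& b : trace)
            for (int i = 0; i < b.num_ins; ++i) {
                r.instructions += 1;
                r.analysis_calls += 1;
            }
    return r;
}

// One analysis call per BBL, adding the BBL's instruction count.
// Valid because a BBL has a single entrance and a single exit,
// so entering it implies executing all of its instructions.
CountResult count_per_bbl(const Trace& trace, int executions) {
    CountResult r{0, 0};
    for (int e = 0; e < executions; ++e)
        for (const Bbl& b : trace) {
            r.instructions += b.num_ins;
            r.analysis_calls += 1;
        }
    return r;
}
```

Both functions report the same instruction total; for a trace with BBLs of 4, 2, and 6 instructions, the per-BBL variant makes a quarter as many analysis calls.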

Note, though, that since Pin is discovering the control flow of the program dynamically as it executes, Pin's BBL can be different from the classical definition of a BBL which you will find in a compiler textbook. For instance, consider the code generated for the body of a switch statement like this

0 0