GOT and Lazy Binding in Linux Shared Library Function Loading

Source: Internet  Editor: 程序博客网  Date: 2024/06/04 18:35

When Linux runs a position-independent (PIC, Position Independent Code) user-space program and loads shared libraries, the resolution of function symbols involves the Global Offset Table (GOT) and lazy binding.

 

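Position-independent code generally cannot embed the absolute addresses of symbols defined in shared libraries. When a user-space program calls a function from a shared library, the program does not know the symbol's location at compile and link time; the address is fixed only at run time, after the dynamic loader has mapped the required shared libraries into memory. At build time, every such position-independent function call is instead recorded in the ELF file's Procedure Linkage Table (PLT) and routed through its entry there.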

 

We now use GDB to trace how the "syscall" function from the Glibc shared library is resolved dynamically. The source code is as follows:

     1  /* syscall_test.cpp-- GOT and Lazy Binding mechanism debugging */
     …
    16  /* Sandbox needs to be configured with policy by init_sandbox.
    17   * New syscall 400 are reserved to config policy, then switch filter on!
    18   * but funcode=20 is specially reserved to directly ret -1 for debugging.
    19   */
    20
    21  int init_sandbox()
    22  {
    …       /*policy configuration details ignored*/
    51  }
    52
    53  int main()
    54  {
    55      init_sandbox();
    56      // here ignore the ret value check for init_sandbox
    57      // try to see if (400,20) return -1 or not
    58      int x = syscall(400,20);
    59      cout << "syscall(400,20)" << x << endl;
    …
    65      return 0;
    66  }

First, disassemble the user-space program; the basic tools are readelf and objdump:

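In the ELF file produced at build time, the syscall@plt stub that fetches the real address of syscall consists of just 3 instructions; in fact, every per-function entry in the PLT has only 3 instructions.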

 

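Now consider syscall@plt. Its first instruction, "jmpq *0x20228a(%rip)", jumps to the address stored in a GOT slot (here 0x400e06 + 0x20228a = 0x603090, i.e. syscall@got.plt). If syscall is being called for the first time, that slot holds 0x00400e06, the address of the stub's second instruction, so execution falls through to instructions 2 and 3, which start the resolution of the function's real address. If the address of syscall has already been resolved, the slot holds the function's real address, and instructions 2 and 3 of syscall@plt are never executed. This mechanism is what is called lazy binding: a function symbol from a shared library is located and bound to its real address only when it is called for the first time, which is where the term gets its name.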

 

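Next, the second instruction of syscall@plt, "pushq  $0xf", pushes the immediate 0xf onto the stack. This value is the index of the entry for the function name "syscall" in the relocation section ".rela.plt". The data structure used for run-time relocation is defined as follows: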


    /* Relocation table entry with addend (in section of type SHT_RELA).  */
    typedef struct
    {
      Elf64_Addr    r_offset;   /* Address */
      Elf64_Xword   r_info;     /* Relocation type and symbol index */
      Elf64_Sxword  r_addend;   /* Addend */
    } Elf64_Rela;

The relocation section .rela.plt, dumped statically with readelf, looks like this:

    [elvis@localhost CentOS7]$ readelf --relocs syscall_test

    Relocation section '.rela.dyn' at offset 0xa08 contains 2 entries:
      Offset          Info           Type           Sym. Value    Sym. Name + Addend
    000000602ff8  000500000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
    000000603100  002200000005 R_X86_64_COPY     0000000000603100 _ZSt4cout + 0

    Relocation section '.rela.plt' at offset 0xa38 contains 28 entries:
      Offset          Info           Type           Sym. Value    Sym. Name + Addend
    000000603018  000100000007 R_X86_64_JUMP_SLO 0000000000000000 _ZNSs6appendEPKcm + 0
    000000603020  000200000007 R_X86_64_JUMP_SLO 0000000000000000 _ZNSsC1Ev + 0
    000000603028  000300000007 R_X86_64_JUMP_SLO 0000000000000000 _ZNSolsEi + 0
    000000603030  000400000007 R_X86_64_JUMP_SLO 0000000000000000 snprintf + 0
    000000603038  000500000007 R_X86_64_JUMP_SLO 0000000000000000 __gmon_start__ + 0
    000000603040  000700000007 R_X86_64_JUMP_SLO 0000000000000000 _ZNKSs5c_strEv + 0
    000000603048  000800000007 R_X86_64_JUMP_SLO 0000000000000000 _ZNSs7reserveEm + 0
    000000603050  000900000007 R_X86_64_JUMP_SLO 0000000000000000 _ZNKSs4sizeEv + 0
    000000603058  000a00000007 R_X86_64_JUMP_SLO 0000000000000000 _ZNSt8ios_base4InitC1E+0
    000000603060  000b00000007 R_X86_64_JUMP_SLO 0000000000000000 __libc_start_main + 0
    000000603068  000c00000007 R_X86_64_JUMP_SLO 0000000000000000 _ZNSsC1ERKSs + 0
    000000603070  000d00000007 R_X86_64_JUMP_SLO 0000000000000000 __cxa_atexit + 0
    000000603078  002100000007 R_X86_64_JUMP_SLO 0000000000400dd0 _ZNSt8ios_base4InitD1E+0
    000000603080  000f00000007 R_X86_64_JUMP_SLO 0000000000000000 _ZStlsISt11char_traits+0
    000000603088  001000000007 R_X86_64_JUMP_SLO 0000000000000000 _ZNSsD1Ev + 0
    000000603090  001200000007 R_X86_64_JUMP_SLO 0000000000000000 syscall + 0
    000000603098  001300000007 R_X86_64_JUMP_SLO 0000000000000000 _ZNSsC1EPKcRKSaIcE + 0
    0000006030a0  001400000007 R_X86_64_JUMP_SLO 0000000000000000 _ZNSaIcED1Ev + 0
    0000006030a8  001600000007 R_X86_64_JUMP_SLO 0000000000000000 _ZNSs6appendEPKc + 0
    0000006030b0  001700000007 R_X86_64_JUMP_SLO 0000000000000000 _ZNSolsEPFRSoS_E + 0
    0000006030b8  002000000007 R_X86_64_JUMP_SLO 0000000000400e50 _ZSt4endlIcSt11char_tr+0
    0000006030c0  001800000007 R_X86_64_JUMP_SLO 0000000000000000 _ZNSaIcEC1Ev + 0
    0000006030c8  002300000007 R_X86_64_JUMP_SLO 0000000000400e70 __gxx_personality_v0 + 0
    0000006030d0  001900000007 R_X86_64_JUMP_SLO 0000000000000000 localtime + 0
    0000006030d8  001a00000007 R_X86_64_JUMP_SLO 0000000000000000 _Unwind_Resume + 0
    0000006030e0  001b00000007 R_X86_64_JUMP_SLO 0000000000000000 _ZNSs6appendERKSs + 0
    0000006030e8  001c00000007 R_X86_64_JUMP_SLO 0000000000000000 open + 0
    0000006030f0  001d00000007 R_X86_64_JUMP_SLO 0000000000000000 time + 0

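Finally, the third instruction of syscall@plt jumps to the first entry of the PLT, which pushes GOT[1] onto the stack and then jumps to the function address stored in GOT[2]. At this point the flow of execution has passed from the Procedure Linkage Table (PLT) to the Global Offset Table (GOT).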

 

As for the Global Offset Table (GOT), the relevant definition is as follows:

    /* The GOT entries for functions in the PLT have not yet been filled
     * in.  Their initial contents will arrange when called to push an
     * offset into the .rel.plt section, push _GLOBAL_OFFSET_TABLE_[1],
     * and then jump to _GLOBAL_OFFSET_TABLE_[2].
     *
     * got = (Elf64_Addr *) D_PTR (l, l_info[DT_PLTGOT]);
     *
     * If a library is prelinked but we have to relocate anyway,
     * we have to be able to undo the prelinking of .got.plt.
     * The prelinker saved us here address of .plt + 0x16.
     *
     * The got[2] entry contains the address of a function which gets
     * called to get the address of a so far unresolved function and
     * jump to it.  The profiling extension of the dynamic linker allows
     * to intercept the calls to collect information.  In this case we
     * don't store the address in the GOT so that all future calls also
     * end in this function.  */
    Elf64_Addr *got;

That is, every GOT entry is a 64-bit Elf64_Addr address, but the first three entries are reserved for the addresses of special data structures:

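  • GOT[0] holds the load address of the ".dynamic" section.
  • GOT[1] holds the address of the struct link_map descriptor at the head of the list of shared libraries the ELF file depends on.
  • GOT[2] holds the address of the _dl_runtime_resolve function, i.e. GOT[2] = &_dl_runtime_resolve. This function walks the shared-library list pointed to by GOT[1] until it finds the address of the requested symbol, then stores that address in the corresponding GOT entry.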

The details of shared-library symbol resolution, as traced in GDB, are as follows:


The disassembly continues from the figure above (that part contains an arrow marker, so it had to be shown as a screenshot):

       0x00007ffff7df0263 <+83>:  nopw   0x30(%rsp)
       0x00007ffff7df0269 <+89>:  nopw   0x20(%rsp)
       0x00007ffff7df026f <+95>:  nopw   0x10(%rsp)
       0x00007ffff7df0275 <+101>: nopw   (%rsp)
       0x00007ffff7df027a <+106>: mov    0x70(%rsp),%r9
       0x00007ffff7df027f <+111>: mov    0x68(%rsp),%r8
       0x00007ffff7df0284 <+116>: mov    0x60(%rsp),%rdi
       0x00007ffff7df0289 <+121>: mov    0x58(%rsp),%rsi
       0x00007ffff7df028e <+126>: mov    0x50(%rsp),%rdx
       0x00007ffff7df0293 <+131>: mov    0x48(%rsp),%rcx
       0x00007ffff7df0298 <+136>: mov    0x40(%rsp),%rax
       0x00007ffff7df029d <+141>: add    $0x88,%rsp
       0x00007ffff7df02a4 <+148>: jmpq   *%r11
    End of assembler dump.
    (gdb) ni
    Breakpoint 4, 0x0000000000401898 in main () at syscall_test.cpp:58
    58    int x = syscall(324,20);
    3: $rip = (void (*)(void)) 0x401898 <main()+28>
    2: $eip = void
    1: $rax = 0
    (gdb) disass main
    Dump of assembler code for function main():
       0x000000000040187c <+0>:  push   %rbp
       0x000000000040187d <+1>:  mov    %rsp,%rbp
       0x0000000000401880 <+4>:  sub    $0x10,%rsp
       0x0000000000401884 <+8>:  callq  0x400fda <init_sandbox()>
       0x0000000000401889 <+13>: mov    $0x14,%esi
       0x000000000040188e <+18>: mov    $0x190,%edi
       0x0000000000401893 <+23>: mov    $0x0,%eax
    => 0x0000000000401898 <+28>: callq  0x400e00 <syscall@plt>
       0x000000000040189d <+33>: mov    %eax,-0x8(%rbp)
    (gdb) info registers
    rax            0x0      0
    ...
    rip            0x401898 0x401898 <main()+28>
    eflags         0x202    [ IF ]
    ...
    (gdb) si
    0x0000000000400e00 in syscall@plt ()
    (gdb) disass $rip
    Dump of assembler code for function syscall@plt:
    => 0x0000000000400e00 <+0>:  jmpq  *0x20228a(%rip)        # 0x603090 <syscall@got.plt>
       0x0000000000400e06 <+6>:  pushq $0xf
       0x0000000000400e0b <+11>: jmpq  0x400d00
    End of assembler dump.
    (gdb) bt
    #0  0x00007ffff72ea9a1 in syscall () from /lib64/libc.so.6
    #1  0x000000000040189d in main () at syscall_test.cpp:58
    (gdb) disass $rip
    Dump of assembler code for function syscall:
       0x00007ffff72ea980 <+0>:  mov   %rdi,%rax
       0x00007ffff72ea983 <+3>:  mov   %rsi,%rdi
       0x00007ffff72ea986 <+6>:  mov   %rdx,%rsi
       0x00007ffff72ea989 <+9>:  mov   %rcx,%rdx
       0x00007ffff72ea98c <+12>: mov   %r8,%r10
       0x00007ffff72ea98f <+15>: mov   %r9,%r8
       0x00007ffff72ea992 <+18>: mov   0x8(%rsp),%r9
       0x00007ffff72ea997 <+23>: syscall
       0x00007ffff72ea999 <+25>: cmp   $0xfffffffffffff001,%rax
       0x00007ffff72ea99f <+31>: jae   0x7ffff72ea9a2 <syscall+34>
    => 0x00007ffff72ea9a1 <+33>: retq
       0x00007ffff72ea9a2 <+34>: mov   0x2c94bf(%rip),%rcx        # 0x7ffff75b3e68
       0x00007ffff72ea9a9 <+41>: neg   %eax
       0x00007ffff72ea9ab <+43>: mov   %eax,%fs:(%rcx)
       0x00007ffff72ea9ae <+46>: or    $0xffffffffffffffff,%rax
       0x00007ffff72ea9b2 <+50>: retq
    End of assembler dump.
    (gdb) info registers
    rax            0xfffffffd       4294967293
    ...
    rip            0x7ffff72ea9a1   0x7ffff72ea9a1 <syscall+33>
    eflags         0x207    [ CF PF IF ]
    ...

This makes clear how lazy binding works when a Linux program invokes a system call through the wrapper provided by Glibc.
