Load-time relocation of shared libraries



This article explains how a modern operating system makes it possible to use shared libraries with load-time relocation. It focuses on Linux running on 32-bit x86, but the general principles apply to other OSes and CPUs as well.

Note that shared libraries go by many names: shared libraries, shared objects, dynamic shared objects (DSOs), and dynamically linked libraries (DLLs, if you’re coming from a Windows background). For the sake of consistency, I will use the name "shared library" throughout this article.

Loading executables

Linux, like other OSes with virtual memory support, loads executables at a fixed memory address (traditionally 0x08048000 for 32-bit x86 Linux executables and 0x00400000 for 32-bit Windows ones). If we examine the ELF header of some random executable, we’ll see an Entry point address:

    $ readelf -h /usr/bin/uptime
    ELF Header:
      Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
      Class:                             ELF32
      [...] some header fields
      Entry point address:               0x8048470
      [...] some header fields

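The linker puts this address there to tell the OS where to start executing the executable’s code [1]. Indeed, if we load the executable with GDB and examine address 0x8048470, we’ll see the first instructions of the executable’s .text section there.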

 

What this means is that the linker, when linking the executable, can fully resolve all internal symbol references (to functions and data) to fixed and final locations. The linker does some relocations of its own [2], but eventually the output it produces contains no additional relocations.

Or does it? Note that I emphasized the word internal in the previous paragraph. As long as the executable needs no shared libraries [3], it needs no relocations. But if it does use shared libraries (as do the vast majority of Linux applications), symbols taken from these shared libraries need to be relocated, because of how shared libraries are loaded.

Loading shared libraries

Unlike executables, when shared libraries are being built, the linker can’t assume a known load address for their code. The reason for this is simple. Each program can use any number of shared libraries, and there’s simply no way to know in advance where any given shared library will be loaded in the process’s virtual memory. Many solutions were invented for this problem over the years, but in this article I will just focus on the ones currently used by Linux.

But first, let’s briefly examine the problem. Here’s some sample C code [4] which I compile into a shared library:

    int myglob = 42;

    int ml_func(int a, int b)
    {
        myglob += a;
        return b + myglob;
    }

Note how ml_func references myglob a few times. When translated to x86 assembly, this will involve a mov instruction to pull the value of myglob from its location in memory into a register. This form of mov encodes an absolute address, so how does the linker know which address to place in it? The answer is: it doesn’t. As I mentioned above, shared libraries have no pre-defined load address; it will be decided at runtime.

In Linux, the dynamic loader [5] is a piece of code responsible for preparing programs for running. One of its tasks is to load shared libraries from disk into memory when the running executable requests them. When a shared library is loaded into memory, it is then adjusted for its newly determined load location. It is the job of the dynamic loader to solve the problem presented in the previous paragraph.

There are two main approaches to solve this problem in Linux ELF shared libraries:

  1. Load-time relocation
  2. Position independent code (PIC)

Although PIC is the more common and nowadays-recommended solution, in this article I will focus on load-time relocation. Eventually I plan to cover both approaches and write a separate article on PIC, and I think starting with load-time relocation will make PIC easier to explain later. (Update 03.11.2011: the article about PIC was published)

Linking the shared library for load-time relocation

To create a shared library that has to be relocated at load-time, I’ll compile it without the -fPIC flag (which would otherwise trigger PIC generation):

    gcc -g -c ml_main.c -o ml_mainreloc.o
    gcc -shared -o libmlreloc.so ml_mainreloc.o

The first interesting thing to see is the entry point of libmlreloc.so:

    $ readelf -h libmlreloc.so
    ELF Header:
      Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
      Class:                             ELF32
      [...] some header fields
      Entry point address:               0x3b0
      [...] some header fields


For simplicity, the linker just links the shared object for address 0x0 (the .text section starting at 0x3b0), knowing that the loader will move it anyway. Keep this fact in mind; it will be useful later in the article.

Now let’s look at the disassembly of the shared library, focusing on ml_func:

    $ objdump -d -Mintel libmlreloc.so

    libmlreloc.so:     file format elf32-i386

    [...] skipping stuff

    0000046c <ml_func>:
     46c: 55                      push   ebp
     46d: 89 e5                   mov    ebp,esp
     46f: a1 00 00 00 00          mov    eax,ds:0x0
     474: 03 45 08                add    eax,DWORD PTR [ebp+0x8]
     477: a3 00 00 00 00          mov    ds:0x0,eax
     47c: a1 00 00 00 00          mov    eax,ds:0x0
     481: 03 45 0c                add    eax,DWORD PTR [ebp+0xc]
     484: 5d                      pop    ebp
     485: c3                      ret

    [...] skipping stuff


After the first two instructions, which are part of the prologue [6], we see the compiled version of myglob += a [7]. The value of myglob is taken from memory into eax, incremented by a (which is at ebp+0x8) and then placed back into memory.

But wait, does the mov really take myglob? It appears that the actual operand of mov is just 0x0 [8]. What gives? This is how relocations work. The linker places some provisional pre-defined value (0x0 in this case) into the instruction stream, and then creates a special relocation entry pointing to this place. Let’s examine the relocation entries for this shared library:

    $ readelf -r libmlreloc.so

    Relocation section '.rel.dyn' at offset 0x2fc contains 7 entries:
     Offset     Info    Type            Sym.Value  Sym. Name
    00002008  00000008 R_386_RELATIVE
    00000470  00000401 R_386_32          0000200C   myglob
    00000478  00000401 R_386_32          0000200C   myglob
    0000047d  00000401 R_386_32          0000200C   myglob
    [...] skipping stuff


The .rel.dyn section of ELF is reserved for dynamic (load-time) relocations, to be consumed by the dynamic loader. There are 3 relocation entries for myglob in the section shown above, since there are 3 references to myglob in the disassembly. Let’s decipher the first one.

It says: go to offset 0x470 in this object (shared library), and apply relocation of type R_386_32 to it for symbol myglob. If we consult the ELF spec we see that relocation type R_386_32 means: take the value at the offset specified in the entry, add the address of the symbol to it, and place it back into the offset.

 

What do we have at offset 0x470 in the object? Recall this instruction from the disassembly of ml_func:

46f:  a1 00 00 00 00          mov    eax,ds:0x0


a1 encodes the mov instruction, so its operand starts at the next address, which is 0x470. This is the 0x0 we see in the disassembly. So back to the relocation entry, we now see it says: add the address of myglob to the operand of that mov instruction. In other words it tells the dynamic loader: once you perform actual address assignment, put the real address of myglob into 0x470, thus replacing the operand of mov by the correct symbol value. Neat, huh?

 

Note also the "Sym. value" column in the relocation section, which contains 0x200C for myglob. This is the offset of myglob in the virtual memory image of the shared library (which, recall, the linker assumes is just loaded at 0x0). This value can also be examined by looking at the symbol table of the library, for example with nm:

    $ nm libmlreloc.so
    [...] skipping stuff
    0000200c D myglob


This output also provides the offset of myglob inside the library. D means the symbol is in the initialized data section (.data).

Load-time relocation in action

To see the load-time relocation in action, I will use our shared library from a simple driver executable. When running this executable, the OS will load the shared library and relocate it appropriately.

Curiously, due to the address space layout randomization feature which is enabled in Linux, relocation is relatively difficult to follow, because every time I run the executable, the libmlreloc.so shared library gets placed at a different virtual memory address [9].

This is a rather weak deterrent, however. There is a way to make sense of it all. But first, let’s talk about the segments our shared library consists of, as reported by readelf:

    Elf file type is DYN (Shared object file)
    Entry point 0x3b0
    There are 6 program headers, starting at offset 52

    Program Headers:
      Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
      LOAD           0x000000 0x00000000 0x00000000 0x004e8 0x004e8 R E 0x1000
      LOAD           0x000f04 0x00001f04 0x00001f04 0x0010c 0x00114 RW  0x1000
      DYNAMIC        0x000f18 0x00001f18 0x00001f18 0x000d0 0x000d0 RW  0x4
      NOTE           0x0000f4 0x000000f4 0x000000f4 0x00024 0x00024 R   0x4
      GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4
      GNU_RELRO      0x000f04 0x00001f04 0x00001f04 0x000fc 0x000fc R   0x1

     Section to Segment mapping:
      Segment Sections...
       00     .note.gnu.build-id .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .eh_frame
       01     .ctors .dtors .jcr .dynamic .got .got.plt .data .bss
       02     .dynamic
       03     .note.gnu.build-id
       04
       05     .ctors .dtors .jcr .dynamic .got


To follow the myglob symbol, we’re interested in the second segment listed here. Note a couple of things:

  • In the section to segment mapping at the bottom, segment 01 is said to contain the .data section, which is the home of myglob.
  • The VirtAddr column specifies that the second segment starts at 0x1f04 and has size 0x10c, meaning that it extends until 0x2010 and thus contains myglob, which is at 0x200C.

 

Now let’s use a nice tool Linux gives us to examine the load-time linking process: the dl_iterate_phdr function, which allows an application to inquire at runtime which shared libraries it has loaded and, more importantly, take a peek at their program headers.

So I’m going to write the following code into driver.c:

    #define _GNU_SOURCE
    #include <link.h>
    #include <stdlib.h>
    #include <stdio.h>


    static int header_handler(struct dl_phdr_info* info, size_t size, void* data)
    {
        printf("name=%s (%d segments) address=%p\n",
                info->dlpi_name, info->dlpi_phnum, (void*)info->dlpi_addr);
        for (int j = 0; j < info->dlpi_phnum; j++) {
             printf("\t\t header %2d: address=%10p\n", j,
                 (void*) (info->dlpi_addr + info->dlpi_phdr[j].p_vaddr));
             printf("\t\t\t type=%u, flags=0x%X\n",
                     info->dlpi_phdr[j].p_type, info->dlpi_phdr[j].p_flags);
        }
        printf("\n");
        return 0;
    }


    extern int ml_func(int, int);


    int main(int argc, const char* argv[])
    {
        dl_iterate_phdr(header_handler, NULL);

        int t = ml_func(argc, argc);
        return t;
    }


header_handler implements the callback for dl_iterate_phdr. It will get called for all libraries and report their names and load addresses, along with all their segments. main also invokes ml_func, which is taken from the libmlreloc.so shared library.

To compile and link this driver with our shared library, run:

    gcc -g -c driver.c -o driver.o
    gcc -o driver driver.o -L. -lmlreloc


Running the driver stand-alone we get the information, but for each run the addresses are different. So what I’m going to do is run it under gdb [10], see what it says, and then use gdb to further query the process’s memory space:

    $ gdb -q driver
     Reading symbols from driver...done.
     (gdb) b driver.c:31
     Breakpoint 1 at 0x804869e: file driver.c, line 31.
     (gdb) r
     Starting program: driver
    [...] skipping output
     name=./libmlreloc.so (6 segments) address=0x12e000
                    header  0: address=  0x12e000
                            type=1, flags=0x5
                    header  1: address=  0x12ff04
                            type=1, flags=0x6
                    header  2: address=  0x12ff18
                            type=2, flags=0x6
                    header  3: address=  0x12e0f4
                            type=4, flags=0x4
                    header  4: address=  0x12e000
                            type=1685382481, flags=0x6
                    header  5: address=  0x12ff04
                            type=1685382482, flags=0x4
    [...] skipping output
     Breakpoint 1, main (argc=1, argv=0xbffff3d4) at driver.c:31
     31    }
     (gdb)


Since driver reports all the libraries it loads (even implicitly, like libc or the dynamic loader itself), the output is lengthy and I will just focus on the report about libmlreloc.so. Note that the 6 segments are the same segments reported by readelf, but this time relocated into their final memory locations.

Let’s do some math. The output says libmlreloc.so was placed at virtual address 0x12e000. We’re interested in the second segment, which as we’ve seen in readelf is at offset 0x1f04. Indeed, we see in the output it was loaded to address 0x12ff04. And since myglob is at offset 0x200c in the library’s virtual memory image, we’d expect it to now be at address 0x12e000 + 0x200c = 0x13000c.

So, let’s ask GDB:

    (gdb) p &myglob
    $1 = (int *) 0x13000c


Excellent! But what about the code of ml_func which refers to myglob? Let’s ask GDB again:

    (gdb) set disassembly-flavor intel
    (gdb) disas ml_func
    Dump of assembler code for function ml_func:
       0x0012e46c <+0>:   push   ebp
       0x0012e46d <+1>:   mov    ebp,esp
       0x0012e46f <+3>:   mov    eax,ds:0x13000c
       0x0012e474 <+8>:   add    eax,DWORD PTR [ebp+0x8]
       0x0012e477 <+11>:  mov    ds:0x13000c,eax
       0x0012e47c <+16>:  mov    eax,ds:0x13000c
       0x0012e481 <+21>:  add    eax,DWORD PTR [ebp+0xc]
       0x0012e484 <+24>:  pop    ebp
       0x0012e485 <+25>:  ret
    End of assembler dump.

As expected, the real address of myglob was placed in all the mov instructions referring to it, just as the relocation entries specified.

Relocating function calls

So far this article demonstrated relocation of data references, using the global variable myglob as an example. Another thing that needs to be relocated is code references, in other words, function calls. This section is a brief guide on how this gets done. The pace is much faster than in the rest of this article, since I can now assume the reader understands what relocation is all about.

Without further ado, let’s get to it. I’ve modified the code of the shared library to be the following:

    int myglob = 42;

    int ml_util_func(int a)
    {
        return a + 1;
    }

    int ml_func(int a, int b)
    {
        int c = b + ml_util_func(a);
        myglob += c;
        return b + myglob;
    }

ml_util_func was added and it’s being used by ml_func. Here’s the disassembly of ml_func in the linked shared library:

    000004a7 <ml_func>:
     4a7:   55                      push   ebp
     4a8:   89 e5                   mov    ebp,esp
     4aa:   83 ec 14                sub    esp,0x14
     4ad:   8b 45 08                mov    eax,DWORD PTR [ebp+0x8]
     4b0:   89 04 24                mov    DWORD PTR [esp],eax
     4b3:   e8 fc ff ff ff          call   4b4 <ml_func+0xd>
     4b8:   03 45 0c                add    eax,DWORD PTR [ebp+0xc]
     4bb:   89 45 fc                mov    DWORD PTR [ebp-0x4],eax
     4be:   a1 00 00 00 00          mov    eax,ds:0x0
     4c3:   03 45 fc                add    eax,DWORD PTR [ebp-0x4]
     4c6:   a3 00 00 00 00          mov    ds:0x0,eax
     4cb:   a1 00 00 00 00          mov    eax,ds:0x0
     4d0:   03 45 0c                add    eax,DWORD PTR [ebp+0xc]
     4d3:   c9                      leave
     4d4:   c3                      ret

What’s interesting here is the instruction at address 0x4b3: it’s the call to ml_util_func. Let’s dissect it:

e8 is the opcode for call. The argument of this call is the offset relative to the next instruction. In the disassembly above, this argument is 0xfffffffc, or simply -4. Since the next instruction is at 0x4b8, the call currently targets 0x4b4, which is inside the call instruction itself. This clearly isn’t right, but let’s not forget about relocation. Here’s what the relocation section of the shared library looks like now:

    $ readelf -r libmlreloc.so

    Relocation section '.rel.dyn' at offset 0x324 contains 8 entries:
     Offset     Info    Type            Sym.Value  Sym. Name
    00002008  00000008 R_386_RELATIVE
    000004b4  00000502 R_386_PC32        0000049c   ml_util_func
    000004bf  00000401 R_386_32          0000200c   myglob
    000004c7  00000401 R_386_32          0000200c   myglob
    000004cc  00000401 R_386_32          0000200c   myglob
    [...] skipping stuff

If we compare it to the previous invocation of readelf -r, we’ll notice a new entry added for ml_util_func. This entry points at address 0x4b4, which is the argument of the call instruction, and its type is R_386_PC32. This relocation type is more complicated than R_386_32, but not by much.

It means the following: take the value at the offset specified in the entry, add the address of the symbol to it, subtract the address of the offset itself, and place it back into the word at the offset. Recall that this relocation is done at load-time, when the final load addresses of the symbol and the relocated offset itself are already known. These final addresses participate in the computation.

What does this do? Basically, it’s a relative relocation, taking its location into account and thus suitable for arguments of instructions with relative addressing (which the e8 call is). I promise it will become clearer once we get to the real numbers.

 

I’m now going to build the driver code and run it under GDB again, to see this relocation in action. Here’s the GDB session, followed by explanations:

    $ gdb -q driver
     Reading symbols from driver...done.
     (gdb) b driver.c:31
     Breakpoint 1 at 0x804869e: file driver.c, line 31.
     (gdb) r
     Starting program: driver
    [...] skipping output
     name=./libmlreloc.so (6 segments) address=0x12e000
                   header  0: address=  0x12e000
                           type=1, flags=0x5
                   header  1: address=  0x12ff04
                           type=1, flags=0x6
                   header  2: address=  0x12ff18
                           type=2, flags=0x6
                   header  3: address=  0x12e0f4
                           type=4, flags=0x4
                   header  4: address=  0x12e000
                           type=1685382481, flags=0x6
                   header  5: address=  0x12ff04
                           type=1685382482, flags=0x4
    [...] skipping output
    Breakpoint 1, main (argc=1, argv=0xbffff3d4) at driver.c:31
    31    }
    (gdb)  set disassembly-flavor intel
    (gdb) disas ml_util_func
    Dump of assembler code for function ml_util_func:
       0x0012e49c <+0>:   push   ebp
       0x0012e49d <+1>:   mov    ebp,esp
       0x0012e49f <+3>:   mov    eax,DWORD PTR [ebp+0x8]
       0x0012e4a2 <+6>:   add    eax,0x1
       0x0012e4a5 <+9>:   pop    ebp
       0x0012e4a6 <+10>:  ret
    End of assembler dump.
    (gdb) disas /r ml_func
    Dump of assembler code for function ml_func:
       0x0012e4a7 <+0>:    55     push   ebp
       0x0012e4a8 <+1>:    89 e5  mov    ebp,esp
       0x0012e4aa <+3>:    83 ec 14       sub    esp,0x14
       0x0012e4ad <+6>:    8b 45 08       mov    eax,DWORD PTR [ebp+0x8]
       0x0012e4b0 <+9>:    89 04 24       mov    DWORD PTR [esp],eax
       0x0012e4b3 <+12>:   e8 e4 ff ff ff call   0x12e49c <ml_util_func>
       0x0012e4b8 <+17>:   03 45 0c       add    eax,DWORD PTR [ebp+0xc]
       0x0012e4bb <+20>:   89 45 fc       mov    DWORD PTR [ebp-0x4],eax
       0x0012e4be <+23>:   a1 0c 00 13 00 mov    eax,ds:0x13000c
       0x0012e4c3 <+28>:   03 45 fc       add    eax,DWORD PTR [ebp-0x4]
       0x0012e4c6 <+31>:   a3 0c 00 13 00 mov    ds:0x13000c,eax
       0x0012e4cb <+36>:   a1 0c 00 13 00 mov    eax,ds:0x13000c
       0x0012e4d0 <+41>:   03 45 0c       add    eax,DWORD PTR [ebp+0xc]
       0x0012e4d3 <+44>:   c9     leave
       0x0012e4d4 <+45>:   c3     ret
    End of assembler dump.
    (gdb)

The important parts here are:

  1. In the printout from driver we see that the first segment (the code segment) of libmlreloc.so has been mapped to 0x12e000 [11]
  2. ml_util_func was loaded to address 0x0012e49c
  3. The address of the relocated offset is 0x0012e4b4
  4. The call in ml_func to ml_util_func was patched to place 0xffffffe4 in the argument (I disassembled ml_func with the /r flag to show raw hex in addition to disassembly), which is interpreted as the correct offset to ml_util_func.

Obviously we’re most interested in how (4) was done. Again, it’s time for some math. Interpreting the R_386_PC32 relocation entry mentioned above, we have:

Take the value at the offset specified in the entry (0xfffffffc), add the address of the symbol to it (0x0012e49c), subtract the address of the offset itself (0x0012e4b4), and place it back into the word at the offset. Everything is done assuming 32-bit two’s complement, of course. The result is 0xffffffe4, as expected.

 

 

Extra credit: Why was the call relocation needed?

This is a "bonus" section that discusses some peculiarities of the implementation of shared library loading in Linux. If all you wanted was to understand how relocations are done, you can safely skip it.

When trying to understand the call relocation of ml_util_func, I must admit I scratched my head for some time. Recall that the argument of call is a relative offset. Surely the offset between the call and ml_util_func itself doesn’t change when the library is loaded, since they both are in the code segment, which gets moved as one whole chunk. So why is the relocation needed at all?

Here’s a small experiment to try: go back to the code of the shared library, add static to the declaration of ml_util_func. Re-compile and look at the output of readelf -r again.

Done? Anyway, I will reveal the outcome: the relocation is gone! Examine the disassembly of ml_func and you’ll see there’s now a correct offset placed as the argument of call, with no relocation required. What’s going on?

 

When tying global symbol references to their actual definitions, the dynamic loader has some rules about the order in which shared libraries are searched. The user can also influence this order by setting the LD_PRELOAD environment variable.

There are too many details to cover here, so if you’re really interested you’ll have to take a look at the ELF standard, the dynamic loader man page and do some Googling. In short, however, when ml_util_func is global, it may be overridden in the executable or another shared library, so when linking our shared library, the linker can’t just assume the offset is known and hard-code it [12]. It makes all references to global symbols relocatable in order to allow the dynamic loader to decide how to resolve them. This is why declaring the function static makes a difference: since it’s no longer global or exported, the linker can hard-code its offset in the code.

 

 

Extra credit #2: Referencing shared library data from the executable

Again, this is a bonus section that discusses an advanced topic. It can be skipped safely if you’re tired of this stuff.

In the example above, myglob was only used internally in the shared library. What happens if we reference it from the program (driver.c)? After all, myglob is a global variable and thus visible externally.

Let’s modify driver.c to the following (note I’ve removed the segment iteration code):

    #include <stdio.h>

    extern int ml_func(int, int);
    extern int myglob;

    int main(int argc, const char* argv[])
    {
        printf("addr myglob = %p\n", (void*)&myglob);
        int t = ml_func(argc, argc);
        return t;
    }

It now prints the address of myglob. The output is:

addr myglob = 0x804a018


Wait, something doesn’t compute here. Isn’t myglob in the shared library’s address space? 0x804xxxx looks like the program’s address space. What’s going on?

 

Recall that the program/executable is not relocatable, and thus its data addresses have to be bound at link time. Therefore, the linker has to create a copy of the variable in the program’s address space, and the dynamic loader will use that as the relocation address. This is similar to the discussion in the previous section: in a sense, myglob in the main program overrides the one in the shared library, and according to the global symbol lookup rules, it’s being used instead. If we examine ml_func in GDB, we’ll see the correct reference made to myglob:

0x0012e48e <+23>:      a1 18 a0 04 08 mov    eax,ds:0x804a018


This makes sense because an R_386_32 relocation for myglob still exists in libmlreloc.so, and the dynamic loader makes it point to the correct place where myglob now lives.

This is all great, but something is missing. myglob is initialized in the shared library (to 42), so how does this initialization value get to the address space of the program? It turns out there’s a special relocation entry that the linker builds into the program (so far we’ve only been examining relocation entries in the shared library):

    $ readelf -r driver

    Relocation section '.rel.dyn' at offset 0x3c0 contains 2 entries:
     Offset     Info    Type            Sym.Value  Sym. Name
    08049ff0  00000206 R_386_GLOB_DAT    00000000   __gmon_start__
    0804a018  00000605 R_386_COPY        0804a018   myglob
    [...] skipping stuff


Note the R_386_COPY relocation for myglob. It simply means: copy the value from the symbol's address into this offset. The dynamic loader performs this when it loads the shared library. How does it know how much to copy? The symbol table section contains the size of each symbol; for example, the size for myglob in the .symtab section of libmlreloc.so is 4, so four bytes are copied.
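To make the copy step concrete, here is a toy model of what the dynamic loader does for an R_386_COPY entry. It is only a sketch in Python under stated assumptions: the "address space" is a plain dictionary, apply_copy_relocation is a made-up helper, and LIB_MYGLOB is a hypothetical address for myglob inside the loaded library (the real address depends on where the library is mapped). Only the destination 0x0804a018, the size 4 and the value 42 come from the text.

```python
import struct


def apply_copy_relocation(memory, dest_addr, src_addr, size):
    """Copy `size` bytes from the library's definition into the executable's slot."""
    for i in range(size):
        memory[dest_addr + i] = memory[src_addr + i]


# Toy sparse address space: address -> byte value.
memory = {}

LIB_MYGLOB = 0x0012F000   # hypothetical: where myglob lives in the loaded library
EXE_MYGLOB = 0x0804A018   # offset of the R_386_COPY entry in the executable
MYGLOB_SIZE = 4           # symbol size taken from the library's symbol table

# The library's data segment carries the initializer 42 as a little-endian int.
for i, byte in enumerate(struct.pack("<i", 42)):
    memory[LIB_MYGLOB + i] = byte

# What the loader does when it processes the R_386_COPY entry.
apply_copy_relocation(memory, EXE_MYGLOB, LIB_MYGLOB, MYGLOB_SIZE)

copied = bytes(memory[EXE_MYGLOB + i] for i in range(MYGLOB_SIZE))
value = struct.unpack("<i", copied)[0]
print(value)  # 42
```

After the copy, the executable's own myglob holds the initial value, and every reference (from the program and from the library) is resolved to that single copy.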

 

I think this is a pretty cool example that shows how the process of executable linking and loading is orchestrated together. The linker puts special instructions in the output for the dynamic loader to consume and execute.

Conclusion

Load-time relocation is one of the methods used in Linux (and other OSes) to resolve internal data and code references in shared libraries when loading them into memory. These days, position independent code (PIC) is a more popular approach, and some modern systems (such as x86-64) no longer support load-time relocation.

 

Still, I decided to write an article on load-time relocation for two reasons. First, load-time relocation has a couple of advantages over PIC on some systems, especially in terms of performance. Second, load-time relocation is IMHO simpler to understand without prior knowledge, which will make PIC easier to explain in the future. (Update 03.11.2011: the article about PIC was published.)

 

Regardless of the motivation, I hope this article has helped to shed some light on the magic going on behind the scenes of linking and loading shared libraries in a modern OS.

 

 

[1]

For some more information about this entry point, see the section "Digression – process addresses and entry point" of this article.

[2]

Link-time relocation happens in the process of combining multiple object files into an executable (or shared library). It involves quite a lot of relocations to resolve symbol references between the object files. Link-time relocation is a more complex topic than load-time relocation, and I won't cover it in this article.

[3]

This can be made possible by compiling all your libraries into static libraries (with ar combining object files instead of gcc -shared), and providing the -static flag to gcc when linking the executable, to avoid linkage with the shared version of libc.

[4]

ml simply stands for "my library". Also, the code itself is absolutely nonsensical and only used for purposes of demonstration.

[5]

Also called "dynamic linker". It’s a shared object itself (though it can also run as an executable), residing at /lib/ld-linux.so.2 (the last number is the SO version and may be different).

[6]

If you’re not familiar with how x86 structures its stack frames, this would be a good time to read this article.

[7]

You can provide the -S flag to objdump to intermix C source lines with the disassembly (the -l flag adds file names and line numbers), making it clearer what gets compiled to what. I've omitted it here to make the output shorter.

[8]

I’m looking at the left-hand side of the output of objdump, where the raw memory bytes are. a1 00 00 00 00 means mov to eax with operand 0x0, which is interpreted by the disassembler as ds:0x0.

[9]

So ldd invoked on the executable will report a different load address for the shared library each time it’s run.

[10]

Experienced readers will probably note that I could ask GDB about i shared to get the load-address of the shared library. However, i shared only mentions the load location of the whole library (or, even more accurately, its entry point), and I was interested in the segments.

[11]

What, 0x12e000 again? Didn’t I just talk about load-address randomization? It turns out the dynamic loader can be manipulated to turn this off, for purposes of debugging. This is exactly what GDB is doing.

[12]

Unless it’s passed the -Bsymbolic flag. Read all about it in the man page of ld.

