UPX elf to mem

           Decompressing ELF Directly to Memory on Linux/x86

        Copyright (C) 2000-2015 John F. Reiser jreiser@BitWagon.com


References:

  <elf.h>   definitions for the ELF file format

  /usr/src/linux/fs/binfmt_elf.c   what Linux execve() does with ELF

  objdump --private-headers a.elf  dump the Elf32_Phdr

  http://www.cygnus.com/pubs/gnupro/5_ut/b_Usingld/ldLinker_scripts.html

     how to construct unusual ELF using /bin/ld


There is exactly one immovable object:  In all of the Linux kernel,

only the execve() system call sets the initial value of "the brk(0)",

the value that is manipulated by system call 45 (__NR_brk in

/usr/include/asm/unistd.h).  For "direct to memory" decompression,

there will be no execve() except for the execve() of the decompressor

program itself.  So, the decompressor program (which contains the

compressed version of the original executable) must have the same

brk() as the original executable.  So, the second PT_LOAD

ELF "segment" of the compressed program is used only to set the brk(0).

See src/p_lx_elf.cpp, function PackLinuxElf32::generateElfHdr.

All of the decompressor's code, and all of the compressed image

of the original executable, reside in the first PT_LOAD of the

decompressor program.
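
The brk constraint above can be checked with a few lines of C.  This is
only a sketch: initial_brk() is a hypothetical helper, not part of UPX,
and it assumes that execve() places the initial brk at the page-rounded
end of the highest PT_LOAD (p_vaddr + p_memsz).

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from UPX): estimate the initial brk that
 * execve() would set for a program with these program headers.
 * Assumption: brk starts at the page-rounded end of the highest PT_LOAD.
 * page_size must be a power of two. */
static uint32_t initial_brk(const Elf32_Phdr *phdr, size_t n,
                            uint32_t page_size)
{
    uint32_t end = 0;
    for (size_t i = 0; i < n; i++) {
        if (phdr[i].p_type != PT_LOAD)
            continue;               /* only PT_LOAD affects brk */
        uint32_t seg_end = phdr[i].p_vaddr + phdr[i].p_memsz;
        if (seg_end > end)
            end = seg_end;
    }
    return (end + page_size - 1) & ~(page_size - 1);
}
```

A compressed program whose second PT_LOAD is empty but ends where the
original data segment ended would yield the same value as the original.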


The decompressor program stub is just under 2K bytes when linked.

After linking, the decompressor code is converted to an initialized

array, and #included into the compilation of the compressor;

see src/stub/l_le_n2b.h.  To make self-contained compressed

executables even smaller, the compressor also compresses all but the

startup and decompression subroutine of the decompressor itself,

saving a few hundred bytes.  The startup code first decompresses the

rest of the decompressor, then jumps to it.  A nonstandard linker

script src/stub/l_lx_elf86.lds places both the .text and .data

of the decompressor into the same PT_LOAD at 0x00401000.  The

compressor includes the compressed bytes of the original executable

at the end of this first PT_LOAD.


At runtime, the decompressed stub lives at 0x00400000.  In order for the

decompressed stub to work properly at an address that is different

from its link-time address, the compiled code must contain no absolute

addresses.  So, the data items in l_lx_elf.c must be only parameters

and automatic (on-stack) local variables; no global data, no static data,

and no string constants.  Use "size l_le_n2b.o l_6e_n2b.o" to check

that both data and bss have length zero.  Also, the '&' operator

may not be used to take the address of a function.
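
As an illustration of the "no string constants" rule, the sketch below
converts a value to hexadecimal without the usual lookup string
"0123456789abcdef".  The function names are hypothetical and the code is
not taken from the UPX stub; one would still confirm with "size" that
the object file has empty data and bss.

```c
/* Map the low 4 bits of v to a hex digit using arithmetic only,
 * so that no string constant (and no absolute address) is needed. */
static char hex_digit(unsigned v)
{
    v &= 0xf;
    return (char)(v < 10 ? '0' + v : 'a' + (v - 10));
}

/* Write the 8 hex digits of the low 32 bits of x into out[0..7].
 * The buffer is supplied by the caller, typically an automatic
 * (on-stack) array; no terminating '\0' is written. */
static void hex32(unsigned long x, char *out)
{
    for (int i = 7; i >= 0; i--) {
        out[i] = hex_digit((unsigned)(x & 0xf));
        x >>= 4;
    }
}
```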


The address  0x00400000 was chosen to be out of the way of the usual

load address 0x08048000, and to minimize fragmentation in kernel

page tables; one page of page tables covers 4 MiB. The address

0x00401000 was chosen as 1 page up from a 64 KiB boundary, to

make the startup code and its constants smaller.
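
The arithmetic behind these two addresses can be written out as below.
The constants assume classic two-level i386 paging (4 KiB pages and
4-byte page-table entries, no PAE); the names are chosen for this
sketch only.

```c
#include <stdint.h>

enum {
    PAGE_SIZE_X86 = 4096,   /* assumed: 4 KiB pages on i386 */
    PTES_PER_PAGE = 1024    /* assumed: 4096 bytes / 4-byte entries */
};

/* Bytes of address space described by one page of page-table entries. */
static uint32_t span_of_one_page_table(void)
{
    return (uint32_t)PAGE_SIZE_X86 * PTES_PER_PAGE;    /* 4 MiB */
}

/* Nonzero when addr lies exactly one page above a 64 KiB boundary. */
static int one_page_above_64k(uint32_t addr)
{
    return (addr & 0xffffu) == PAGE_SIZE_X86;
}
```

Under those assumptions 0x00400000 is the start of the second 4 MiB
region, and 0x00401000 satisfies one_page_above_64k() while the usual
load address 0x08048000 does not.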


Decompression of the executable begins by decompressing the Elf32_Ehdr

and Elf32_Phdr, and then uses the Ehdr and Phdrs to control decompression

of the PT_LOAD segments.

Subroutine do_xmap() of src/stub/l_lx_elf.c performs the

"virtual execve()" using the compressed data as source, and stores

the decompressed bytes directly into the appropriate virtual addresses.
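
Before bytes can be stored at a segment's virtual address, the pages
that cover it must be mapped.  The sketch below computes that
page-aligned range for one PT_LOAD.  It is not the code of do_xmap();
struct map_range and load_range() are hypothetical names.

```c
#include <elf.h>
#include <stdint.h>

struct map_range {      /* hypothetical type for this sketch */
    uint32_t addr;      /* page-aligned start of the mapping */
    uint32_t len;       /* page-rounded length in bytes */
    uint32_t frag;      /* offset of p_vaddr within its first page */
};

/* Compute the pages that must be mapped so that p_memsz bytes can be
 * stored starting at p_vaddr.  page_size must be a power of two. */
static struct map_range load_range(const Elf32_Phdr *ph,
                                   uint32_t page_size)
{
    uint32_t mask = page_size - 1;
    struct map_range r;
    r.frag = ph->p_vaddr & mask;
    r.addr = ph->p_vaddr - r.frag;
    r.len  = (ph->p_memsz + r.frag + mask) & ~mask;
    return r;
}
```

A stub could then request [addr, addr+len) from mmap() with MAP_FIXED
and have the decompressor write its output starting at addr + frag.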


Before transferring control to the PT_INTERP "program interpreter",

minor tricks are required to set up the Elf32_auxv_t entries,

clear the free portion of the stack (to compensate for ld-linux.so.2

assuming that its automatic stack variables are initialized to zero),

and remove (all but 4 bytes of) the decompression program (and

compressed executable) from the address space.
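
Setting up the Elf32_auxv_t entries amounts to walking the vector until
AT_NULL and overwriting selected values.  The sketch below shows one
way to do that; auxv_set() is a hypothetical helper, and which entries
need new values (AT_PHDR, AT_PHNUM and AT_ENTRY are likely candidates,
since the kernel filled them in for the compressed program) is an
assumption here.

```c
#include <elf.h>
#include <stdint.h>

/* Hypothetical helper: overwrite the value of the first entry of the
 * given type in an auxiliary vector that is terminated by AT_NULL.
 * Returns 1 if an entry was updated, 0 if the type was not present. */
static int auxv_set(Elf32_auxv_t *av, uint32_t type, uint32_t value)
{
    for (; av->a_type != AT_NULL; av++) {
        if (av->a_type == type) {
            av->a_un.a_val = value;
            return 1;
        }
    }
    return 0;
}
```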


As of upx-3.05, by default on Linux, after decompression one page

of the compressed executable remains mapped into the address space

of the process.  If all of the pages of the compressed executable are

unmapped, then the Linux kernel erases the symlink /proc/self/exe,

and this can cause trouble for the runtime shared library loader

expanding $ORIGIN in -rpath, or for application code that relies on

/proc/self/exe.  Use the compress-time command-line option

--unmap-all-pages to unmap all of those pages at run time.  Upx-3.04

and previous versions did this by default with no option.  However,

too much other software erroneously assumes that /proc/self/exe

always exists.
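
Application code that relies on /proc/self/exe typically reads it with
readlink().  The sketch below is a generic example of such code, not
part of UPX; self_exe_path() is a hypothetical name.  When the link has
been erased as described above, callers should expect a failure or a
path that no longer names a usable file.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <unistd.h>

/* Read the target of /proc/self/exe into buf and terminate it with
 * '\0'.  Returns the length of the path, or -1 on failure.  A result
 * of size-1 may mean that the path was truncated. */
static ssize_t self_exe_path(char *buf, size_t size)
{
    if (size == 0)
        return -1;
    ssize_t n = readlink("/proc/self/exe", buf, size - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';      /* readlink() does not terminate the string */
    return n;
}
```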


On arm*-linux-elf there is no good address at which to retain one

page of the compressed executable.  Pages below the usual .p_vaddr

0x8000 (32KiB) are rejected by the kernel.  Using a page above the

original uncompressed brk(0) would require placing the entire initial

compressed program above uncompressed brk(0), which would significantly

increase the running brk(0); but too many programs break if brk(0)

moves.  Thus on arm*-linux-elf the compressed executable begins

with 0x8000==.p_vaddr, all pages mapped by execve() that are also

occupied by decompressed bytes are removed before overwriting, and

/proc/self/exe becomes a "(deleted)" symlink.  It might be possible

to preserve /proc/self/exe if the original uncompressed executable

were created with 0x9000==.p_vaddr (one page higher than the usual

0x8000) so that the compressed page mapped at 0x8000 would linger.

[This has not been tested.]


Linux stores the pathname argument that was specified to execve()

immediately after the '\0' which terminates the character string of the

last environment variable [as of execve()].  This is true for at least

all Linux 2.6, 2.4, and 2.2 kernels.  Linux kernel 2.6.29 and later

records a pointer to that character string in Elf32_auxv[AT_EXECFN].

The pathname is not "bound" to the file as strongly as /proc/self/exe

(the file may be changed without affecting the pathname), but the

pathname does provide some information.  The pathname may be relative

to the working directory, so look before any chdir().
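
On kernels that provide AT_EXECFN, the pathname can be fetched without
scanning past the environment strings.  The sketch below assumes a C
library that offers getauxval() in <sys/auxv.h> (glibc 2.16 or later,
for example); execfn() is a hypothetical wrapper.

```c
#include <elf.h>        /* AT_EXECFN */
#include <sys/auxv.h>   /* getauxval(); assumed to be available */

/* Return the pathname that was passed to execve(), as recorded by the
 * kernel in the auxiliary vector, or a null pointer if there is no
 * AT_EXECFN entry.  The string may be relative to the working
 * directory that was current at the time of execve(). */
static const char *execfn(void)
{
    unsigned long v = getauxval(AT_EXECFN);
    return v ? (const char *)v : 0;
}
```

Because the result may be a relative path, it is best captured or
resolved early, before the program changes directory.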


The Elf formats for Linux add an environment variable named "   " [three

spaces] which saves the results of readlink("/proc/self/exe",,) before

the runtime stub unmaps all its pages.  As of 2006-10-03 this works

for linux/elf386 and linux/ElfAMD.
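
A program that was compressed this way could recover the saved path
with an ordinary environment lookup, as sketched below.
saved_exe_path() is a hypothetical name; the only detail taken from the
text above is that the variable's name consists of exactly three spaces.

```c
#include <stdlib.h>

/* Return the value of the environment variable whose name is exactly
 * three spaces, or a null pointer if it is not set (for example when
 * the program was not started through the runtime stub). */
static const char *saved_exe_path(void)
{
    return getenv("   ");
}
```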
