The magic of LD_PRELOAD for Userland Rootkits


http://fluxius.handgrep.se/2011/10/31/the-magic-of-ld_preload-for-userland-rootkits/

How much can you trust the binaries you run, even if you analyzed their source before compilation? Although they have fewer privileges than kernel rootkits (explained in “Ring 0f Fire”), userland rootkits still represent a big threat for users. To see why, we will look at an interesting technique for hooking functions that programs commonly call from shared libraries.

First, we will quickly introduce shared libraries, and then explain why the LD_PRELOAD trick is needed. After that, we will see how to apply it to a rootkit, what its limits are, and how well it is detected, which holds few surprises with some anti-rootkits.

Prerequisites:

  • Basics in Linux and ELF (read the analysis part of my last article),
  • a Linux system,
  • survival skills in the C programming language,
  • your evil mind switched on (or just be cool!),
  • another default song: Ez3kiel – Via continium.

Contents:

  • Shared libraries,
  • LD_PRELOAD in the wild,
    • Make and use your own library,
    • dlsym: Yo Hook Hook And A Bottle Of Rum!,
    • Limitations,
  • Userland rootkit,
    • Jynx-Kit,
    • Detection,


Shared libraries

As we should know, when a program starts, it loads shared libraries and links them into the process. The linking is done by “ld-linux-x86-64.so.X” (or “ld-linux.so.X” on 32-bit systems) (remember “The Art Of ELF”?), as follows:

fluxiux@handgrep:~$ readelf -l /bin/ls
[...]
  INTERP         0x0000000000000248 0x0000000000400248 0x0000000000400248
                 0x000000000000001c 0x000000000000001c  R      1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
[...]

Unlike static compilation, which can take up a lot of disk space, dynamically linked binaries use shared libraries to factor common code out of the program: the linking makes each function call point to the corresponding function in the shared library. You can list the shared libraries a program needs with the command “ldd”:

fluxiux@handgrep:~$ ldd /bin/ls
    linux-vdso.so.1 =>  (0x00007fff0bb9a000)
    libselinux.so.1 => /lib/x86_64-linux-gnu/libselinux.so.1 (0x00007f7842edc000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f7842cd4000)
    libacl.so.1 => /lib/x86_64-linux-gnu/libacl.so.1 (0x00007f7842acb000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f7842737000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f7842533000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f7843121000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f7842314000)
    libattr.so.1 => /lib/x86_64-linux-gnu/libattr.so.1 (0x00007f784210f000)

Let’s try with a little code named “toto”:

#include <stdio.h>

int main(void)
{
        /* print a short message through the shared libc */
        printf("huhu la charrue");
        return 0;
}

Now compile it both dynamically and statically:

fluxiux@handgrep:~$ gcc toto.c -o toto-dyn
fluxiux@handgrep:~$ gcc -static toto.c -o toto-stat
fluxiux@handgrep:~$ ls -l | grep "toto-"
-rwxr-xr-x  1 fluxiux fluxiux     8426 2011-10-28 23:21 toto-dyn
-rwxr-xr-x  1 fluxiux fluxiux   804327 2011-10-28 23:21 toto-stat

As we can see, “toto-stat” is about 95 times larger than “toto-dyn”. Why? Because it carries its own copy of the library code and does not depend on any shared library:

fluxiux@handgrep:~$ ldd toto-stat
    is not a dynamic executable

This approach is very flexible and sophisticated because we can[1]:

  • update libraries and still support programs that want to use older, non-backward-compatible versions of those libraries,
  • override specific libraries or even specific functions in a library when executing a particular program,
  • do all this while programs are running using existing libraries.

Shared libraries follow a special naming convention, the “soname”. A “soname” has the prefix “lib”, followed by the name of the library, then “.so”, a period, and a version number that is incremented whenever the interface changes (as you can see in the previous listings).

Now, let’s talk about the LD_PRELOAD trick.

LD_PRELOAD in the wild

As you can see, libraries generally live in the “/lib” folder. So if we want to patch a library like “libc”, the first idea is to modify its sources and recompile everything into a shared library following the “soname” convention. But instead of doing this, we can use a wonderful trick that Linux offers us: LD_PRELOAD.

Use your own library

Suppose we want to change the “printf” function without recompiling the whole source. To do that, we will override this function in the “my_printf.c” code:

#define _GNU_SOURCE
#include <stdio.h>

int printf(const char *format, ...)
{
        exit(153);
}

Now we have to compile[2] this code into a shared library as follows:

fluxiux@handgrep:~$ gcc -Wall -fPIC -c -o my_printf.o my_printf.c
my_printf.c: In function ‘printf’:
my_printf.c:6:2: warning: implicit declaration of function ‘exit’
my_printf.c:6:2: warning: incompatible implicit declaration of built-in function ‘exit’
fluxiux@handgrep:~$ gcc -shared -fPIC -Wl,-soname -Wl,libmy_printf.so -o libmy_printf.so my_printf.o

To use this library, we set the environment variable “LD_PRELOAD” to the absolute path of the “libmy_printf.so” library, so that our function is executed instead of glibc’s:

fluxiux@handgrep:~$ export LD_PRELOAD=$PWD/libmy_printf.so
fluxiux@handgrep:~$ ./toto-dyn

As we can see, the string “huhu la charrue” did not show up, so we will trace library calls with “ltrace” to see what happens:

fluxiux@handgrep:~$ ltrace ./toto-dyn
__libc_start_main(0x4015f4, 1, 0x7fffa88d0908, 0x402530, 0x4025c0 <unfinished ...>
printf("huhu la charrue" <unfinished ...>
+++ exited (status 153) +++

Incredible! Thanks to the “LD_PRELOAD” environment variable, our library has been loaded first and its “printf” has been called. But if we want to alter the behavior of “printf” without changing what users see, do we have to rewrite the whole function just to modify a few lines? No! It is possible to hook a function much more easily and discreetly.

dlsym: Yo Hook Hook And A Bottle Of Rum!

The “libdl” library provides interesting functions like:

  • dlopen(): load a library,
  • dlsym(): return a pointer to a specified symbol,
  • dlclose(): unload a library.

Because the libraries have already been loaded at process launch, we only need to get a pointer to the symbol “printf” to use the original function. But how do we do that when we have overridden the function? We pass “RTLD_NEXT” as the handle, so that the lookup returns the next occurrence of the symbol in the search order after the current object, which is the original function:

[...]
        typeof(printf)*old_printf;
[...]        
        /*
                DO HERE SOMETHING VERY EVIL
        */

        old_printf = dlsym(RTLD_NEXT,"printf");
[...]

After that, we need to format the string passed as an argument and call the original function with this formatted string (“huhu la charrue”), so that it is shown as expected:

#define _GNU_SOURCE

#include <stdio.h>
#include <dlfcn.h>
#include <stdlib.h>
#include <stdarg.h>

int printf(const char *format, ...)
{
        va_list list;
        char *parg;
        typeof(printf)*old_printf;

        // format variable arguments
        va_start(list, format);
        vasprintf(&parg, format, list);
        va_end(list);

        /*
                DO HERE SOMETHING VERY EVIL
        */


        // get a pointer to the function "printf"
        old_printf = dlsym(RTLD_NEXT,"printf");
        (*old_printf)("%s", parg);// and we call the function with previous arguments

        free(parg);
}

We compile it:

fluxiux@handgrep:~$ gcc -Wall -fPIC -c -o my_printf.o my_printf.c
my_printf.c: In function ‘printf’:
my_printf.c:21:1: warning: control reaches end of non-void function
fluxiux@handgrep:~$ gcc -shared -fPIC -Wl,-soname -Wl,libmy_printf.so -ldl -o libmy_printf.so my_printf.o
fluxiux@handgrep:~$ export LD_PRELOAD=$PWD/libmy_printf.so

And execute it:

fluxiux@handgrep:~$ ./toto-dyn
huhu la charrue

Wonderful! A user has no reason to suspect that something evil is going on when executing his own program. But the LD_PRELOAD trick has some limitations.
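
Before moving on, note that the hook shown above never returns a value (hence the compiler warning) and resolves the symbol on every call. A possible variant, given here only as a sketch and not as the code used in the rest of this post, forwards the arguments to the next “vprintf” instead, keeps the resolved pointer, and returns the number of characters written. The name real_vprintf is an assumption of this example.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdarg.h>
#include <stdio.h>

/* Pointer to the next vprintf in the search order (hypothetical name). */
static int (*real_vprintf)(const char *, va_list);

int printf(const char *format, ...)
{
        va_list args;
        int written;

        /* resolve the original function once and keep the pointer */
        if (real_vprintf == NULL) {
                *(void **)(&real_vprintf) = dlsym(RTLD_NEXT, "vprintf");
                if (real_vprintf == NULL)
                        return -1;
        }

        /* any extra behaviour of the hook would go here */

        /* forward the untouched arguments and propagate the result */
        va_start(args, format);
        written = real_vprintf(format, args);
        va_end(args);
        return written;
}
```

It is compiled into a shared library and preloaded exactly like the previous version. Forwarding the “va_list” avoids the intermediate buffer allocated by “vasprintf”, and callers that check the return value of “printf” keep working.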

Limitations

This trick is very good but limited. Indeed, if you try it with the static version of “toto” (toto-stat), the kernel will just load each segment at the specified virtual address, then jump to the entry point. It means that no linking is done by the program interpreter, so the preloaded library is never loaded.

Moreover, if the SUID or SGID bit is set, the dynamic linker will not honor an arbitrary “LD_PRELOAD” path, for security reasons (too bad!).

For more information about “LD_PRELOAD”, I suggest you read the article by Etienne Dublé[3] (in French), which inspired me a lot when writing this post.

Userland rootkit

Jynx-Kit

About 2 weeks ago, a new userland rootkit[4] was introduced. This rootkit comes with an automated bash script to install it easily, and it is undetected by rkhunter and chkrootkit. To know more about it, we will analyze it.

The interesting part is in “ld_poison.c”, where fourteen functions are hooked:

[...]
    old_fxstat = dlsym(RTLD_NEXT,"__fxstat");
    old_fxstat64 = dlsym(RTLD_NEXT,"__fxstat64");
    old_lxstat = dlsym(RTLD_NEXT,"__lxstat");
    old_lxstat64 = dlsym(RTLD_NEXT,"__lxstat64");
    old_open = dlsym(RTLD_NEXT,"open");
    old_rmdir = dlsym(RTLD_NEXT,"rmdir");
    old_unlink = dlsym(RTLD_NEXT,"unlink");
    old_unlinkat = dlsym(RTLD_NEXT,"unlinkat");
    old_xstat = dlsym(RTLD_NEXT,"__xstat");
    old_xstat64 = dlsym(RTLD_NEXT,"__xstat64");
    old_fdopendir = dlsym(RTLD_NEXT,"fdopendir");
    old_opendir = dlsym(RTLD_NEXT,"opendir");
    old_readdir = dlsym(RTLD_NEXT,"readdir");
    old_readdir64 = dlsym(RTLD_NEXT,"readdir64");
[...]

Let's pick one at random and have a look at the “open” function. As you can see, a “__xstat” call is performed to get information about the file:

[...]
    struct stat s_fstat;
[...]
    old_xstat(_STAT_VER, pathname,&s_fstat);
[...]

After that, this information (such as the Group ID and the path) is compared against the values that should be hidden, including “ld.so.preload”. If one of them matches, the function pretends that the file does not exist by failing with ENOENT:

[...]
    if(s_fstat.st_gid == MAGIC_GID || (strstr(pathname, MAGIC_DIR) != NULL) || (strstr(pathname, CONFIG_FILE) != NULL)) {
        errno = ENOENT;
        return -1;
    }
[...]

Every hooked function is organized like this, so people are not supposed to notice any suspicious file or activity (like the back-connect shell). But what about detection?

Detection

Surprisingly (or not), this rootkit is undetected by rkhunter and chkrootkit. The reason is that these two anti-rootkits check for signatures, and as we should know, this is not the best approach.

Indeed, for example, just clear the “LD_PRELOAD” variable and generate a “sha1sum” of “toto-dyn”, as follows:

fluxiux@handgrep:~$ sha1sum toto-dyn
a659c72ea5d29c9a6406f88f0ad2c1a5729b4cfa  toto-dyn
fluxiux@handgrep:~$ sha1sum toto-dyn > toto-dyn.sha1

And then set the “LD_PRELOAD” variable and check if the sum is correct:

fluxiux@handgrep:~$ export LD_PRELOAD=$PWD/libmy_printf.so
fluxiux@handgrep:~$ sha1sum -c toto-dyn.sha1
toto-dyn: OK

IT… IS… CORRECT???!

Exactly! We did not modify anything in the ELF file, so the checksum should be the same, and it is. If anti-rootkits like rkhunter work like that, the detection must fail. Other techniques are based on detecting suspicious files, signatures and port bindings, as in “chkrootkit”, but they fail too, because this type of rootkit is very flexible, and Jynx uses a sort of port knocking to open the remote shell for our host.

To catch these rootkits, you could check for any suspicious library specified in “LD_PRELOAD” or “/etc/ld.so.preload”. We also know that “dlsym” can be used to call the original function while altering it, so a preloaded library that pulls in “libdl” deserves a closer look:

$ strace ./bin/ls
[...]
open("/home/fluxiux/blabla/Jynx-Kit/ld_poison.so", O_RDONLY) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240\n\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=17641, ...}) = 0
mmap(NULL, 2109656, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f5e1a586000
mprotect(0x7f5e1a589000, 2093056, PROT_NONE) = 0
mmap(0x7f5e1a788000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0x7f5e1a788000
close(3)
[...]
open("/lib/x86_64-linux-gnu/libdl.so.2", O_RDONLY) = 3
[...]

Disassembling the “ld_poison.so” file, we can see that many functions are substituted, which could hide malicious files or activities. Looking for strings in the binary, when it is not packed, can give us some interesting clues (but keep in mind that packing is itself suspicious sometimes):

fluxiux@handgrep:~/blabla/Jynx-Kit$ strings ld_poison.so
[...]
libdl.so.2
[...]
dlsym
fstat
[...]
lstat hooked.
ld.so.preload
xochi <-- sounds familiar
[...]
/proc/%s <-- hmmm... strange!
[...]

A rootkit such as Jynx-Kit proves that signature-based detection is a hopeless way to protect ourselves against technologies like rootkits. If you want to do it right, base your detection on heuristics.

To finish, there are also some interesting forensic tools that cross-check the results of several techniques (“/bin/ps” output against “/proc”, “procfs” walking and “syscall”). Indeed, Security by default has published a special analysis of Jynx-Kit[5] that made me discover Unhide[6], which checks for hidden processes and open ports (by brute-forcing all available TCP/UDP ports).

References & Acknowledgements

[1] Shared libraries – http://tldp.org/HOWTO/Program-Library-HOWTO/shared-libraries.html
[2] Static, Shared Dynamic and Loadable Linux Libraries – http://www.yolinux.com/TUTORIALS/LibraryArchives-StaticAndDynamic.html
[3] (French) Le monde merveilleux de LD_PRELOAD – Open Silicium Magazine #4
[4] Jynx-Kit LD_PRELOAD Rootkit Release – http://forum.blackhatacademy.org/viewtopic.php?id=186
[5] Análisis de Jynx (Linux Rootkit) – http://www.securitybydefault.com/2011/10/analisis-de-jynx-linux-rootkit.html
[6] Unhide – http://www.unhide-forensics.info