A Python tool for parsing the bit fields of a number

Source: Internet. Editor: 程序博客网. Time: 2024/04/29 00:16

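While hacking on the kernel I ran into the GDT. A GDT descriptor is 64 bits wide, and different bits carry different meanings. Suppose its value is 0x00cf9a000000ffff: what exactly does each bit stand for?

So I wrote a Python tool that decodes what each bit of a number means. The format description looks like this: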

gdtFmt = [
    ("0:15,48:51", "limit", "the whole limit"),
    ("16:31, 32:39, 56:63", "base address", "the whole base address"),
    ("0:15", "limit", "lower 16-bit limit"),
    ("16:31", "Base Address", "lower 16-bit base address"),
    ("32:39", "Base Address", "middle 8-bit base address"),
    ("40", "A", "accessed bit, set by the CPU (=1) when the segment is accessed"),
    ("41", "type", "for data/stack segment it can be written to (=1), for code segment it can be read from (=1)"),
    ("42", "type", "for data/stack segment it indicates expansion direction, it grows downward (=1); for code segment, whether it is conforming (=1)"),
    ("43", "type", "whether it is a code segment (=1) or a data/stack segment (=0)"),
    ("44", "type", "must be 1 for code/data segment"),
    ("45:46", "DPL", "descriptor privilege level, we are going to use 0-kernel privilege and 3-user privilege"),
    ("47", "P", "whether the segment is present"),
    ("48:51", "Limit", "upper 4-bit limit"),
    ("52", "U", "user defined"),
    ("53", "X", "not used"),
    ("54", "D", "whether to handle instructions and data as 32-bit (=1) or 16-bit (=0)"),
    ("55", "G", "whether the limit uses units of 4K (=1) or 1 byte (=0)"),
    ("56:63", "Base Address", "higher 8-bit base address")]

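Each entry has the form (bit range, name, description). For example, ("0:15,48:51", "limit", "the whole limit") says that bits 0 to 15 together with bits 48 to 51 hold the total limit of the segment this descriptor describes. Ranges are inclusive, and later ranges in an entry are the more significant ones. All entries go into one Python list. For the input 0x00cf9a000000ffff, the output is: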

number of bits parsed:64, dec:58434644969848831, hex:cf9a000000ffff, bin:0000000011001111100110100000000000000000000000001111111111111111
bitRange:(0:15,48:51), b:11111111111111111111, d:1048575, h:0xfffff, name:limit, desc:the whole limit

bitRange:(16:31, 32:39, 56:63), b:00000000000000000000000000000000, d:0, h:0x0, name:base address, desc:the whole base address

bitRange:(0:15), b:1111111111111111, d:65535, h:0xffff, name:limit, desc:lower 16-bit limit

bitRange:(16:31), b:0000000000000000, d:0, h:0x0, name:Base Address, desc:lower 16-bit base address

bitRange:(32:39), b:00000000, d:0, h:0x0, name:Base Address, desc:middle 8-bit base address

bitRange:(40), b:0, d:0, h:0x0, name:A, desc:accessed bit, set by the CPU (=1) when the segment is accessed

bitRange:(41), b:1, d:1, h:0x1, name:type, desc:for data/stack segment it can be written to (=1), for code segment it can be read from (=1)

bitRange:(42), b:0, d:0, h:0x0, name:type, desc:for data/stack segment it indicates expansion direction, it grows downward (=1); for code segment, whether it is conforming (=1)

bitRange:(43), b:1, d:1, h:0x1, name:type, desc:whether it is a code segment (=1) or a data/stack segment (=0)

bitRange:(44), b:1, d:1, h:0x1, name:type, desc:must be 1 for code/data segment

bitRange:(45:46), b:00, d:0, h:0x0, name:DPL, desc:descriptor privilege level, we are going to use 0-kernel privilege and 3-user privilege

bitRange:(47), b:1, d:1, h:0x1, name:P, desc:whether the segment is present

bitRange:(48:51), b:1111, d:15, h:0xf, name:Limit, desc:upper 4-bit limit

bitRange:(52), b:0, d:0, h:0x0, name:U, desc:user defined

bitRange:(53), b:0, d:0, h:0x0, name:X, desc:not used

bitRange:(54), b:1, d:1, h:0x1, name:D, desc:whether to handle instructions and data as 32-bit (=1) or 16-bit (=0)

bitRange:(55), b:1, d:1, h:0x1, name:G, desc:whether the limit uses units of 4K (=1) or 1 byte (=0)

bitRange:(56:63), b:00000000, d:0, h:0x0, name:Base Address, desc:higher 8-bit base address
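
The numbers in this output can be cross-checked by hand with plain shifts and masks, independently of the tool. The snippet below is only a sketch, written for Python 3; the variable names (limit, base, access, flags) are mine and do not come from the tool. It also works out what the limit means in bytes: with G=1 the 20-bit limit counts 4 KiB units.

```python
desc = 0x00cf9a000000ffff

# Limit: bits 0:15 are the low part, bits 48:51 the high part
limit = (desc & 0xffff) | (((desc >> 48) & 0xf) << 16)

# Base: bits 16:39 are the low 24 bits, bits 56:63 the high 8 bits
base = ((desc >> 16) & 0xffffff) | (((desc >> 56) & 0xff) << 24)

access = (desc >> 40) & 0xff   # bits 40:47 (A, type, DPL, P)
flags = (desc >> 52) & 0xf     # bits 52:55 (U, X, D, G)

# Bit 55 (G) is the top bit of the 4-bit flags nibble
unit = 4096 if (flags >> 3) & 1 else 1
size_bytes = (limit + 1) * unit

print(hex(limit), hex(base), hex(access), hex(flags), size_bytes)
# prints: 0xfffff 0x0 0x9a 0xc 4294967296
```

So this descriptor describes a flat segment starting at address 0 and covering the full 4 GiB.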

The Python code of the tool is as follows:

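# -*- coding: utf-8 -*-

class NumUtils:
    def __init__(self):
        pass

    def dec2x(self, decNum, base):
        # Convert a non-negative integer to its digit string in the given base.
        # Digits are produced with str(), so this is only correct for base <= 10.
        if decNum == 0:
            return "0"

        result = []
        while decNum != 0:
            remainder = decNum % base
            result.append(remainder)
            decNum = decNum // base   # explicit floor division

        result.reverse()

        retStr = ""
        for i in range(len(result)):
            retStr += str(result[i])

        return retStr

    def x2dec(self, xNumStr, base):
        # Parse a digit string in the given base back into an integer
        return int(xNumStr, base)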



class BitFormatParser:
    def __init__(self):
        pass

    def Parse(self, num, fmt, bitLen):
        numUtils = NumUtils()
        binStr = numUtils.dec2x(num, 2)

        binStrLen = len(binStr)
        lenDiff = bitLen - binStrLen
        if lenDiff < 0:
            print("err: num len is longer than bitlen!")
            return

        # Left-pad with zeros up to bitLen
        for i in range(lenDiff):
            binStr = "0" + binStr

        print("number of bits parsed:%d, dec:%d, hex:%x, bin:%s" % (bitLen, num, num, binStr))

        # Do the actual work
        self.do(binStr, num, fmt)

    def do(self, binStr, num, fmt):
        # Reverse once so that string index i corresponds to bit i
        reverseBinStr = self.ReverseStr(binStr)

        for item in fmt:
            (bitRange, name, description) = item

            (resBinStr, resDecStr, resHexStr) = self.Slice(reverseBinStr, num, bitRange)
            # The trailing "\n" leaves a blank line between entries
            print("bitRange:(%s), b:%s, d:%s, h:%s, name:%s, desc:%s\n" % (bitRange, resBinStr, resDecStr, resHexStr, name, description))

    def ReverseStr(self, content):
        # Build the reversed string by prepending one character at a time
        res = ""
        for i in range(len(content)):
            res = content[i] + res

        return res


    def Slice(self, binStr, num, bitRange):
        # binStr is least-significant-bit first; bitRange looks like "0:15,48:51" or "40"
        results = []
        bits = bitRange.split(",")
        for item in bits:
            item = item.strip()
            if item.find(":") != -1:
                beginAndEnd = item.split(":")
                begin = int(beginAndEnd[0])
                end = int(beginAndEnd[1])

                # Half-open slice, a bit like an STL iterator range, hence end+1
                partBinStr = binStr[begin:(end+1)]
                results.append(self.ReverseStr(partBinStr))
            else:
                partBinStr = binStr[int(item)]
                results.append(partBinStr)

        # Later ranges are more significant, so they come first in the result
        results.reverse()
        resBinStr = ""
        for i in range(len(results)):
            resBinStr += results[i]

        intDec = NumUtils().x2dec(resBinStr,2)
        resDecStr = str(intDec)
        resHexStr = hex(intDec)

        return (resBinStr, resDecStr, resHexStr)


strategy = """
    Turn the number into a string of bits, take each field in turn, and print (value, name, description); the value is also converted to decimal and hexadecimal
"""

if __name__ == "__main__":
    # Small test format: 0:3 ver (low ver), 4:10 ver (middle ver), 11 ver (highest bit ver)

    f1 = [
    ("0:3", "ver", "low ver"),
    ("4:10", "ver", "middle ver"),
    ("11", "ver", "highest bit ver"),
    ("0:3, 4:10, 11", "ver", "the whole ver")]

    numUtilsTest = False
    if numUtilsTest:
        numUtils = NumUtils()
        str5 = numUtils.dec2x(5, 2)
        str12 = numUtils.dec2x(12, 2)
        print("str5:%s, str12:%s" % (str5, str12))

    f1UnitTest = False
    if f1UnitTest:
        parser = BitFormatParser()
        parser.Parse(5, f1, 12)

    gdtFmt = [
        ("0:15,48:51", "limit", "the whole limit"),
        ("16:31, 32:39, 56:63", "base address", "the whole base address"),
        ("0:15", "limit", "lower 16-bit limit"),
        ("16:31", "Base Address", "lower 16-bit base address"),
        ("32:39", "Base Address", "middle 8-bit base address"),
        ("40", "A", "accessed bit, set by the CPU (=1) when the segment is accessed"),
        ("41", "type", "for data/stack segment it can be written to (=1), for code segment it can be read from (=1)"),
        ("42", "type", "for data/stack segment it indicates expansion direction, it grows downward (=1); for code segment, whether it is conforming (=1)"),
        ("43", "type", "whether it is a code segment (=1) or a data/stack segment (=0)"),
        ("44", "type", "must be 1 for code/data segment"),
        ("45:46", "DPL", "descriptor privilege level, we are going to use 0-kernel privilege and 3-user privilege"),
        ("47", "P", "whether the segment is present"),
        ("48:51", "Limit", "upper 4-bit limit"),
        ("52", "U", "user defined"),
        ("53", "X", "not used"),
        ("54", "D", "whether to handle instructions and data as 32-bit (=1) or 16-bit (=0)"),
        ("55", "G", "whether the limit uses units of 4K (=1) or 1 byte (=0)"),
        ("56:63", "Base Address", "higher 8-bit base address")]

    gdtUnitTest = True
    if gdtUnitTest:
        parser = BitFormatParser()
        parser.Parse(0x00cf9a000000ffff, gdtFmt, 64)

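The code above was written with all the Python I knew at the time. It targets Python 2.5, not the 3k version. My Python code could well serve as a textbook counterexample :)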
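
For comparison, the same field extraction can be done on the integer itself with shifts and masks, with no bit string and no reversal. What follows is only a sketch of that alternative, written for Python 3; extract_bits and parse are hypothetical names and not part of the original tool. It keeps the tool's convention that ranges are inclusive and that later ranges in a spec are more significant.

```python
def extract_bits(num, bit_range):
    """Return (value, width) for a spec such as "0:15,48:51" or "40"."""
    value = 0
    shift = 0  # where the next range lands inside the result
    for part in bit_range.split(","):
        part = part.strip()
        if ":" in part:
            begin, end = (int(x) for x in part.split(":"))
        else:
            begin = end = int(part)
        width = end - begin + 1
        mask = (1 << width) - 1
        # Move the field down to bit 0, mask it, then place it in the result
        value |= ((num >> begin) & mask) << shift
        shift += width
    return value, shift


def parse(num, fmt):
    """Format one line per (bit_range, name) entry, similar to the tool's output."""
    lines = []
    for bit_range, name in fmt:
        value, width = extract_bits(num, bit_range)
        binary = format(value, "0%db" % width)  # zero-padded to the field width
        lines.append("bitRange:(%s), b:%s, d:%d, h:%s, name:%s"
                     % (bit_range, binary, value, hex(value), name))
    return lines


demo_fmt = [("0:15,48:51", "limit"), ("16:31, 32:39, 56:63", "base address"), ("45:46", "DPL")]
for line in parse(0x00cf9a000000ffff, demo_fmt):
    print(line)
```

The first printed line is "bitRange:(0:15,48:51), b:11111111111111111111, d:1048575, h:0xfffff, name:limit", which agrees with the string-based version.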

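I had no network access when I wrote this code, so I could not google anything. A few questions were left open at the time:

1. Is there a built-in way to reverse a string?
Yes, extended slicing does it: L[::-1]
http://www.python.org/doc/2.3.5/whatsnew/section-slices.html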
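
In the tool, this slice form could stand in for the hand-written ReverseStr loop. A minimal sketch, assuming Python 3 (format() also exists from Python 2.6 on); the value 0x9a is simply bits 40 to 47 of the example descriptor, and the variable names are mine:

```python
bits = format(0x9a, "08b")      # most significant bit first: '10011010'
lsb_first = bits[::-1]          # reversed copy: '01011001', index i is now bit i

# Take bits 1..3 inclusive, then flip back to MSB-first for printing
field = lsb_first[1:4][::-1]    # '101'

print(bits, lsb_first, field, int(field, 2))
# prints: 10011010 01011001 101 5
```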

2. How do you define and use static (class-level) member variables and member functions?

# -*- coding: utf-8 -*-

class Test:
    numInstances = 0   # class-level variable shared by all instances

    def __init__(self):
        Test.numInstances += 1

    @staticmethod      # decorator syntax needs Python 2.4 or later
    def PrintNumInstances():
        print("Number of instances: %d" % Test.numInstances)

if __name__ == "__main__":
    a = Test()
    b = Test()
    c = Test()

    # A static method can be called on the class or on any instance
    Test.PrintNumInstances()
    a.PrintNumInstances()
    b.PrintNumInstances()

3. How do you concatenate a list of strings in a single line of code?
l = ['a', 'b', 'c', 'd']
print(''.join(l))    # => "abcd"
print(':'.join(l))   # => "a:b:c:d"
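
The same join idiom tidies up the digit loop in the tool's dec2x. A sketch, assuming Python 3 (divmod and reversed are also available in Python 2.4 and later); the free-standing function below is my own variant for illustration, not the original method:

```python
def dec2x(num, base):
    """Convert a non-negative int to a digit string in the given base (2..10)."""
    if num == 0:
        return "0"
    digits = []
    while num:
        num, remainder = divmod(num, base)   # quotient and remainder in one step
        digits.append(str(remainder))
    # Digits were collected least significant first, so reverse before joining
    return "".join(reversed(digits))


print(dec2x(5, 2), dec2x(12, 2))
# prints: 101 1100
```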

4. How do you loop over a list more conveniently?
I have somewhat forgotten what I originally had in mind with this one.