ctypes User Guide

Source: Internet. Published by: 上海烟草网络销售网. Editor: 程序博客网. Time: 2024/06/17 01:29

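1 Introduction to ctypes

ctypes was introduced in Python 2.5.

ctypes is a foreign function library for Python. It provides C-compatible data types and allows calling functions in DLLs or shared libraries, so these libraries can be wrapped for use from Python.

2 ctypes tutorial

The code samples in this tutorial use doctest to make sure they actually work. Some samples behave slightly differently under Linux, Windows, or Mac OS X, which is indicated by doctest directives in their comments.

A few samples reference the ctypes c_int type. On platforms where sizeof(long) == sizeof(int), such as 32-bit systems, this type is an alias for c_long. So do not be confused if c_long is printed where you expected c_int; they are the same type.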

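2.1 Loading dynamic link libraries

ctypes exports the cdll object and, on Windows, the windll and oledll objects for loading dynamic link libraries.

You load a library by accessing it as an attribute of one of these objects. cdll loads libraries whose exported functions use the standard cdecl calling convention, while windll loads libraries whose functions use the stdcall calling convention. oledll also uses stdcall and assumes that the functions return a Windows HRESULT error code. The error code is used to automatically raise the Python WindowsError exception when a call fails.

Here are some examples for Windows. Note that msvcrt is the MS standard C library, which contains most standard C functions and uses the cdecl calling convention:

>>> from ctypes import *
>>> print windll.kernel32
<WinDLL 'kernel32', handle ... at ...>
>>> print cdll.msvcrt
<CDLL 'msvcrt', handle ... at ...>
>>> libc = cdll.msvcrt
>>>

Windows libraries normally use the ".dll" suffix, and Windows appends it automatically, which is why the attribute names above omit it.

On Linux you must specify the file name including the extension to load a library, so attribute access does not work. Use the LoadLibrary method of the dll loaders, or create an instance of CDLL by calling its constructor:

>>> cdll.LoadLibrary("libc.so.6")
<CDLL 'libc.so.6', handle ... at ...>
>>> libc = CDLL("libc.so.6")
>>> libc
<CDLL 'libc.so.6', handle ... at ...>
>>>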
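The hard-coded "libc.so.6" above is Linux-specific. A minimal sketch of a more portable load follows, written for Python 3 (an assumption; the sessions in this guide use Python 2 syntax) and for a POSIX system. It uses ctypes.util.find_library, which is part of the standard library; the variable names are illustrative only.

```python
import ctypes
import ctypes.util

# find_library returns a file name such as "libc.so.6", or None if the lookup
# fails. On POSIX, CDLL(None) then falls back to the symbols already loaded
# into the process, which include the C library.
libc_name = ctypes.util.find_library("c")
libc = ctypes.CDLL(libc_name)

# Declare the prototype of int abs(int) before calling it.
c_abs = libc.abs
c_abs.argtypes = [ctypes.c_int]
c_abs.restype = ctypes.c_int

result = c_abs(-7)
```

On Windows this sketch would need cdll.msvcrt instead, as shown in the session above.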

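2.2 Accessing functions from loaded dlls

Functions are accessed as attributes of dll objects:

>>> from ctypes import *
>>> libc.printf
<_FuncPtr object at 0x...>
>>> print windll.kernel32.GetModuleHandleA
<_FuncPtr object at 0x...>
>>> print windll.kernel32.MyOwnFunction
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "ctypes.py", line 239, in __getattr__
    func = _StdcallFuncPtr(name, self)
AttributeError: function 'MyOwnFunction' not found
>>>

Note that win32 system dlls such as kernel32 and user32 often export both ANSI and UNICODE versions of a function. The UNICODE version is exported with a "W" appended to the name, the ANSI version with an "A". The win32 GetModuleHandle function, which returns a module handle for a given module name, has the following C prototypes, and a macro exposes one of them as GetModuleHandle depending on whether UNICODE is defined:

/* ANSI version */
HMODULE GetModuleHandleA(LPCSTR lpModuleName);
/* UNICODE version */
HMODULE GetModuleHandleW(LPCWSTR lpModuleName);

windll does not pick one of them automatically. You must name the version you need explicitly, GetModuleHandleA or GetModuleHandleW, and call it with the matching string type (byte strings or unicode strings respectively).

Sometimes dlls export functions with names that are not valid Python identifiers, such as "??2@YAPAXI@Z". In that case you have to use getattr to retrieve the function:

>>> getattr(cdll.msvcrt, "??2@YAPAXI@Z")
<_FuncPtr object at 0x...>
>>>

On Windows, some dlls export functions not by name but by ordinal. These functions are accessed by indexing the dll object with the ordinal number:

>>> cdll.kernel32[1]
<_FuncPtr object at 0x...>
>>> cdll.kernel32[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "ctypes.py", line 310, in __getitem__
    func = _StdcallFuncPtr(name, self)
AttributeError: function ordinal 0 not found
>>>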

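2.3 Calling functions

You can call these functions like any other Python callable. This example uses the time() function, which returns the system time in seconds since the Unix epoch, and the GetModuleHandleA() function, which returns a win32 module handle.

This example calls both functions with a NULL pointer (None is used as the NULL pointer):

>>> print libc.time(None)
1150640792
>>> print hex(windll.kernel32.GetModuleHandleA(None))
0x1d000000
>>>

ctypes tries to protect you from calling functions with the wrong number of arguments or the wrong calling convention. Unfortunately this only works on Windows. It does so by examining the stack after the function returns, so although an error is raised, the function has already been called:

>>> windll.kernel32.GetModuleHandleA()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: Procedure probably called with not enough arguments (4 bytes missing)
>>> windll.kernel32.GetModuleHandleA(0, 0)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: Procedure probably called with too many arguments (4 bytes in excess)
>>>

The same exception is raised when you call an stdcall function with the cdecl calling convention, or vice versa:

>>> cdll.kernel32.GetModuleHandleA(None) # doctest: +WINDOWS
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: Procedure probably called with not enough arguments (4 bytes missing)
>>>
>>> windll.msvcrt.printf("spam") # doctest: +WINDOWS
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: Procedure probably called with too many arguments (4 bytes in excess)
>>>

To find out the correct calling convention you have to look into the C header file or the documentation of the function you want to call.

On Windows, ctypes uses win32 structured exception handling to prevent crashes from general protection faults when functions are called with invalid argument values:

>>> windll.kernel32.GetModuleHandleA(32)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
WindowsError: exception: access violation reading 0x00000020
>>>

There are, however, still plenty of ways to crash Python with ctypes, so you should be careful.

None, integers, longs, byte strings and unicode strings are the only native Python objects that can be passed directly as parameters in these function calls. None is passed as a C NULL pointer; byte strings and unicode strings are passed as a pointer to the memory block that holds their data (char * or wchar_t *). Python integers and longs are passed as the platform's default C int type.

Before we move on to calling functions with other parameter types, we have to learn more about ctypes data types.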

2.4 Fundamental data types

ctypes defines a number of primitive C-compatible data types:

ctypes type     C type                                     Python type
-----------     ------                                     -----------
c_char          char                                       1-character string
c_wchar         wchar_t                                    1-character unicode string
c_byte          char                                       int/long
c_ubyte         unsigned char                              int/long
c_short         short                                      int/long
c_ushort        unsigned short                             int/long
c_int           int                                        int/long
c_uint          unsigned int                               int/long
c_long          long                                       int/long
c_ulong         unsigned long                              int/long
c_longlong      __int64 or long long                       int/long
c_ulonglong     unsigned __int64 or unsigned long long     int/long
c_float         float                                      float
c_double        double                                     float
c_char_p        char * (NUL terminated)                    string or None
c_wchar_p       wchar_t * (NUL terminated)                 unicode or None
c_void_p        void *                                     int/long or None

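All these types can be created by calling them with an optional initializer of the correct type and value:

>>> c_int()
c_long(0)
>>> c_char_p("Hello, world")
c_char_p('Hello, world')
>>> c_ushort(-3)
c_ushort(65533)
>>>

Since these types are mutable, their value can also be changed afterwards:

>>> i = c_int(42)
>>> print i
c_long(42)
>>> print i.value
42
>>> i.value = -99
>>> print i.value
-99
>>>

Assigning a new value to instances of the pointer types c_char_p, c_wchar_p and c_void_p changes the memory location they point to, not the contents of the memory block (of course not, because Python strings are immutable):

>>> s = "Hello, world"
>>> c_s = c_char_p(s)
>>> print c_s
c_char_p('Hello, world')
>>> c_s.value = "Hi, there"
>>> print c_s
c_char_p('Hi, there')
>>> print s # the first string is unchanged
Hello, world
>>>

You should be careful, however, not to pass them to functions expecting pointers to mutable memory. If you need mutable memory blocks, ctypes has a create_string_buffer() function which creates these in various ways. The current memory block contents can be accessed (or changed) with the raw property; if you want to access it as a NUL-terminated string, use the value property:

>>> from ctypes import *
>>> p = create_string_buffer(3) # create a 3 byte buffer, initialized to NUL bytes
>>> print sizeof(p), repr(p.raw)
3 '\x00\x00\x00'
>>> p = create_string_buffer("Hello") # create a buffer containing a NUL terminated string
>>> print sizeof(p), repr(p.raw)
6 'Hello\x00'
>>> print repr(p.value)
'Hello'
>>> p = create_string_buffer("Hello", 10) # create a 10 byte buffer
>>> print sizeof(p), repr(p.raw)
10 'Hello\x00\x00\x00\x00\x00'
>>> p.value = "Hi"
>>> print sizeof(p), repr(p.raw)
10 'Hi\x00lo\x00\x00\x00\x00\x00'
>>>

The create_string_buffer() function replaces the c_buffer() function (which is still available as an alias), as well as the c_string() function from earlier ctypes releases. To create a mutable memory block containing unicode characters of the C type wchar_t, use the create_unicode_buffer() function.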
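For readers on Python 3, the same behaviour can be sketched as follows. This is an illustration under the assumption of Python 3, where char data is passed as bytes and wchar_t data as str; the variable names are made up for the example.

```python
import ctypes

count = ctypes.c_int(42)
count.value = -99                      # simple types are mutable in place

wrapped = ctypes.c_ushort(-3).value    # no overflow check: wraps modulo 2**16

# In Python 3, char data is bytes and wchar_t data is str.
buf = ctypes.create_string_buffer(b"Hello", 10)
buf.value = b"Hi"                      # writes "Hi" plus a terminating NUL
raw_after = buf.raw                    # all 10 bytes, including leftovers

wide = ctypes.create_unicode_buffer("Hello", 10)
wide_value = wide.value
```

The raw property still shows the leftover "lo" from the first string, which is why value, not raw, is the right accessor for NUL-terminated data.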

2.5 Calling functions, continued

Note that printf prints to the real standard output channel, not to sys.stdout, so these examples only work at the console prompt, not from within IDLE or PythonWin:

>>> printf = libc.printf
>>> printf("Hello, %s\n", "World!")
Hello, World!
>>> printf("Hello, %S", u"World!")
Hello, World!
>>> printf("%d bottles of beer\n", 42)
42 bottles of beer
>>> printf("%f bottles of beer\n", 42.5)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ArgumentError: argument 2: exceptions.TypeError: Don't know how to convert parameter 2
>>>

As mentioned before, all Python types except integers, strings and unicode strings have to be wrapped in their corresponding ctypes type, so that they can be converted to the required C data type:

>>> printf("An int %d, a double %f\n", 1234, c_double(3.14))
An int 1234, a double 3.140000
>>>

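2.6 Calling functions with your own custom data types

You can also customize ctypes argument conversion to allow instances of your own classes to be used as function arguments. ctypes looks for an _as_parameter_ attribute and uses it as the function argument. Of course, it must be an integer, a string, or a unicode string:

>>> class Bottles(object):
...     def __init__(self, number):
...         self._as_parameter_ = number
...
>>> bottles = Bottles(42)
>>> printf("%d bottles of beer\n", bottles)
42 bottles of beer
>>>

If you don't want to store the instance's data in the _as_parameter_ instance variable, you can define a property which makes the data available on request.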
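The property variant mentioned above might look like the following sketch. It assumes Python 3 and a POSIX system where the C library can be loaded as shown; the Celsius class and its degrees attribute are hypothetical names chosen for illustration, and abs() stands in for any C function taking an int.

```python
import ctypes
import ctypes.util

# Assumption: POSIX. If find_library returns None, CDLL(None) still exposes
# the C library symbols already loaded into the process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

class Celsius:
    """Hypothetical wrapper that keeps its data in `degrees`."""

    def __init__(self, degrees):
        self.degrees = degrees

    @property
    def _as_parameter_(self):
        # Computed on request; ctypes reads this attribute at call time.
        return int(self.degrees)

reading = Celsius(-12)
result = libc.abs(reading)   # ctypes passes reading._as_parameter_ as a C int
```

Because the property is evaluated at each call, later changes to degrees are picked up without any extra bookkeeping.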

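2.7 Specifying the required argument types (function prototypes)

You can specify the required argument types of functions exported from DLLs by setting the argtypes attribute of the function.

argtypes must be a sequence of C data types (the printf function is probably not a good example here, because it takes a variable number and different types of parameters depending on the format string; on the other hand, this makes it quite handy for experimenting with this feature):

>>> printf.argtypes = [c_char_p, c_char_p, c_int, c_double]
>>> printf("String '%s', Int %d, Double %f\n", "Hi", 10, 2.2)
String 'Hi', Int 10, Double 2.200000
>>>

Specifying argtypes protects against incompatible argument types (just as a prototype does for a C function) and tries to convert the arguments to valid types:

>>> printf("%d %d %d", 1, 2, 3)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ArgumentError: argument 2: exceptions.TypeError: wrong type
>>> printf("%s %d %f", "X", 2, 3)
X 2 3.00000012
>>>

The trailing 12 in the last output does not appear to be part of the formatted number: the format string has no newline, so printf's return value (the 12 characters written) is echoed by the interpreter right after 3.000000.

If you have defined your own classes which you pass to function calls, you have to implement a from_param class method for them to be usable in the argtypes sequence. The from_param class method receives the Python object passed to the function call. It should do a type check or whatever is needed to make sure this object is acceptable, and then return the object itself, its _as_parameter_ attribute, or whatever you want to pass as the C function argument in this case. Again, the result should be an integer, a string, a unicode string, a ctypes instance, or an object with an _as_parameter_ attribute.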
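A minimal sketch of a from_param class method follows, assuming Python 3 and a POSIX C library. The SmallInt class and its accepted range are hypothetical; abs() is used only because it takes a single int.

```python
import ctypes
import ctypes.util

# Assumption: POSIX; CDLL(None) is the fallback when find_library fails.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

class SmallInt:
    """Hypothetical argument type: accepts only ints in [-1000, 1000]."""

    @classmethod
    def from_param(cls, obj):
        if isinstance(obj, bool) or not isinstance(obj, int):
            raise TypeError("int expected")
        if not -1000 <= obj <= 1000:
            raise ValueError("out of range")
        return ctypes.c_int(obj)   # a ctypes instance is a valid result

c_abs = libc.abs
c_abs.argtypes = [SmallInt]        # any object with from_param is accepted
c_abs.restype = ctypes.c_int

ok = c_abs(-7)

try:
    c_abs("seven")
    rejected = ""
except ctypes.ArgumentError as exc:
    # ctypes wraps the TypeError raised in from_param.
    rejected = str(exc)
```

The check runs before the C function is entered, so an unacceptable argument never reaches native code.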

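2.8 Return types

By default functions are assumed to return the C int type. Other return types can be specified by setting the restype attribute of the function object.

Here is a more advanced example. It uses the strchr function, which expects a string pointer and a char, and returns a pointer to a string:

>>> strchr = libc.strchr
>>> strchr("abcdef", ord("d")) # doctest: +SKIP
8059983
>>> strchr.restype = c_char_p # c_char_p is a pointer to a string
>>> strchr("abcdef", ord("d"))
'def'
>>> print strchr("abcdef", ord("x"))
None
>>>

If you want to avoid the ord("x") calls above, you can set the argtypes attribute, and the second argument will be converted from a single-character Python string into a C char:

>>> strchr.restype = c_char_p
>>> strchr.argtypes = [c_char_p, c_char]
>>> strchr("abcdef", "d")
'def'
>>> strchr("abcdef", "def")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ArgumentError: argument 2: exceptions.TypeError: one character string expected
>>> print strchr("abcdef", "x")
None
>>> strchr("abcdef", "d")
'def'
>>>

You can also use a callable Python object (a function or a class, for example) as the restype attribute, if the foreign function returns an integer. The callable is called with the integer the C function returns, and the result of this call is used as the result of your function call. In effect it wraps the C return value. This is useful for checking error return values and automatically raising an exception:

>>> GetModuleHandle = windll.kernel32.GetModuleHandleA # doctest: +WINDOWS
>>> def ValidHandle(value):
...     if value == 0:
...         raise WinError()
...     return value
...
>>>
>>> GetModuleHandle.restype = ValidHandle # doctest: +WINDOWS
>>> GetModuleHandle(None) # doctest: +WINDOWS
486539264
>>> GetModuleHandle("something silly") # doctest: +WINDOWS
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 3, in ValidHandle
WindowsError: [Errno 126] The specified module could not be found.
>>>

WinError is a function which calls the Windows FormatMessage() API to get the string representation of an error code, and returns an exception. WinError takes an optional error code parameter; if none is given, it calls GetLastError() to retrieve it.

Note that a much more powerful error checking mechanism is available through the errcheck attribute; see the reference manual for details.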
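As a rough illustration of errcheck, the sketch below attaches a checker to strchr. It assumes Python 3 and a POSIX C library; the require_match function and the choice of LookupError are illustrative, not part of ctypes. ctypes calls the errcheck callable with three arguments: the converted result, the function object, and the original argument tuple.

```python
import ctypes
import ctypes.util

# Assumption: POSIX; CDLL(None) is the fallback when find_library fails.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

def require_match(result, func, arguments):
    """errcheck hook: ctypes calls it as (result, func, arguments)."""
    if result is None:             # strchr returned NULL
        raise LookupError("character not found")
    return result

# char *strchr(const char *s, int c);
strchr = libc.strchr
strchr.argtypes = [ctypes.c_char_p, ctypes.c_int]
strchr.restype = ctypes.c_char_p
strchr.errcheck = require_match

found = strchr(b"abcdef", ord("d"))

try:
    strchr(b"abcdef", ord("x"))
    missing = False
except LookupError:
    missing = True
```

Unlike a callable restype, errcheck works with any return type and also sees the arguments, which helps when building error messages.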

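2.9 Passing pointers (or: passing parameters by reference)

Sometimes a C API function expects a pointer to a data type as a parameter, either because it writes into that location or because the data is too large to be passed by value. This is also known as passing parameters by reference.

ctypes exports the byref() function, which is used to pass parameters by reference. The same effect can be achieved with the pointer() function, although pointer() does a lot more work since it constructs a real pointer object; so if you don't need the pointer object in Python itself, byref() is faster:

>>> i = c_int()
>>> f = c_float()
>>> s = create_string_buffer('\000' * 32)
>>> print i.value, f.value, repr(s.value)
0 0.0 ''
>>> libc.sscanf("1 3.14 Hello", "%d %f %s",
...             byref(i), byref(f), s)
3
>>> print i.value, f.value, repr(s.value)
1 3.1400001049 'Hello'
>>>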
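A second, non-variadic illustration of byref() is sketched below using strtol, whose endptr argument is an out-parameter. It assumes Python 3 and a POSIX C library; the variable names are illustrative.

```python
import ctypes
import ctypes.util

# Assumption: POSIX; CDLL(None) is the fallback when find_library fails.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# long strtol(const char *str, char **endptr, int base);
strtol = libc.strtol
strtol.argtypes = [ctypes.c_char_p, ctypes.POINTER(ctypes.c_char_p), ctypes.c_int]
strtol.restype = ctypes.c_long

text = b"42 bottles"       # keep a reference: `rest` will point into this buffer
rest = ctypes.c_char_p()   # out-parameter, initially NULL

number = strtol(text, ctypes.byref(rest), 10)
remainder = rest.value     # bytes from the first unparsed character onward
```

byref(rest) is accepted where the prototype asks for POINTER(c_char_p), so no explicit pointer object has to be built.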

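2.10 Structures and unions

Structures and unions must derive from the Structure and Union base classes defined in the ctypes module. Each subclass must define a _fields_ attribute, which must be a list of 2-tuples containing a field name and a field type.

The field type must be a ctypes type such as c_int, or any other derived ctypes type: structure, union, array, pointer.

Here is a simple example of a POINT structure, which contains two integers named x and y, and also shows how to initialize a structure in the constructor:

>>> from ctypes import *
>>> class POINT(Structure):
...     _fields_ = [("x", c_int),
...                 ("y", c_int)]
...
>>> point = POINT(10, 20)
>>> print point.x, point.y
10 20
>>> point = POINT(y=5)
>>> print point.x, point.y
0 5
>>> POINT(1, 2, 3)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: too many initializers
>>>

You can, however, build much more complicated structures. A structure can itself contain other structures by using a structure as a field type.

Here is a RECT structure which contains two POINTs named upperleft and lowerright:

>>> class RECT(Structure):
...     _fields_ = [("upperleft", POINT),
...                 ("lowerright", POINT)]
...
>>> rc = RECT(point)
>>> print rc.upperleft.x, rc.upperleft.y
0 5
>>> print rc.lowerright.x, rc.lowerright.y
0 0
>>>

Nested structures can also be initialized in the constructor in several ways:

>>> r = RECT(POINT(1, 2), POINT(3, 4))
>>> r = RECT((1, 2), (3, 4))

Field descriptors can be retrieved from the class. They are useful for debugging because they provide useful information:

>>> print POINT.x
<Field type=c_long, ofs=0, size=4>
>>> print POINT.y
<Field type=c_long, ofs=4, size=4>
>>>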

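2.11 Structure/union alignment and byte order

By default, Structure and Union fields are aligned in the same way the C compiler does it. This behavior can be overridden by specifying a _pack_ class attribute in the subclass definition. It must be set to a positive integer and specifies the maximum alignment for the fields. This is what #pragma pack(n) also does in MSVC.

ctypes uses the native byte order for Structures and Unions. To build structures with non-native byte order, you can use one of the BigEndianStructure, LittleEndianStructure, BigEndianUnion, and LittleEndianUnion base classes. These classes cannot contain pointer fields.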
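The effect of _pack_ and of the byte-order base classes can be observed with sizeof() and bytes(). The following is a sketch for Python 3; the class names are hypothetical, and the sizes assume a platform where a 32-bit integer is aligned to 4 bytes, which holds on common desktop and server platforms.

```python
import ctypes

class Natural(ctypes.Structure):
    # Default layout: padding is inserted so that `count` is 4-byte aligned.
    _fields_ = [("tag", ctypes.c_char), ("count", ctypes.c_uint32)]

class Packed(ctypes.Structure):
    _pack_ = 1  # like #pragma pack(1): no padding between fields
    _fields_ = [("tag", ctypes.c_char), ("count", ctypes.c_uint32)]

class NetworkHeader(ctypes.BigEndianStructure):
    # Fields are stored most-significant byte first on every platform.
    _fields_ = [("length", ctypes.c_uint32)]

natural_size = ctypes.sizeof(Natural)   # 1 byte + 3 padding + 4 bytes
packed_size = ctypes.sizeof(Packed)     # 1 byte + 4 bytes
wire_bytes = bytes(NetworkHeader(1))    # raw memory of the structure
```

Packed layouts are mainly useful for matching file formats or wire protocols; for ordinary C APIs the default alignment is normally the correct choice.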

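2.12 Bit fields in structures and unions

It is possible to create structures and unions containing bit fields. Bit fields are only possible for integer fields; the bit width is specified as the third item in the _fields_ tuples:

>>> class Int(Structure):
...     _fields_ = [("first_16", c_int, 16),
...                 ("second_16", c_int, 16)]
...
>>> print Int.first_16
<Field type=c_long, ofs=0:0, bits=16>
>>> print Int.second_16
<Field type=c_long, ofs=0:16, bits=16>
>>>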
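A small sketch of bit fields in use follows, written for Python 3. The Flags structure and its field names are hypothetical; the three widths add up to 32 so that all fields share a single 32-bit storage unit.

```python
import ctypes

class Flags(ctypes.Structure):
    # Three hypothetical fields sharing one 32-bit storage unit.
    _fields_ = [
        ("enabled", ctypes.c_uint32, 1),
        ("level", ctypes.c_uint32, 4),
        ("ident", ctypes.c_uint32, 27),
    ]

flags = Flags(enabled=1, level=9, ident=123456)
unpacked = (flags.enabled, flags.level, flags.ident)
storage_size = ctypes.sizeof(Flags)
```

Each value must fit into its declared width; here 9 fits into 4 bits and 123456 fits into 27 bits.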

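2.13 Arrays

Arrays are sequences containing a fixed number of instances of the same type.

The recommended way to create array types is by multiplying a data type with a positive integer:

TenPointsArrayType = POINT * 10

Here is an example of a somewhat artificial data type, a structure containing 4 POINTs among other fields:

>>> from ctypes import *
>>> class POINT(Structure):
...     _fields_ = ("x", c_int), ("y", c_int)
...
>>> class MyStruct(Structure):
...     _fields_ = [("a", c_int),
...                 ("b", c_float),
...                 ("point_array", POINT * 4)]
...
>>>
>>> print len(MyStruct().point_array)
4
>>>

Instances are created in the usual way, by calling the class, and can then be iterated over:

arr = TenPointsArrayType()
for pt in arr:
    print pt.x, pt.y

The code above prints a series of 0 0 lines, one per element, because the array contents are initialized to zeros.

Initializers of the correct type can also be specified:

>>> from ctypes import *
>>> TenIntegers = c_int * 10
>>> ii = TenIntegers(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
>>> print ii
<c_long_Array_10 object at 0x...>
>>> for i in ii: print i,
...
1 2 3 4 5 6 7 8 9 10
>>>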
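When the length is only known at run time, the array type can be built from the data. The following Python 3 sketch (variable names are illustrative) also shows indexing, slicing and the bounds check that ctypes arrays perform.

```python
import ctypes

values = [3, 1, 4, 1, 5, 9]

# Build the array type from the data length, then unpack the initializers.
IntArray = ctypes.c_int * len(values)
arr = IntArray(*values)

arr[0] = 30                    # element assignment writes into the C buffer
as_list = list(arr)            # iteration yields plain Python ints
middle = arr[1:4]              # slicing returns a Python list
total_bytes = ctypes.sizeof(arr)

try:
    arr[len(values)]           # unlike C, indexing is bounds-checked
    out_of_range = False
except IndexError:
    out_of_range = True
```

The bounds check applies to array instances only; pointers, described in the next section, are not checked.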

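2.14 Pointers

Pointer instances are created by calling the pointer() function on a ctypes type:

>>> from ctypes import *
>>> i = c_int(42)
>>> pi = pointer(i)
>>>

Pointer instances have a contents attribute which returns the object to which the pointer points, the i object above:

>>> pi.contents
c_long(42)
>>>

Note that ctypes does not have OOR (original object return); it constructs a new, equivalent object each time you retrieve an attribute:

>>> pi.contents is i
False
>>> pi.contents is pi.contents
False
>>>

Assigning another c_int instance to the pointer's contents attribute causes the pointer to point to the memory location where that instance is stored:

>>> i = c_int(99)
>>> pi.contents = i
>>> pi.contents
c_long(99)
>>>

Pointer instances can also be indexed with integers:

>>> pi[0]
99
>>>

Assigning to an integer index changes the pointed-to value:

>>> print i
c_long(99)
>>> pi[0] = 22
>>> print i
c_long(22)
>>>

It is also possible to use indexes different from 0, but you must know what you are doing, just as in C: you can access or change arbitrary memory locations. Generally you only use this feature if you receive a pointer from a C function and you know that the pointer actually points to an array instead of a single item.

Behind the scenes, the pointer() function does more than simply create pointer instances; it has to create pointer types first. This is done with the POINTER() function, which accepts any ctypes type and returns a new type:

>>> PI = POINTER(c_int)
>>> PI
<class 'ctypes.LP_c_long'>
>>> PI(42)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: expected c_long instead of int
>>> PI(c_int(42))
<ctypes.LP_c_long object at 0x...>
>>>

Calling the pointer type without an argument creates a NULL pointer. NULL pointers have a False boolean value:

>>> null_ptr = POINTER(c_int)()
>>> print bool(null_ptr)
False
>>>

ctypes checks for NULL when dereferencing pointers and raises a ValueError (dereferencing invalid non-NULL pointers, however, would still crash Python):

>>> null_ptr[0]
Traceback (most recent call last):
    ....
ValueError: NULL pointer access
>>>
>>> null_ptr[0] = 1234
Traceback (most recent call last):
    ....
ValueError: NULL pointer access
>>>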
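The pointer behaviour described in this section can be condensed into a short Python 3 sketch. The safe_read helper is hypothetical and only illustrates testing a pointer for NULL before dereferencing it.

```python
import ctypes

counter = ctypes.c_int(42)
ptr = ctypes.pointer(counter)       # POINTER(c_int) instance aimed at `counter`

ptr[0] = 43                         # writes through the pointer
seen_by_owner = counter.value       # the original object sees the change

same_object = ptr.contents is counter   # a new wrapper on each access
same_memory = ctypes.addressof(ptr.contents) == ctypes.addressof(counter)

null_ptr = ctypes.POINTER(ctypes.c_int)()   # no argument: a NULL pointer

def safe_read(p, default=None):
    """Hypothetical helper: dereference only when the pointer is non-NULL."""
    return p[0] if p else default

read_null = safe_read(null_ptr, default=-1)
read_live = safe_read(ptr)
```

Comparing addresses with addressof() is the reliable way to tell whether two ctypes objects refer to the same memory, since identity comparison with `is` does not work here.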

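2.15 Type conversions

Usually, ctypes does strict type checking. This means that if you have POINTER(c_int) in the argtypes list of a function or as the type of a member field in a structure definition, only instances of exactly the same type are accepted. There are some exceptions to this rule, where ctypes accepts other objects. For example, you can pass compatible array instances instead of pointer types. So, for POINTER(c_int), ctypes accepts an array of c_int:

>>> class Bar(Structure):
...     _fields_ = [("count", c_int), ("values", POINTER(c_int))]
...
>>> bar = Bar()
>>> bar.values = (c_int * 3)(1, 2, 3)
>>> bar.count = 3
>>> for i in range(bar.count):
...     print bar.values[i]
...
1
2
3
>>>

To set a POINTER type field such as values to NULL, you can assign None:

>>> bar.values = None
>>>

XXX list other conversions...

Sometimes you have instances of incompatible types. In C, you can cast one type into another. ctypes provides a cast() function which can be used in the same way. The Bar structure defined above accepts POINTER(c_int) pointers or c_int arrays for its values field, but not instances of other types:

>>> bar.values = (c_byte * 4)()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: incompatible types, c_byte_Array_4 instance instead of LP_c_long instance
>>>

For these cases, the cast() function is handy.

The cast() function can be used to cast a ctypes instance into a pointer to a different ctypes data type. It takes two parameters: a ctypes object that is or can be converted to a pointer of some kind, and a ctypes pointer type. It returns an instance of the second argument, which references the same memory block as the first argument:

>>> a = (c_byte * 4)()
>>> cast(a, POINTER(c_int))
<ctypes.LP_c_long object at ...>
>>>

So, cast() can be used to assign to the values field of the Bar structure:

>>> bar = Bar()
>>> bar.values = cast((c_byte * 4)(), POINTER(c_int))
>>> print bar.values[0]
0
>>>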
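That cast() shares memory rather than copying it can be checked directly. The sketch below is for Python 3, with illustrative variable names; it writes zero through the cast pointer so that the result does not depend on the platform's byte order.

```python
import ctypes

raw = (ctypes.c_ubyte * 4)(1, 2, 3, 4)

# Reinterpret the four bytes as one 32-bit integer; no data is copied.
as_word = ctypes.cast(raw, ctypes.POINTER(ctypes.c_uint32))

same_address = ctypes.addressof(as_word.contents) == ctypes.addressof(raw)

as_word[0] = 0            # a write through the cast pointer...
after_write = list(raw)   # ...is visible through the original array
```

Because the memory is shared, the original object must stay alive for as long as the cast pointer is used.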

2.16 Incomplete types

Incomplete types are structures, unions or arrays whose members are not yet specified. In C, they are specified by forward declarations, which are defined later:

struct cell; /* forward declaration */

struct cell {
    char *name;
    struct cell *next;
};

The straightforward translation into ctypes code would be this, but it does not work:

>>> class cell(Structure):
...     _fields_ = [("name", c_char_p),
...                 ("next", POINTER(cell))]
...
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 2, in cell
NameError: name 'cell' is not defined
>>>

This fails because the new class cell is not available in the class statement itself. In ctypes, we can define the cell class first and set the _fields_ attribute later, after the class statement:

>>> from ctypes import *
>>> class cell(Structure):
...     pass
...
>>> cell._fields_ = [("name", c_char_p),
...                  ("next", POINTER(cell))]
>>>

Let's try it. We create two instances of cell, let them point to each other, and finally follow the pointer chain a few times:

>>> c1 = cell()
>>> c1.name = "foo"
>>> c2 = cell()
>>> c2.name = "bar"
>>> c1.next = pointer(c2)
>>> c2.next = pointer(c1)
>>> p = c1
>>> for i in range(8):
...     print p.name,
...     p = p.next[0]
...
foo bar foo bar foo bar foo bar
>>>
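The same two-step definition also supports an ordinary NULL-terminated list. The following Python 3 sketch uses a hypothetical Node structure and walks the chain until it reaches a NULL next pointer.

```python
import ctypes

class Node(ctypes.Structure):
    pass  # declared first so POINTER(Node) can be named below

Node._fields_ = [
    ("name", ctypes.c_char_p),
    ("next", ctypes.POINTER(Node)),
]

# Build the list back to front; `next` defaults to NULL on the last node.
tail = Node(b"baz")
middle = Node(b"bar", ctypes.pointer(tail))
head = Node(b"foo", ctypes.pointer(middle))

names = []
node = head
while True:
    names.append(node.name)
    if not node.next:          # NULL pointers are falsy
        break
    node = node.next.contents
```

Testing the pointer's truth value before following it avoids the ValueError that ctypes raises on NULL pointer access.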

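2.17 Callback functions

ctypes allows creating C-callable function pointers from Python callables. These are sometimes called callback functions.

First, you must create a class for the callback function. The class knows the calling convention, the return type, and the number and types of arguments this function will receive.

The CFUNCTYPE factory function creates types for callback functions using the normal cdecl calling convention. On Windows, the WINFUNCTYPE factory function creates types for callback functions using the stdcall calling convention.

Both of these factory functions are called with the result type as the first argument, and the callback function's expected argument types as the remaining arguments.

The example presented here uses the standard C library's qsort function, which sorts items with the help of a callback function. qsort will be used to sort an array of integers:

>>> IntArray5 = c_int * 5
>>> ia = IntArray5(5, 1, 7, 33, 99)
>>> qsort = libc.qsort
>>> qsort.restype = None
>>>

qsort must be called with a pointer to the data to sort, the number of items in the data array, the size of one item, and a pointer to the comparison function, the callback. The callback is then called with two pointers to items, and it must return a negative integer if the first item is smaller than the second, zero if they are equal, and a positive integer otherwise.

So our callback function receives pointers to integers and must return an integer. First we create the type for the callback function:

>>> CMPFUNC = CFUNCTYPE(c_int, POINTER(c_int), POINTER(c_int))
>>>

For the first implementation of the callback function, we simply print the arguments we get and return 0 (incremental development ;-):

>>> def py_cmp_func(a, b):
...     print "py_cmp_func", a, b
...     return 0
...
>>>

Create the C-callable callback:

>>> cmp_func = CMPFUNC(py_cmp_func)
>>>

And we're ready to go:

>>> qsort(ia, len(ia), sizeof(c_int), cmp_func) # doctest: +WINDOWS
py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...>
>>>

We know how to access the contents of a pointer, so let's redefine our callback:

>>> def py_cmp_func(a, b):
...     print "py_cmp_func", a[0], b[0]
...     return 0
...
>>> cmp_func = CMPFUNC(py_cmp_func)
>>>

Here is what we get on Windows:

>>> qsort(ia, len(ia), sizeof(c_int), cmp_func) # doctest: +WINDOWS
py_cmp_func 7 1
py_cmp_func 33 1
py_cmp_func 99 1
py_cmp_func 5 1
py_cmp_func 7 5
py_cmp_func 33 5
py_cmp_func 99 5
py_cmp_func 7 99
py_cmp_func 33 99
py_cmp_func 7 33
>>>

Interestingly, on Linux the sort function seems to work more efficiently; it performs fewer comparisons:

>>> qsort(ia, len(ia), sizeof(c_int), cmp_func) # doctest: +LINUX
py_cmp_func 5 1
py_cmp_func 33 99
py_cmp_func 7 33
py_cmp_func 5 7
py_cmp_func 1 7
>>>

Ah, we're nearly done! The last step is to actually compare the two items and return a useful result:

>>> def py_cmp_func(a, b):
...     print "py_cmp_func", a[0], b[0]
...     return a[0] - b[0]
...
>>>

Final run on Windows:

>>> qsort(ia, len(ia), sizeof(c_int), CMPFUNC(py_cmp_func)) # doctest: +WINDOWS
py_cmp_func 33 7
py_cmp_func 99 33
py_cmp_func 5 99
py_cmp_func 1 99
py_cmp_func 33 7
py_cmp_func 1 33
py_cmp_func 5 33
py_cmp_func 5 7
py_cmp_func 1 7
py_cmp_func 5 1
>>>

And on Linux:

>>> qsort(ia, len(ia), sizeof(c_int), CMPFUNC(py_cmp_func)) # doctest: +LINUX
py_cmp_func 5 1
py_cmp_func 33 99
py_cmp_func 7 33
py_cmp_func 1 7
py_cmp_func 5 7
>>>

It is quite interesting to see that the Windows qsort function needs more comparisons than the Linux version!

As we can easily check, our array is sorted now:

>>> for i in ia: print i,
...
1 5 7 33 99
>>>

Important note for callback functions:

Make sure you keep references to CFUNCTYPE objects for as long as they are used from C code. ctypes doesn't do this for you; if you don't, they may be garbage collected, crashing your program when the callback is made.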
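Putting the pieces together for Python 3, a sketch of the qsort example with a declared prototype and a named callback reference might look like this. It assumes a POSIX C library; compare_ints and the other names are illustrative.

```python
import ctypes
import ctypes.util

# Assumption: POSIX; CDLL(None) is the fallback when find_library fails.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# int (*compar)(const void *, const void *), specialised here to int elements.
CMPFUNC = ctypes.CFUNCTYPE(ctypes.c_int,
                           ctypes.POINTER(ctypes.c_int),
                           ctypes.POINTER(ctypes.c_int))

def compare_ints(a, b):
    # Comparison instead of subtraction avoids overflow for extreme values.
    return (a[0] > b[0]) - (a[0] < b[0])

# Bound to a name so the callback object outlives the qsort call.
cmp_callback = CMPFUNC(compare_ints)

qsort = libc.qsort
qsort.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_size_t, CMPFUNC]
qsort.restype = None

data = (ctypes.c_int * 5)(5, 1, 7, 33, 99)
qsort(data, len(data), ctypes.sizeof(ctypes.c_int), cmp_callback)
sorted_values = list(data)
```

qsort only uses the callback during the call, so a temporary would also work here; binding it to a name matters for APIs that store the function pointer and call it later.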

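2.18 Accessing values exported from dlls

Sometimes a dll exports not only functions but also variables. An example in the Python library itself is Py_OptimizeFlag, an integer set to 0, 1, or 2, depending on the -O or -OO flag given on startup.

ctypes can access values like this with the in_dll class method of the type. pythonapi is a predefined symbol giving access to the Python C API:

>>> opt_flag = c_int.in_dll(pythonapi, "Py_OptimizeFlag")
>>> print opt_flag
c_long(0)
>>>

If the interpreter had been started with -O, the sample would have printed c_long(1), or c_long(2) if -OO had been specified.

An extended example, which also demonstrates the use of pointers, accesses the PyImport_FrozenModules pointer exported by Python.

Quoting the Python docs: this pointer is initialized to point to an array of "struct _frozen" records, terminated by one whose members are all NULL or zero. When a frozen module is imported, it is searched for in this table. Third-party code could play tricks with this to provide a dynamically created collection of frozen modules.

So manipulating this pointer could even prove useful. To restrict the example size, we show only how this table can be read with ctypes:

>>> from ctypes import *
>>>
>>> class struct_frozen(Structure):
...     _fields_ = [("name", c_char_p),
...                 ("code", POINTER(c_ubyte)),
...                 ("size", c_int)]
...
>>>

We have defined the struct _frozen data type, so we can get the pointer to the table:

>>> FrozenTable = POINTER(struct_frozen)
>>> table = FrozenTable.in_dll(pythonapi, "PyImport_FrozenModules")
>>>

Since table is a pointer to an array of struct_frozen records, we can iterate over it, but we have to make sure that our loop terminates, because pointers have no size. Sooner or later it would probably crash with an access violation, so it is better to break out of the loop when we hit the NULL entry:

>>> for item in table:
...     print item.name, item.size
...     if item.name is None:
...         break
...
__hello__ 104
__phello__ -104
__phello__.spam 104
None 0
>>>

The fact that standard Python has a frozen module and a frozen package (indicated by the negative size member) is not well known; they are only used for testing. Try it out with import __hello__, for example.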

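2.19 Surprises

There are some edges in ctypes where you may find something other than what you expect.

Consider the following example:

>>> from ctypes import *
>>> class POINT(Structure):
...     _fields_ = ("x", c_int), ("y", c_int)
...
>>> class RECT(Structure):
...     _fields_ = ("a", POINT), ("b", POINT)
...
>>> p1 = POINT(1, 2)
>>> p2 = POINT(3, 4)
>>> rc = RECT(p1, p2)
>>> print rc.a.x, rc.a.y, rc.b.x, rc.b.y
1 2 3 4
>>> # now swap the two points
>>> rc.a, rc.b = rc.b, rc.a
>>> print rc.a.x, rc.a.y, rc.b.x, rc.b.y
3 4 3 4
>>>

Hm. We certainly expected the last statement to print 3 4 1 2. What happened? Here are the steps of the rc.a, rc.b = rc.b, rc.a line above:

>>> temp0, temp1 = rc.b, rc.a
>>> rc.a = temp0
>>> rc.b = temp1
>>>

Note that temp0 and temp1 are objects still using the internal buffer of the rc object above. So executing rc.a = temp0 copies the buffer contents of temp0 into rc's buffer. This, in turn, changes the contents of temp1. So the last assignment, rc.b = temp1, does not have the expected effect.

Keep in mind that retrieving sub-objects from structures, unions, and arrays does not copy the sub-object; instead it retrieves a wrapper object accessing the root object's underlying buffer.

Another example that may behave differently from what one would expect is this:

>>> s = c_char_p()
>>> s.value = "abc def ghi"
>>> s.value
'abc def ghi'
>>> s.value is s.value
False
>>>

Why is it printing False? ctypes instances are objects containing a memory block plus some descriptors accessing the contents of the memory. Storing a Python object in the memory block does not store the object itself; instead the contents of the object are stored. Accessing the contents again constructs a new Python object each time!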
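One way to make the swap behave as expected is to take an independent copy of one sub-object before overwriting it. The sketch below, for Python 3, uses the from_buffer_copy class method for the snapshot; this is one possible approach, not the only one.

```python
import ctypes

class POINT(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int), ("y", ctypes.c_int)]

class RECT(ctypes.Structure):
    _fields_ = [("a", POINT), ("b", POINT)]

rc = RECT(POINT(1, 2), POINT(3, 4))

# Snapshot `a` into independent memory before it is overwritten.
saved_a = POINT.from_buffer_copy(rc.a)
rc.a = rc.b          # copies b's bytes over a; `saved_a` is unaffected
rc.b = saved_a

swapped = (rc.a.x, rc.a.y, rc.b.x, rc.b.y)
```

Constructing POINT(rc.a.x, rc.a.y) by hand would work as well; from_buffer_copy simply avoids listing every field.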

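2.20 Variable-sized data types

ctypes provides some support for variable-sized arrays and structures (added in version 0.9.9.7).

The resize() function can be used to resize the memory buffer of an existing ctypes object. It takes the object as its first argument and the requested size in bytes as its second argument. The memory block cannot be made smaller than the natural memory block specified by the object's type; a ValueError is raised if this is tried:

>>> short_array = (c_short * 4)()
>>> print sizeof(short_array)
8
>>> resize(short_array, 4)
Traceback (most recent call last):
    ...
ValueError: minimum size is 8
>>> resize(short_array, 32)
>>> sizeof(short_array)
32
>>> sizeof(type(short_array))
8
>>>

This is nice and fine, but how would one access the additional elements contained in this array? Since the type still only knows about 4 elements, we get errors accessing other elements:

>>> short_array[:]
[0, 0, 0, 0]
>>> short_array[7]
Traceback (most recent call last):
    ...
IndexError: invalid index
>>>

Another way to use variable-sized data types with ctypes is to use the dynamic nature of Python and (re-)define the data type after the required size is already known, on a case by case basis.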
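One possible reading of "(re-)define the data type after the size is known" is sketched below for Python 3: after resize(), a wider array type is created and overlaid on the same memory with from_address. The names WideType and widened are illustrative, and the overlay is only valid while the original object is alive.

```python
import ctypes

short_array = (ctypes.c_short * 4)(1, 2, 3, 4)
ctypes.resize(short_array, 32)   # room for 16 shorts, the type still says 4

# Define the wider type now that the size is known, and overlay it on the
# same memory. `short_array` must stay alive while `widened` is in use.
element_size = ctypes.sizeof(ctypes.c_short)
WideType = ctypes.c_short * (ctypes.sizeof(short_array) // element_size)
widened = WideType.from_address(ctypes.addressof(short_array))

widened[7] = 42
first_four = list(short_array)
element_count = len(widened)
```

from_address does not keep the underlying object alive, so in real code the two objects are best stored together.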

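2.21 Bugs, ToDo and non-implemented things

Enumeration types are not implemented. You can do it easily yourself, using c_int as the base class.

long double is not implemented in the ctypes version described here (later releases provide c_longdouble).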
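A minimal sketch of an enum-like type built on c_int follows, for Python 3. The Color and Pixel classes and the name property are hypothetical. Note that a field whose type is a subclass of a fundamental type returns an instance of that subclass rather than a plain Python int.

```python
import ctypes

class Color(ctypes.c_int):
    """Hypothetical enum-like type: a c_int with named values."""
    RED, GREEN, BLUE = 0, 1, 2
    _names = {0: "RED", 1: "GREEN", 2: "BLUE"}

    @property
    def name(self):
        return self._names.get(self.value, "UNKNOWN")

class Pixel(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int), ("y", ctypes.c_int), ("color", Color)]

pixel = Pixel(3, 4, Color(Color.GREEN))
field = pixel.color      # a Color instance, not an int
```

Because Color has the same size and layout as c_int, it can be used wherever the C side expects an int-sized enum.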

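3 Other topics of interest

3.1 Comparing the performance of the various ways to wrap C extensions for Python

There are many ways to write C extensions for Python. Besides ctypes, there are the raw Python C API, SWIG, SIP, Cython, cffi and others. Studying the performance differences between these wrapping approaches helps when optimizing a program.

3.2 Doing something interesting with PyImport_FrozenModules

Compiling .py files into .pyc files protects your source code to some degree, but it is not really secure, because .pyc files can be decompiled back into Python source. Compiling source modules into frozen modules can protect the source more effectively. Importing frozen modules, however, has to be managed through PyImport_FrozenModules, so ctypes together with PyImport_FrozenModules can be used to manage a dynamically created collection of frozen modules.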

4 References

1. http://starship.python.net/crew/theller/ctypes/tutorial.html#bugs-todo-and-non-implemented-things

The article this document translates.

2. http://gashero.iteye.com/blog/519837

The article this document mainly draws on.

3. http://www.isnowfy.com/introduction-to-python-c-extension/

A brief list of the various ways to write C extensions for Python.

4. http://blog.csdn.net/linda1000/article/details/12623527

5. http://blog.waterlin.org/articles/using-python-ctypes-to-link-cpp-library.html

6. https://docs.python.org/2/c-api/import.html#PyImport_FrozenModules

Documentation on frozen modules.