Notes on studying the Python source code


Start by working carefully through the book 《Python源码剖析》.

Python objects: PyObject

In object.h:

typedef Py_ssize_t(*lenfunc)(PyObject*);

$$$$$$$$$$$$$$$$$$$$$$$$$$$

#define PyObject_HEAD                   \
    _PyObject_HEAD_EXTRA                \
    Py_ssize_t ob_refcnt;               \
    struct _typeobject *ob_type;

$$$$$$$$$$$$$$$$$$$$$$$$$$$


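_PyObject_HEAD_EXTRA is empty in a normal build. When Py_TRACE_REFS is defined, it adds the _ob_next and _ob_prev pointers that link every live object into a doubly linked list.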


Python's C API

With PyObject and PyTypeObject, two basic structures written in C, Python implements the kind of object polymorphism that C++ provides.
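A minimal sketch of how a common header plus a type object can give polymorphism in plain C. Every name here (DemoObject, DemoType, demo_length and so on) is hypothetical and greatly simplified; the real PyTypeObject has many more slots.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for PyObject / PyTypeObject. */
struct demo_type;

typedef struct demo_object {
    long refcnt;                    /* plays the role of ob_refcnt */
    const struct demo_type *type;   /* plays the role of ob_type */
} DemoObject;

typedef long (*demo_lenfunc)(DemoObject *);

typedef struct demo_type {
    const char *name;
    demo_lenfunc length;            /* a single slot, in the spirit of lenfunc */
} DemoType;

/* Two concrete layouts that both begin with the common header. */
typedef struct { DemoObject head; const char *text; } DemoStr;
typedef struct { DemoObject head; long count; } DemoList;

static long str_length(DemoObject *op)
{
    return (long)strlen(((DemoStr *)op)->text);
}

static long list_length(DemoObject *op)
{
    return ((DemoList *)op)->count;
}

static const DemoType DemoStr_Type  = { "str",  str_length };
static const DemoType DemoList_Type = { "list", list_length };

/* Generic entry point: the caller only sees DemoObject*, and the
   behaviour is chosen at run time through the type object. */
static long demo_length(DemoObject *op)
{
    assert(op != NULL && op->type != NULL);
    return op->type->length(op);
}
```

A caller can pass either a DemoStr or a DemoList (cast to DemoObject *) to demo_length and get the right behaviour, much as a C++ virtual call would.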

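When the reference count drops to 0, the object's destructor is called. That does not necessarily mean free is called to release the memory: Python makes heavy use of object pools, so a destructor usually returns the object's space to a pool instead.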

#define Py_DECREF(op)                                   \
    do {                                                \
        if (_Py_DEC_REFTOTAL  _Py_REF_DEBUG_COMMA       \
        --((PyObject*)(op))->ob_refcnt != 0)            \
            _Py_CHECK_REFCNT(op)                        \
        else                                            \
        _Py_Dealloc((PyObject *)(op));                  \
    } while (0)

void
_Py_Dealloc(PyObject *op)
{
    destructor dealloc = Py_TYPE(op)->tp_dealloc;
    _Py_ForgetReference(op);
    (*dealloc)(op);
}

void
_Py_ForgetReference(register PyObject *op)
{
#ifdef SLOW_UNREF_CHECK
    register PyObject *p;
#endif
    if (op->ob_refcnt < 0)
        Py_FatalError("UNREF negative refcnt");
    if (op == &refchain ||
        op->_ob_prev->_ob_next != op || op->_ob_next->_ob_prev != op)
        Py_FatalError("UNREF invalid object");
#ifdef SLOW_UNREF_CHECK
    for (p = refchain._ob_next; p != &refchain; p = p->_ob_next) {
        if (p == op)
            break;
    }
    if (p == &refchain) /* Not found */
        Py_FatalError("UNREF unknown object");
#endif
    op->_ob_next->_ob_prev = op->_ob_prev;
    op->_ob_prev->_ob_next = op->_ob_next;
    op->_ob_next = op->_ob_prev = NULL;
    _Py_INC_TPFREES(op);
}
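The idea that a destructor may hand memory back to a pool instead of calling free can be sketched as follows. This is a toy model, not CPython code: RcObj, rc_new, rc_recycle and the RC_* macros are made-up names, and memory is deliberately never returned to the system.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct rc_obj RcObj;
typedef void (*rc_destructor)(RcObj *);

struct rc_obj {
    long refcnt;            /* plays the role of ob_refcnt */
    rc_destructor dealloc;  /* plays the role of tp_dealloc */
    RcObj *next_free;       /* link used only while the object is pooled */
};

static RcObj *rc_free_list = NULL;  /* pool of recycled objects */
static int rc_dealloc_calls = 0;    /* counter kept for illustration only */

/* "Destructor": puts the object on the free list, never calls free(). */
static void rc_recycle(RcObj *op)
{
    rc_dealloc_calls++;
    op->next_free = rc_free_list;
    rc_free_list = op;
}

/* Allocation prefers a pooled object and falls back to malloc(). */
static RcObj *rc_new(void)
{
    RcObj *op;
    if (rc_free_list != NULL) {
        op = rc_free_list;
        rc_free_list = op->next_free;
    } else {
        op = malloc(sizeof *op);
        if (op == NULL)
            return NULL;
    }
    op->refcnt = 1;
    op->dealloc = rc_recycle;
    op->next_free = NULL;
    return op;
}

#define RC_INCREF(op) ((op)->refcnt++)
#define RC_DECREF(op)                               \
    do {                                            \
        RcObj *rc_tmp_ = (op);                      \
        if (--rc_tmp_->refcnt == 0)                 \
            rc_tmp_->dealloc(rc_tmp_);              \
    } while (0)
```

After the last RC_DECREF the object sits on rc_free_list, so the next rc_new returns the same address without touching malloc.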


Classification of Python objects

Integers in Python

Integer object pool: almost every built-in object has its own object pool mechanism.

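Macro: PyInt_AS_LONG; function: PyInt_AsLong

The macro version saves a function call but sacrifices type safety, because its argument op does not have to be a PyIntObject. Pay attention to the types when writing Python extensions in C.

The function version performs a type check, at the cost of execution speed.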
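The trade-off between the two versions can be illustrated with a small model. The types and names below (Header, IntBox, INTBOX_AS_LONG, intbox_as_long) are invented for this sketch. The real PyInt_AsLong reports failure through Python's exception state; the ok flag here is only a stand-in for that.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { KIND_INT, KIND_OTHER } Kind;

typedef struct { Kind kind; } Header;                  /* common header */
typedef struct { Header head; long ival; } IntBox;     /* "integer" object */
typedef struct { Header head; double dval; } OtherBox; /* some other object */

/* Unchecked access, in the spirit of PyInt_AS_LONG: no call overhead,
   but the caller must guarantee that op really is an IntBox. */
#define INTBOX_AS_LONG(op) (((IntBox *)(op))->ival)

/* Checked access, in the spirit of PyInt_AsLong: costs a call and a
   comparison, but rejects objects of the wrong kind. */
static long intbox_as_long(Header *op, int *ok)
{
    if (op == NULL || op->kind != KIND_INT) {
        *ok = 0;
        return -1;
    }
    *ok = 1;
    return INTBOX_AS_LONG(op);
}
```

Applying INTBOX_AS_LONG to an OtherBox would compile without complaint and read unrelated memory as a long, which is exactly the risk the note above warns about.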


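An object can be created in Python either through the C API that Python exposes or through a type object. Even when an instance is created through the tp_new and tp_init operations of a built-in type object, the C API that Python provides for that particular built-in object is still called in the end.

The small-integer pool defined in intobject.c: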

#ifndef NSMALLPOSINTS
#define NSMALLPOSINTS           257
#endif
#ifndef NSMALLNEGINTS
#define NSMALLNEGINTS           5
#endif
#if NSMALLNEGINTS + NSMALLPOSINTS > 0
/* References to small integers are saved in this array so that they
   can be shared.
   The integers that are saved are those in the range
   -NSMALLNEGINTS (inclusive) to NSMALLPOSINTS (not inclusive).
*/
static PyIntObject *small_ints[NSMALLNEGINTS + NSMALLPOSINTS];
#endif
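A rough sketch of how such a pool can be consulted when an integer is created. It assumes the same range as above (-5 inclusive to 257 exclusive) but stores the objects directly in the array rather than pointers to them, and SmallInt, small_pool and int_from_long are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

#define N_SMALL_POS 257
#define N_SMALL_NEG 5

typedef struct { long refcnt; long ival; } SmallInt;

static SmallInt small_pool[N_SMALL_NEG + N_SMALL_POS];
static int small_pool_ready = 0;

/* Pre-build every value in [-N_SMALL_NEG, N_SMALL_POS). */
static void small_pool_init(void)
{
    for (int i = 0; i < N_SMALL_NEG + N_SMALL_POS; i++) {
        small_pool[i].refcnt = 1;          /* the pool's own reference */
        small_pool[i].ival = i - N_SMALL_NEG;
    }
    small_pool_ready = 1;
}

/* Return a shared object for small values, a fresh one otherwise. */
static SmallInt *int_from_long(long v)
{
    SmallInt *op;
    if (!small_pool_ready)
        small_pool_init();
    if (-N_SMALL_NEG <= v && v < N_SMALL_POS) {
        op = &small_pool[v + N_SMALL_NEG];
        assert(op->ival == v);
        op->refcnt++;                      /* shared, so only bump the count */
        return op;
    }
    op = malloc(sizeof *op);               /* outside the pool: new object */
    if (op == NULL)
        return NULL;
    op->refcnt = 1;
    op->ival = v;
    return op;
}
```

Two requests for 100 return the same address, while two requests for 257 return two distinct objects.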

Large integer objects are stored in PyIntBlock block structures, which are chained into a singly linked list.

/* Integers are quite normal objects, to make object handling uniform.
   (Using odd pointers to represent integers would save much space
   but require extra checks for this special case throughout the code.)
   Since a typical Python program spends much of its time allocating
   and deallocating integers, these operations should be very fast.
   Therefore we use a dedicated allocation scheme with a much lower
   overhead (in space and time) than straight malloc(): a simple
   dedicated free list, filled when necessary with memory from malloc().

   block_list is a singly-linked list of all PyIntBlocks ever allocated,
   linked via their next members.  PyIntBlocks are never returned to the
   system before shutdown (PyInt_Fini).

   free_list is a singly-linked list of available PyIntObjects, linked
   via abuse of their ob_type members.
*/

#define BLOCK_SIZE      1000    /* 1K less typical malloc overhead */
#define BHEAD_SIZE      8       /* Enough for a 64-bit pointer */
#define N_INTOBJECTS    ((BLOCK_SIZE - BHEAD_SIZE) / sizeof(PyIntObject))

struct _intblock {
    struct _intblock *next;
    PyIntObject objects[N_INTOBJECTS];
};

typedef struct _intblock PyIntBlock;

static PyIntBlock *block_list = NULL;
static PyIntObject *free_list = NULL;
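The interplay of block_list and free_list can be modelled in a few lines. This is a simplified sketch under stated assumptions: the names (BlkInt, Blk, blk_fill, blk_new, blk_release) are invented, the block holds only 4 objects so that refilling is easy to observe, and blocks are never freed.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct blk_int {
    long refcnt;
    void *type;     /* doubles as the "next free" link while unused */
    long ival;
} BlkInt;

#define BLK_N 4     /* objects per block; tiny on purpose */

typedef struct blk {
    struct blk *next;
    BlkInt objects[BLK_N];
} Blk;

static Blk *blk_list = NULL;      /* every block ever allocated */
static BlkInt *blk_free = NULL;   /* objects available for reuse */
static int blk_count = 0;         /* number of blocks, for illustration */

/* Allocate one block and thread its objects into a free list:
   objects[N-1] -> ... -> objects[0] -> NULL. */
static BlkInt *blk_fill(void)
{
    Blk *b = malloc(sizeof *b);
    BlkInt *p, *q;
    if (b == NULL)
        return NULL;
    b->next = blk_list;
    blk_list = b;
    blk_count++;
    p = &b->objects[0];
    q = p + BLK_N;
    while (--q > p)
        q->type = (void *)(q - 1);
    q->type = NULL;
    return p + BLK_N - 1;
}

/* Take an object from the free list, refilling it when empty. */
static BlkInt *blk_new(long v)
{
    BlkInt *op;
    if (blk_free == NULL && (blk_free = blk_fill()) == NULL)
        return NULL;
    op = blk_free;
    blk_free = (BlkInt *)op->type;
    op->type = NULL;   /* a real object would store its type pointer here */
    op->refcnt = 1;
    op->ival = v;
    return op;
}

/* Give an object back: push it onto the free list. */
static void blk_release(BlkInt *op)
{
    op->type = blk_free;
    blk_free = op;
}
```

A released object is the first one handed out again, and a new block is only requested from malloc once the free list is empty.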


String objects in Python

typedef struct {
    PyObject_VAR_HEAD
    long ob_shash;
    int ob_sstate;
    char ob_sval[1];

    /* Invariants:
     *     ob_sval contains space for 'ob_size+1' elements.
     *     ob_sval[ob_size] == 0.
     *     ob_shash is the hash of the string or -1 if not computed yet.
     *     ob_sstate != 0 iff the string object is in stringobject.c's
     *       'interned' dictionary; in this case the two references
     *       from 'interned' to this object are *not counted* in ob_refcnt.
     */
} PyStringObject;

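The intern mechanism for string objects

The key to the intern mechanism is a dictionary named interned, which records the string objects that have already been interned.

Reference counting is handled specially here: the references that interned holds to an object are not counted in that object's ob_refcnt.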
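A toy model of interning, to make the two points above concrete. It is only a sketch: a small linear table stands in for the interned dictionary, and IStr, istr_new and istr_intern_in_place are hypothetical names. As in the note above, the table's own reference is not added to refcnt.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    long refcnt;
    int interned;   /* plays the role of ob_sstate */
    char *text;
} IStr;

#define INTERN_MAX 64
static IStr *intern_table[INTERN_MAX];  /* stand-in for the interned dict */
static int intern_count = 0;

static IStr *istr_new(const char *text)
{
    size_t n = strlen(text) + 1;
    IStr *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->text = malloc(n);
    if (s->text == NULL) {
        free(s);
        return NULL;
    }
    memcpy(s->text, text, n);
    s->refcnt = 1;
    s->interned = 0;
    return s;
}

static void istr_free(IStr *s)
{
    free(s->text);
    free(s);
}

/* Replace *p with the canonical object that has the same text.
   If none exists yet, *p itself becomes the canonical object. */
static void istr_intern_in_place(IStr **p)
{
    IStr *s = *p;
    if (s->interned)
        return;
    for (int i = 0; i < intern_count; i++) {
        if (strcmp(intern_table[i]->text, s->text) == 0) {
            intern_table[i]->refcnt++;   /* caller now owns the canonical one */
            if (--s->refcnt == 0)
                istr_free(s);            /* the duplicate is dropped */
            *p = intern_table[i];
            return;
        }
    }
    if (intern_count == INTERN_MAX)
        return;                          /* table full: leave it uninterned */
    intern_table[intern_count++] = s;    /* table reference is not counted */
    s->interned = 1;
}
```

After interning, two strings with equal contents are represented by one object, so an equality test can be reduced to a pointer comparison.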

Character buffer pool:

static PyStringObject *characters[UCHAR_MAX + 1];

UCHAR_MAX is a platform-dependent constant defined in limits.h.

On win32: #define UCHAR_MAX 0xff

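Use the join operation of PyStringObject to concatenate a group of PyStringObject objects stored in a list or tuple. Join needs only one memory allocation for the result, whereas chaining + allocates a new string for every intermediate result, so join is much more efficient.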
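The single-allocation strategy can be sketched on plain C strings: first add up the lengths, then allocate once, then copy. The function join_strings is a hypothetical helper written for this note, not part of CPython.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Join n strings with sep between them, using exactly one malloc().
   The caller owns the result and must free() it. */
static char *join_strings(const char *sep, const char *const *items, size_t n)
{
    size_t sep_len = strlen(sep);
    size_t total = 0;
    char *result, *p;

    /* Pass 1: compute the final length. */
    for (size_t i = 0; i < n; i++)
        total += strlen(items[i]);
    if (n > 1)
        total += sep_len * (n - 1);

    /* The only allocation. */
    result = malloc(total + 1);
    if (result == NULL)
        return NULL;

    /* Pass 2: copy every piece into place. */
    p = result;
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(items[i]);
        memcpy(p, items[i], len);
        p += len;
        if (i + 1 < n) {
            memcpy(p, sep, sep_len);
            p += sep_len;
        }
    }
    *p = '\0';
    assert((size_t)(p - result) == total);
    return result;
}
```

Concatenating the same n strings pairwise would instead create n - 1 intermediate buffers, each copying everything accumulated so far.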


List objects in Python

typedef struct {
    PyObject_VAR_HEAD
    /* Vector of pointers to list elements.  list[0] is ob_item[0], etc. */
    PyObject **ob_item;

    /* ob_item contains space for 'allocated' elements.  The number
     * currently in use is ob_size.
     * Invariants:
     *     0 <= ob_size <= allocated
     *     len(list) == ob_size
     *     ob_item == NULL implies ob_size == allocated == 0
     * list.sort() temporarily sets allocated to -1 to detect mutations.
     *
     * Items must normally not be NULL, except during construction when
     * the list is not yet visible outside the function that builds it.
     */
    Py_ssize_t allocated;
} PyListObject;

One way to look at PyListObject: vector<PyObject*>

The relationship between ob_size and allocated is the same as that between size and capacity in a C++ vector.
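The size/capacity relationship can be shown with a minimal growable pointer array. PtrVec and its functions are hypothetical, and the doubling growth policy is an assumption chosen for brevity; it is not presented as the policy CPython's list uses.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    void **items;       /* like ob_item */
    size_t size;        /* like ob_size: slots in use */
    size_t allocated;   /* like allocated: slots available */
} PtrVec;

/* Append one pointer, growing the buffer only when it is full.
   Returns 0 on success, -1 if memory could not be obtained. */
static int ptrvec_append(PtrVec *v, void *item)
{
    if (v->size == v->allocated) {
        /* Assumed policy: start at 4 slots, then double. */
        size_t new_alloc = v->allocated ? v->allocated * 2 : 4;
        void **p = realloc(v->items, new_alloc * sizeof(void *));
        if (p == NULL)
            return -1;
        v->items = p;
        v->allocated = new_alloc;
    }
    v->items[v->size++] = item;
    assert(v->size <= v->allocated);   /* the invariant from the comment */
    return 0;
}

/* Release the buffer and restore the empty state
   (items == NULL implies size == allocated == 0). */
static void ptrvec_clear(PtrVec *v)
{
    free(v->items);
    v->items = NULL;
    v->size = 0;
    v->allocated = 0;
}
```

Because allocated runs ahead of size, most appends just store a pointer and only an occasional one pays for a realloc.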

Creating and maintaining PyListObject objects

PyList_New


