Python __slots__: make your code more memory-efficient
Source: Internet  Editor: 程序博客网  Time: 2024/05/21 15:48
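By default, every instance of a Python class stores its attributes in a per-instance dictionary (__dict__). This default can be changed by defining a __slots__ attribute on a new-style class. The __slots__ declaration lists the instance variables and reserves just enough space in each instance to hold one value per variable, so no dictionary is created for each instance and memory is saved.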
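A minimal sketch of that behaviour, using a hypothetical Point class that is not part of the original article: an instance of a slotted class has no __dict__, and assigning an attribute that is not listed in __slots__ fails.

```python
class Point(object):
    # Only these two attributes get storage in each instance.
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
has_dict = hasattr(p, '__dict__')   # no per-instance dictionary exists

try:
    p.z = 3                         # 'z' is not declared in __slots__
    rejected = False
except AttributeError:
    rejected = True

print(has_dict, rejected)
```

The trade-off is visible here: the memory saving comes at the cost of not being able to attach arbitrary new attributes to an instance.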
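So why does a dict in Python waste more memory than a list?
Compared with a list, a dict is very fast at lookup and insertion, and on average that speed does not degrade as the number of keys grows; the price is that a dict needs a lot of memory, much of it wasted.
With a list, searching for a value or inserting at an arbitrary position takes time that grows with the number of elements; in exchange, a list occupies little space and wastes very little memory.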
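One way to see the difference in lookup cost without relying on timings is to count equality comparisons. The sketch below uses a hypothetical CountingKey class (an illustration, not something from the original article) and looks up the last of 1000 keys in a list and in a dict.

```python
class CountingKey(object):
    """Hashable key that counts the equality comparisons it takes part in."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        CountingKey.comparisons += 1
        return self.value == other.value


def count_comparisons(container, probe):
    """Return (found, number of __eq__ calls) for `probe in container`."""
    CountingKey.comparisons = 0
    found = probe in container
    return found, CountingKey.comparisons


keys = [CountingKey(i) for i in range(1000)]
as_list = list(keys)
as_dict = dict.fromkeys(keys)

# A new object equal to the last key, so identity shortcuts do not apply.
probe = CountingKey(999)

list_found, list_cmp = count_comparisons(as_list, probe)
dict_found, dict_cmp = count_comparisons(as_dict, probe)
print(list_cmp, dict_cmp)
```

The list has to compare the probe against every element before it reaches the match, while the dict jumps to the right slot through the hash and only confirms the match.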
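In CPython, the reference interpreter, these two data structures correspond to a C hash table and a C array. A hash table needs extra memory to record the mapping (the cached hash, the key and the value of every entry, plus empty slots that keep lookups fast), whereas an array only needs an index to compute the position of the next element. A hash table therefore takes more memory than an array, which means a dict takes more memory than a list.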
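A quick way to check this on your own interpreter is sys.getsizeof, which reports the size of the container itself (not of the objects it refers to). This is a rough sketch; the exact byte counts depend on the Python version and platform.

```python
import sys

n = 1000
as_list = list(range(n))
as_dict = dict.fromkeys(range(n))

# Size of the container structures only; the int objects are not counted.
list_bytes = sys.getsizeof(as_list)
dict_bytes = sys.getsizeof(as_dict)

print("list: %d bytes, dict: %d bytes" % (list_bytes, dict_bytes))
```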
For more detail, you can read the C source code. Official Python link: https://www.python.org/downloads/source/
The following code snippets are taken from the official Python source:
List source:
typedef struct {
    PyObject_VAR_HEAD
    /* Vector of pointers to list elements.  list[0] is ob_item[0], etc. */
    PyObject **ob_item;

    /* ob_item contains space for 'allocated' elements.  The number
     * currently in use is ob_size.
     * Invariants:
     *     0 <= ob_size <= allocated
     *     len(list) == ob_size
     *     ob_item == NULL implies ob_size == allocated == 0
     * list.sort() temporarily sets allocated to -1 to detect mutations.
     *
     * Items must normally not be NULL, except during construction when
     * the list is not yet visible outside the function that builds it.
     */
    Py_ssize_t allocated;
} PyListObject;
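The 'allocated' field means a list usually holds a little more room than it currently uses, so that appends do not have to reallocate every time. A small sketch that makes this visible from Python on CPython, by watching the container size while appending (the exact sizes vary by version):

```python
import sys

items = []
sizes = []
for i in range(64):
    items.append(i)
    sizes.append(sys.getsizeof(items))

# The size only changes when the spare capacity is used up,
# so there are far fewer distinct sizes than appends.
growth_steps = len(set(sizes))
print(growth_steps, sizes[0], sizes[-1])
```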
Dict source:
/* PyDict_MINSIZE is the minimum size of a dictionary.  This many slots are
 * allocated directly in the dict object (in the ma_smalltable member).
 * It must be a power of 2, and at least 4.  8 allows dicts with no more
 * than 5 active entries to live in ma_smalltable (and so avoid an
 * additional malloc); instrumentation suggested this suffices for the
 * majority of dicts (consisting mostly of usually-small instance dicts and
 * usually-small dicts created to pass keyword arguments).
 */
#define PyDict_MINSIZE 8

typedef struct {
    /* Cached hash code of me_key.  Note that hash codes are C longs.
     * We have to use Py_ssize_t instead because dict_popitem() abuses
     * me_hash to hold a search finger.
     */
    Py_ssize_t me_hash;
    PyObject *me_key;
    PyObject *me_value;
} PyDictEntry;

/*
To ensure the lookup algorithm terminates, there must be at least one Unused
slot (NULL key) in the table.
The value ma_fill is the number of non-NULL keys (sum of Active and Dummy);
ma_used is the number of non-NULL, non-dummy keys (== the number of non-NULL
values == the number of Active items).
To avoid slowing down lookups on a near-full table, we resize the table when
it's two-thirds full.
*/
typedef struct _dictobject PyDictObject;
struct _dictobject {
    PyObject_HEAD
    Py_ssize_t ma_fill;  /* # Active + # Dummy */
    Py_ssize_t ma_used;  /* # Active */

    /* The table contains ma_mask + 1 slots, and that's a power of 2.
     * We store the mask instead of the size because the mask is more
     * frequently needed.
     */
    Py_ssize_t ma_mask;

    /* ma_table points to ma_smalltable for small tables, else to
     * additional malloc'ed memory.  ma_table is never NULL!  This rule
     * saves repeated runtime null-tests in the workhorse getitem and
     * setitem calls.
     */
    PyDictEntry *ma_table;
    PyDictEntry *(*ma_lookup)(PyDictObject *mp, PyObject *key, long hash);
    PyDictEntry ma_smalltable[PyDict_MINSIZE];
};
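To make the comments above concrete, here is a toy open-addressing table in Python. It is a simplified sketch, not CPython's algorithm: the ToyDict name, linear probing, doubling on resize and the lack of deletion are all assumptions made for illustration. What it does mirror is the layout described above: a power-of-two table addressed through a mask, entries that cache the hash next to the key and value, and a resize once the table is two-thirds full.

```python
class ToyDict(object):
    """Toy open-addressing hash table (linear probing, no deletion)."""
    MINSIZE = 8  # same starting size as PyDict_MINSIZE

    def __init__(self):
        self.mask = self.MINSIZE - 1        # the table has mask + 1 slots
        self.table = [None] * self.MINSIZE  # slot: None or (hash, key, value)
        self.used = 0

    @staticmethod
    def _find_slot(table, mask, key, h):
        """Return the index of the slot holding key, or of the first empty slot."""
        i = h & mask
        while True:
            entry = table[i]
            if entry is None or (entry[0] == h and entry[1] == key):
                return i
            i = (i + 1) & mask  # simplification: CPython probes differently

    def __setitem__(self, key, value):
        h = hash(key)
        i = self._find_slot(self.table, self.mask, key, h)
        if self.table[i] is None:
            self.used += 1
        self.table[i] = (h, key, value)
        # Keep at least a third of the slots empty so probing always terminates.
        if self.used * 3 >= (self.mask + 1) * 2:
            self._resize()

    def __getitem__(self, key):
        h = hash(key)
        entry = self.table[self._find_slot(self.table, self.mask, key, h)]
        if entry is None:
            raise KeyError(key)
        return entry[2]

    def _resize(self):
        new_size = (self.mask + 1) * 2      # assumption: simply double
        new_table = [None] * new_size
        new_mask = new_size - 1
        for entry in self.table:
            if entry is not None:
                i = self._find_slot(new_table, new_mask, entry[1], entry[0])
                new_table[i] = entry
        self.table, self.mask = new_table, new_mask


toy = ToyDict()
for number in range(100):
    toy[str(number)] = number
print(toy["42"], toy.used, toy.mask + 1)
```

The empty slots and the cached hashes are exactly the extra memory a plain array of pointers does not need.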
PyObject_HEAD source:
#ifdef Py_TRACE_REFS
/* Define pointers to support a doubly-linked list of all live heap objects. */
#define _PyObject_HEAD_EXTRA            \
    struct _object *_ob_next;           \
    struct _object *_ob_prev;

#define _PyObject_EXTRA_INIT 0, 0,

#else
#define _PyObject_HEAD_EXTRA
#define _PyObject_EXTRA_INIT
#endif

/* PyObject_HEAD defines the initial segment of every PyObject. */
#define PyObject_HEAD                   \
    _PyObject_HEAD_EXTRA                \
    Py_ssize_t ob_refcnt;               \
    struct _typeobject *ob_type;
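Both header fields can be observed from Python on CPython: sys.getrefcount reads the reference count (ob_refcnt) and type() follows the type pointer (ob_type). A small sketch with a hypothetical Thing class:

```python
import sys


class Thing(object):
    pass


obj = Thing()
before = sys.getrefcount(obj)
alias = obj                      # bind a second name to the same object
after = sys.getrefcount(obj)

print(after - before, type(obj) is Thing)
```

Every object carries these two fields, which is why even an "empty" instance is never free in terms of memory.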
PyObject_VAR_HEAD source:
/* PyObject_VAR_HEAD defines the initial segment of all variable-size
 * container objects.  These end with a declaration of an array with 1
 * element, but enough space is malloc'ed so that the array actually
 * has room for ob_size elements.  Note that ob_size is an element count,
 * not necessarily a byte count.
 */
#define PyObject_VAR_HEAD               \
    PyObject_HEAD                       \
    Py_ssize_t ob_size; /* Number of items in variable part */
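A tuple is a convenient variable-size object to check this with, because it stores its item pointers inline right after the header (a list instead points to a separate ob_item array). In this sketch, assuming CPython, each extra element adds exactly one item slot to the object size:

```python
import sys

empty_bytes = sys.getsizeof(())
three_bytes = sys.getsizeof((None, None, None))
per_item = tuple.__itemsize__    # size of one inline item slot (a pointer)

print(empty_bytes, three_bytes, per_item)
```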
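Now we know why a dict takes more memory than a list. Next, let's look at how to make your classes more memory-efficient.
There are two solutions:
The first is to use __slots__; the other is to use collections.namedtuple.
First, write a class in the standard way: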
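#!/usr/bin/env python
# Assumption: @profile is the decorator from the third-party memory_profiler package.
from memory_profiler import profile


class Foobar(object):
    def __init__(self, x):
        self.x = x


@profile
def main():
    # Keep one million instances alive so their memory shows up in the report.
    f = [Foobar(42) for i in range(1000000)]


if __name__ == "__main__":
    main()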
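The script defines a class Foobar and instantiates it 1,000,000 times; the @profile decorator reports the memory usage.
Result: this code used 372 MB of memory in total.
Next, the same code implemented with __slots__: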
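#!/usr/bin/env python
# Assumption: @profile is the decorator from the third-party memory_profiler package.
from memory_profiler import profile


class Foobar(object):
    # Declare the only attribute; instances get no __dict__.
    __slots__ = ('x',)

    def __init__(self, x):
        self.x = x


@profile
def main():
    # Keep one million instances alive so their memory shows up in the report.
    f = [Foobar(42) for i in range(1000000)]


if __name__ == "__main__":
    main()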
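Result: with __slots__ the code used 91 MB of memory, roughly a quarter of what the version storing attributes in __dict__ needed (372 MB / 91 MB ≈ 4.1).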
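If the @profile decorator is not available, a similar comparison can be sketched with the standard-library tracemalloc module. This is a different measuring technique from the one used for the figures above, so the numbers will not match 372 MB and 91 MB; the class names Plain and Slotted and the smaller instance count are assumptions chosen to keep the run short.

```python
import tracemalloc


class Plain(object):
    def __init__(self, x):
        self.x = x


class Slotted(object):
    __slots__ = ('x',)

    def __init__(self, x):
        self.x = x


def allocated_bytes(cls, count=100000):
    """Bytes still allocated after creating `count` instances of cls."""
    tracemalloc.start()
    instances = [cls(42) for _ in range(count)]
    current, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del instances
    return current


plain_bytes = allocated_bytes(Plain)
slotted_bytes = allocated_bytes(Slotted)
print("plain: %d bytes, slotted: %d bytes" % (plain_bytes, slotted_bytes))
```

The ratio between the two depends heavily on the Python version, since newer interpreters store instance dictionaries more compactly than the one used for the measurements above.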
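In fact, namedtuple from the collections module can achieve the same effect as __slots__. A namedtuple class inherits from tuple, and its __slots__ is set to an empty tuple so that no __dict__ is created.
Let's see how collections implements this: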
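A minimal sketch that checks these two properties on a generated class, reusing the Foobar name from the examples above (the field list is an assumption for illustration):

```python
from collections import namedtuple

# The generated class subclasses tuple and declares an empty __slots__.
Foobar = namedtuple('Foobar', ['x'])

f = Foobar(42)
is_tuple_subclass = issubclass(Foobar, tuple)
slots = Foobar.__slots__

try:
    f.x = 1                  # fields are read-only, like tuple items
    immutable = False
except AttributeError:
    immutable = True

print(is_tuple_subclass, slots, f.x, immutable)
```

Unlike a class with __slots__, a namedtuple instance is immutable, so it only fits objects whose fields never change after creation.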
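Compared with creating a class the ordinary way, a namedtuple also saves a good deal of memory. So when the set of attributes of a class is known to be fixed, __slots__ can be used to optimize memory. However, this technique should not be abused merely to make classes static or for similar purposes; that is not in the spirit of Python programs.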
This article comes from the “David” blog; please keep this attribution: http://davidbj.blog.51cto.com/4159484/1677587