Python: A First Look at Weak References & Garbage Collection

Source: Internet  Editor: 程序博客网  Date: 2024/05/29 13:30
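1.
While reading the source of Value in multiprocessing.sharedctypes (Python's shared memory for multiple processes), I ran into weak references for the first time: weakref.
Doc 1: https://www.rddoc.com/doc/Python-3.6.0/library/weakref/
Doc 2: https://docs.python.org/2/library/weakref.html?highlight=weakref
Doc 3: http://python.jobbole.com/85431/
Basic usage:
import weakref
ref = weakref.ref(obj)  # create a weak reference; call ref() to get the object back (None once it is gone)
ref2 = weakref.proxy(obj)  # create a weak-reference proxy; used like obj itself, similar to ref() except that it is not hashable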
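
To make the difference between ref() and proxy() concrete, here is a minimal sketch. It is written for Python 3 (the rest of this post uses Python 2 syntax), the Man class is only an illustrative stand-in, and it assumes CPython, where an object is freed as soon as its last strong reference goes away.

```python
import weakref


class Man:
    """Illustrative class; instances of user-defined classes can be weakly referenced."""

    def __init__(self, name):
        self.name = name


obj = Man("Jim")

ref = weakref.ref(obj)      # a ref must be called to get the object back
proxy = weakref.proxy(obj)  # a proxy forwards attribute access to the object

name_via_ref = ref().name    # ref() returns the object while it is alive
name_via_proxy = proxy.name  # no call needed with a proxy

del obj  # drop the only strong reference; CPython frees the object right away

ref_after = ref()  # a dead ref returns None
try:
    proxy.name     # a dead proxy raises ReferenceError on use
    proxy_error = None
except ReferenceError as exc:
    proxy_error = exc
```

In short: with ref() you check for None before using the result, while with proxy() you have to be ready to catch ReferenceError.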
 
2.
Here is a basic weak reference example (Python 2 syntax; I copied it from somewhere on the web, haha):
# -*- coding: utf-8 -*-
import sys
import weakref


class Man(object):
    def __init__(self, name):
        self.name = name


o = Man('Jim')
print "when create, o's refcount=", sys.getrefcount(o)

r = weakref.ref(o)  # create a weak reference
print "after weakref, o's refcount=", sys.getrefcount(o)  # the reference count has not changed

o2 = r()  # get the object the weak reference points to
print "o is o2 =", o is o2
print "after o2 point to o, o's refcount=", sys.getrefcount(o)

o3 = o  # an ordinary (strong) reference adds 1
print "o is o3 =", o is o3
print "after o3 point to o, o's refcount=", sys.getrefcount(o)

o = None
o2 = None
o3 = None
print "after o=None, o2=None, o3=None, weakref=", r
Output:
when create, o's refcount= 2
after weakref, o's refcount= 2
o is o2 = True
after o2 point to o, o's refcount= 3
o is o3 = True
after o3 point to o, o's refcount= 4
after o=None, and o2=None, weakref= <weakref at 0x10e34a8e8; dead>
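What we can see:
Creating the weak reference r leaves the reference count unchanged (still 2).
o2 = r() returns the object itself, so o2 is an ordinary strong reference and the count goes up by 1 (2 to 3); o3 = o adds 1 again (3 to 4).
Once o, o2 and o3 are all set to None, the object has no strong references left, so it is freed and the weak reference r shows up as dead.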

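3.
While using sys.getrefcount() I noticed that the count always looks 1 higher than the real number. What is going on?
The reason is given in its docstring: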
def getrefcount(p_object): # real signature unknown; restored from __doc__
    """
    getrefcount(object) -> integer

    Return the reference count of object.  The count returned is generally
    one higher than you might expect, because it includes the (temporary)
    reference as an argument to getrefcount().
    """
    return 0
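
Because of that extra temporary reference, absolute values from sys.getrefcount() are easy to misread, and comparing counts before and after a change is usually more telling. A small Python 3 sketch of that idea (the Man class and the variable names are made up for illustration; the exact baseline value can differ between interpreter versions, so only the differences are meaningful):

```python
import sys


class Man:
    """Illustrative class used only to have an object to count."""


o = Man()

base = sys.getrefcount(o)         # includes the temporary argument reference
alias = o                         # one more strong reference
after_alias = sys.getrefcount(o)  # expected: base + 1
del alias                         # remove that strong reference again
after_del = sys.getrefcount(o)    # expected: back to base

delta_alias = after_alias - base
delta_del = after_del - base
```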

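4.
So I roughly get the idea now. But where are weak references useful in practice? They are closely tied to garbage collection, which we rarely have to think about in day-to-day Python development.
First, a look at Python's garbage collection module, gc:
gc.set_debug(gc.DEBUG_LEAK): turns on verbose debug output, intended for debugging a leaking program. DEBUG_LEAK includes DEBUG_SAVEALL, so every unreachable object the collector finds is kept in gc.garbage instead of being freed.
gc.collect(): runs a collection and returns the number of unreachable objects found.
gc.garbage: a list of objects that the collector found to be unreachable but could not free.
One more basic fact: in CPython, an object is reclaimed as soon as its reference count drops to 0.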
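
Reference counting alone cannot free objects that refer to each other in a cycle, and that is the job of gc.collect(). A minimal Python 3 sketch, assuming CPython; the Node class and the probe name are made up for illustration, and automatic collection is switched off temporarily so the result does not depend on when the collector happens to run:

```python
import gc
import weakref


class Node:
    """Illustrative node that can point at one other node."""

    def __init__(self, name):
        self.name = name
        self.other = None


gc.disable()   # keep automatic collection out of the way for the demo
gc.collect()   # start from a clean state

a = Node("a")
b = Node("b")
a.other = b
b.other = a    # a <-> b now form a reference cycle

probe = weakref.ref(a)  # lets us observe a without keeping it alive

del a, b
# The names are gone, but each node is still referenced by the other one,
# so the reference counts never reach 0.
alive_before_collect = probe() is not None

found = gc.collect()    # the cycle collector finds and frees the pair
alive_after_collect = probe() is not None

gc.enable()
```

Note that the weak reference is used here purely as an observer: it tells us whether the object still exists without affecting its lifetime.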

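5.
Here is an article on a practical use of weak references: http://sleepd.blog.51cto.com/3034090/1073044
The gist: if your project contains circular references, make the link that closes the cycle a weak reference so that the objects can be garbage collected without trouble. The following demo (Python 2) first shows the problem: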
import sys
import gc
from pprint import pprint


class Graph(object):
    def __init__(self, name):
        self.name = name
        self.other = None

    def set_next(self, other):
        print "%s.set_next(%r)" % (self.name, other)
        self.other = other

    def all_nodes(self):
        yield self
        n = self.other
        while n and n.name != self.name:
            yield n
            n = n.other
        if n is self:
            yield n
        return

    def __str__(self):
        return "->".join(n.name for n in self.all_nodes())

    def __repr__(self):
        return "<%s at 0x%x name=%s>" % (self.__class__.__name__, id(self), self.name)

    def __del__(self):
        print "(Deleting %s)" % self.name


def collect_and_show_garbage():
    print "Collecting..."
    n = gc.collect()
    print "unreachable objects:", n
    print "garbage:",
    pprint(gc.garbage)


def demo(graph_factory):
    print "Set up graph:"
    one = graph_factory("one")
    two = graph_factory("two")
    three = graph_factory("three")
    one.set_next(two)
    two.set_next(three)
    three.set_next(one)

    print
    print "Graph:"
    print str(one)
    collect_and_show_garbage()
    print "one, getrefcount = ", sys.getrefcount(one)
    print "two, getrefcount = ", sys.getrefcount(two)
    print "three, getrefcount = ", sys.getrefcount(three)

    print
    three = None
    two = None
    print "After 2 references removed"
    print str(one)
    collect_and_show_garbage()
    print "one, getrefcount = ", sys.getrefcount(one)
    print "two, getrefcount = ", sys.getrefcount(two)
    print "three, getrefcount = ", sys.getrefcount(three)

    print
    print "removeing last reference"
    one = None
    collect_and_show_garbage()
    print "one, getrefcount = ", sys.getrefcount(one)
    print "two, getrefcount = ", sys.getrefcount(two)
    print "three, getrefcount = ", sys.getrefcount(three)


if __name__ == '__main__':
    gc.set_debug(gc.DEBUG_LEAK)
    print "Setting up the cycle"
    print
    demo(Graph)
Output (with gc.set_debug(gc.DEBUG_LEAK) enabled):
Setting up the cycle

Set up graph:
one.set_next(<Graph at 0x103b35e10 name=two>)
two.set_next(<Graph at 0x103b35e50 name=three>)
three.set_next(<Graph at 0x103b35dd0 name=one>)

Graph:
one->two->three->one
Collecting...
unreachable objects: 0
garbage:[]
one, getrefcount =  3
two, getrefcount =  3
three, getrefcount =  3

After 2 references removed
one->two->three->one
Collecting...
unreachable objects: 0
garbage:[]
one, getrefcount =  3
two, getrefcount =  886
three, getrefcount =  886

removeing last reference
Collecting...
gc: uncollectable <Graph 0x103b35dd0>
gc: uncollectable <Graph 0x103b35e10>
gc: uncollectable <Graph 0x103b35e50>
gc: uncollectable <dict 0x103b33e88>
gc: uncollectable <dict 0x103b3a398>
gc: uncollectable <dict 0x103b3bc58>
unreachable objects: 6
garbage:[<Graph at 0x103b35dd0 name=one>,
 <Graph at 0x103b35e10 name=two>,
 <Graph at 0x103b35e50 name=three>,
 {'name': 'one', 'other': <Graph at 0x103b35e10 name=two>},
 {'name': 'two', 'other': <Graph at 0x103b35e50 name=three>},
 {'name': 'three', 'other': <Graph at 0x103b35dd0 name=one>}]
one, getrefcount =  887
two, getrefcount =  887
three, getrefcount =  887
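We link the three objects one, two and three into a ring. Even after all three names are set to None, which makes the objects unreachable, the cycle keeps each object's reference count above 0, so reference counting alone never frees them. In Python 2 the cycle collector also refuses to free a cycle whose objects define __del__, so these objects are both unreachable and uncollectable, and they end up in the gc.garbage list.
By the way, is nobody curious why sys.getrefcount(one) reports such large numbers as 886 and 887? By that point the name has been rebound to None, so the call is really sys.getrefcount(None), and None is referenced from a great many other places inside Python. If that is not clear, see this Stack Overflow question: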
https://stackoverflow.com/questions/44096794/why-does-sys-getrefcount-give-huge-values
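
The effect is easy to reproduce. A short Python 3 sketch (variable names are illustrative; the exact numbers depend on the interpreter version and on what has been imported, so the code only compares magnitudes):

```python
import sys

one = object()
count_for_object = sys.getrefcount(one)  # small: only a couple of references

one = None                               # the name now refers to None
count_for_name = sys.getrefcount(one)    # really sys.getrefcount(None)

# None is shared by the whole interpreter, so its count is large.
looks_huge = count_for_name > 100 and count_for_name > count_for_object
```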

6.
The fix for the demo in 5: weak references!
import gc
from pprint import pprint
import weakref


class Graph(object):
    def __init__(self, name):
        self.name = name
        self.other = None

    def set_next(self, other):
        print "%s.set_next(%r)" % (self.name, other)
        if other is not None:
            if self in other.all_nodes():
                # Changed here: use a weak reference where the link would close the ring.
                other = weakref.proxy(other)
        self.other = other

    def all_nodes(self):
        yield self
        n = self.other
        while n and n.name != self.name:
            yield n
            n = n.other
        if n is self:
            yield n
        return

    def __str__(self):
        return "->".join(n.name for n in self.all_nodes())

    def __repr__(self):
        return "<%s at 0x%x name=%s>" % (self.__class__.__name__, id(self), self.name)

    def __del__(self):
        print "(Deleting %s)" % self.name


def collect_and_show_garbage():
    print "Collecting..."
    n = gc.collect()
    print "unreachable objects:", n
    print "garbage:",
    pprint(gc.garbage)


def demo(graph_factory):
    print "Set up graph:"
    one = graph_factory("one")
    two = graph_factory("two")
    three = graph_factory("three")
    one.set_next(two)
    two.set_next(three)
    three.set_next(one)

    print
    print "Graph:"
    print str(one)
    collect_and_show_garbage()

    print
    three = None
    two = None
    print "After 2 references removed"
    print str(one)
    collect_and_show_garbage()

    print
    print "removeing last reference"
    one = None
    collect_and_show_garbage()


if __name__ == '__main__':
    gc.set_debug(gc.DEBUG_LEAK)
    print "Setting up the cycle"
    print
    demo(Graph)
Output:
Setting up the cycle

Set up graph:
one.set_next(<Graph at 0x101f5fd90 name=two>)
two.set_next(<Graph at 0x101f5fdd0 name=three>)
three.set_next(<Graph at 0x101f5fd50 name=one>)

Graph:
one->two->three
Collecting...
unreachable objects: 0
garbage:[]

After 2 references removed
one->two->three
Collecting...
unreachable objects: 0
garbage:[]

removeing last reference
(Deleting one)
(Deleting two)
(Deleting three)
Collecting...
unreachable objects: 0
garbage:[]
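That's it. Only the logic in set_next() changed: the link that would close the ring is now a weak reference. (You could also make every link weak, but then something else has to hold strong references to keep the nodes alive.)
All three objects are freed as soon as the last reference is removed, before gc.collect() even runs, and gc.garbage stays empty.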
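
For readers on Python 3, the same idea can be sketched as follows. This is not the code from the article: it uses weakref.ref instead of weakref.proxy, and the caller marks the closing link as weak explicitly instead of detecting the ring inside set_next(). The Node class, the weak flag and the deleted list are all made up for illustration, and the immediate cleanup relies on CPython's reference counting.

```python
import gc
import weakref

deleted = []  # records the order in which nodes are finalized


class Node:
    """Illustrative ring node whose link can be strong or weak."""

    def __init__(self, name):
        self.name = name
        self._next = None  # either a Node or a weakref.ref to a Node

    def set_next(self, other, weak=False):
        # A weak link does not keep `other` alive.
        self._next = weakref.ref(other) if weak else other

    @property
    def next(self):
        nxt = self._next
        # Dereference weak links; a dead weak link yields None.
        return nxt() if isinstance(nxt, weakref.ref) else nxt

    def __del__(self):
        deleted.append(self.name)


gc.disable()  # show that the cycle collector is not needed here

one, two, three = Node("one"), Node("two"), Node("three")
one.set_next(two)
two.set_next(three)
three.set_next(one, weak=True)  # the closing link is weak: no strong cycle

ring_ok = three.next is one     # the ring can still be walked

del two, three                  # still reachable through one -> two -> three
deleted_before = list(deleted)

del one                         # last strong reference: the whole chain is freed
deleted_after = list(deleted)

gc.enable()
```

The explicit weak flag keeps the sketch short; detecting the ring automatically, as the article's set_next() does, is friendlier to callers but costs a walk over the nodes on every call.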

That's all.
 