(Django) Limiting memory usage for QuerySets over large data sets such as objects.all()

Source: Internet  Editor: 程序博客网  Time: 2024/06/03 18:10

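Problem

When working with Django, you often need to iterate over a large amount of data, or to walk through a large table in order to migrate or update it, for example: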

    for user in User.objects.all():
        user.A = user.B
        user.B = None
        user.save()

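and many similar cases.

In a local development environment the QuerySet starts out with a very small memory footprint. As the amount of data grows, however, the QuerySet caches every model instance while I iterate over it, so the QuerySet returned by all() keeps getting larger. It may eventually exhaust the available memory, and the hosting provider will kill the process.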

Solution:

    import copy


    class MemorySavingQuerysetIterator(object):
        """Iterate over a queryset while holding at most `max_obj_num` objects in memory."""

        def __init__(self, queryset, max_obj_num=1000):
            self._base_queryset = queryset
            self.max_obj_num = max_obj_num
            self._generator = self._setup()

        def _setup(self):
            # The queryset should have a stable ordering (e.g. order_by('pk')),
            # otherwise consecutive slices may overlap or skip rows.
            for i in range(0, self._base_queryset.count(), self.max_obj_num):
                # By making a copy of the queryset and using that to actually
                # access the objects we ensure that there are only `max_obj_num`
                # objects in memory at any given time
                smaller_queryset = copy.deepcopy(self._base_queryset)[i:i + self.max_obj_num]
                # logger.debug('Grabbing next %s objects from DB' % self.max_obj_num)
                for obj in smaller_queryset.iterator():
                    yield obj

        def __iter__(self):
            return self

        def __next__(self):
            return next(self._generator)

        next = __next__  # Python 2 compatibility

Usage:

    users = User.objects.all()
    for user in MemorySavingQuerysetIterator(users, 100):
        pass
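The same batching idea can also be written as a plain generator function, which avoids the hand-written `__iter__`/`next` pair. What follows is only a minimal sketch, not part of the original class: `iter_in_batches` and `FakeQuerySet` are hypothetical names, and `FakeQuerySet` is a stand-in that mimics `count()`, slicing and `iterator()` so the example runs without a database.

```python
def iter_in_batches(queryset, batch_size=1000):
    """Yield objects from `queryset` one at a time, fetching `batch_size` per slice."""
    total = queryset.count()
    for start in range(0, total, batch_size):
        # Each slice is a separate, small queryset, so only one batch
        # of objects is held in memory at a time.
        for obj in queryset[start:start + batch_size].iterator():
            yield obj


class FakeQuerySet:
    """Hypothetical stand-in for a Django QuerySet (assumption, demo only)."""

    def __init__(self, rows):
        self._rows = list(rows)
        self.slices_taken = 0  # counts how many batches were requested

    def count(self):
        return len(self._rows)

    def __getitem__(self, key):
        # Only slice access is supported, mirroring queryset[start:stop].
        self.slices_taken += 1
        return FakeQuerySet(self._rows[key])

    def iterator(self):
        return iter(self._rows)


users = FakeQuerySet(range(10))
collected = list(iter_in_batches(users, batch_size=4))
```

With a real model, the call would presumably look like `iter_in_batches(User.objects.order_by('pk'), 100)`; the explicit ordering keeps consecutive slices from overlapping.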

Native MySQL access from Python

    import MySQLdb


    class QuerySetIterator(object):
        """Iterate over the rows of a raw SQL query, fetching `max_num` rows per round trip."""

        def __init__(self, cursor, query, max_num):
            self.query = query
            self.max_num = max_num
            self._cursor = cursor
            self._generator = self._setup()

        def _setup(self):
            # 90000000 is only an upper bound on the offset; the loop normally
            # ends as soon as a page comes back empty.
            for i in range(0, 90000000, self.max_num):
                new_query = "{query} limit {limit} offset {offset}".format(
                    query=self.query, limit=self.max_num, offset=i
                )
                self._cursor.execute(new_query)
                result = self._cursor.fetchall()
                if not result:
                    break
                for obj in result:
                    yield obj

        def __iter__(self):
            return self

        def __next__(self):
            return next(self._generator)

        next = __next__  # Python 2 compatibility


    class TestModel(object):
        db = MySQLdb.connect("localhost", "root", "123456", "test")
        cursor = db.cursor()

        def __init__(self, tb_name, max_num=100):
            self.tb_name = tb_name
            self.max_num = max_num
            self._query_sql_tpl = "select * from {tb_name}".format(tb_name=tb_name)

        def query_all(self, query_sql=None):
            if not query_sql:
                query_sql = self._query_sql_tpl
            # Return the batching iterator defined above.
            return QuerySetIterator(self.cursor, query_sql, self.max_num)


    test = TestModel('test')
    result = test.query_all()
    for obj in result:
        print(obj)
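To try the LIMIT/OFFSET batching without a MySQL server, the same idea can be sketched with Python's built-in `sqlite3` module. This is an illustration under assumptions: the in-memory database, the layout of the `test` table and the function name `iter_rows` are made up for the example and are not part of the original code.

```python
import sqlite3


def iter_rows(cursor, query, batch_size):
    """Yield rows of `query` one by one, fetching `batch_size` rows per round trip."""
    offset = 0
    while True:
        # LIMIT/OFFSET values are passed as parameters, not formatted into the SQL.
        cursor.execute(query + " LIMIT ? OFFSET ?", (batch_size, offset))
        rows = cursor.fetchall()
        if not rows:
            break
        for row in rows:
            yield row
        offset += batch_size


# Hypothetical demo data: an in-memory table named `test` with 250 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO test (name) VALUES (?)",
                 [("row%d" % i,) for i in range(250)])

cursor = conn.cursor()
rows = list(iter_rows(cursor, "SELECT id, name FROM test ORDER BY id", 100))
```

The explicit `ORDER BY` matters here: without it the database is free to return rows in a different order for each page, so rows could be repeated or skipped between batches.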