python-2-1 How to filter data in lists, dicts, and sets by condition: list comprehensions and filter

Source: Internet  Published by: marksimos知乎  Editor: 程序博客网  Time: 2024/04/29 05:28

2-1 How to filter data in lists, dicts, and sets by condition

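Prerequisites:
This section uses randint, lambda, timeit, and filter. All code here is Python 2 (print statement, xrange, iteritems).
The usual approach is to iterate over the list and store every element that satisfies the condition in a second list.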
from random import randint
randint(a,b) Return random integer in range [a, b], including both end points.
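The closed interval is easy to check empirically. The following is a minimal sketch, written for Python 3 (an assumption; the rest of this section is Python 2), and the fixed seed and the name samples are illustrative choices only:

```python
from random import randint, seed

seed(0)  # fixed seed, only so that repeated runs draw the same numbers

# Draw many values; every one must fall inside the closed interval [-10, 10].
samples = [randint(-10, 10) for _ in range(10000)]

# With this many draws over 21 possible values, both endpoints show up.
print(min(samples), max(samples))
```

By contrast, random.randrange(-10, 10) excludes the upper bound, in the same way range does.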

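Using timeit: timeit.timeit(stmt, setup, number=N). stmt is the statement to time (for example 'func()'), setup is code that runs once beforehand to bring the required names into scope (for example 'from __main__ import func'), and number is how many times stmt is executed (default: one million). Pass number as a keyword argument, because the third positional parameter is timer.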

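    from timeit import timeit, Timer

    # Time a statement that is entirely self-contained
    print timeit('filter(lambda x: x > 0, [1,3,-5,5,3,3,3,3,3,33,1,2,3,4,5,5,5,5,5,2])', number=1)
    print timeit('[x for x in [1,3,-5,5,3,3,3,3,3,33,1,2,3,4,5,5,5,5,5,2] if x > 0]', number=1)

    # Time a statement that needs a name from the main module (datalist must be defined there)
    print timeit('filter(lambda x: x > 0, datalist)', setup='from __main__ import datalist', number=1)
    print timeit('[x for x in datalist if x > 0]', setup='from __main__ import datalist', number=1)

    # The same with a Timer object (test3 is a function defined in the main module)
    t3 = Timer("test3()", "from __main__ import test3")
    print t3.timeit(1000000)

or, relying on the default of one million runs:

    print timeit('[x for x in [1,3,-5,5,3,3,3,3,3,33,1,2,3,4,5,5,5,5,5,2] if x > 0]')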
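The same comparison can be sketched for Python 3 (an assumption; the snippets in this section are Python 2). There filter is lazy, so it is wrapped in list() to make both candidates do the same amount of work. The helper names with_filter and with_comprehension are hypothetical, and timeit is given callables rather than strings, which removes the need for a setup string:

```python
from random import randint
from timeit import timeit

datalist = [randint(-10, 10) for _ in range(1000)]

def with_filter():
    # list() forces the lazy filter object to be fully evaluated
    return list(filter(lambda x: x > 0, datalist))

def with_comprehension():
    return [x for x in datalist if x > 0]

# timeit accepts a callable; number controls how many times it is called
t_filter = timeit(with_filter, number=1000)
t_comp = timeit(with_comprehension, number=1000)

print("filter:        %.4f s" % t_filter)
print("comprehension: %.4f s" % t_comp)
```

A single run (number=1) is very noisy, so a larger number gives a more trustworthy comparison. The actual timings depend on the machine and interpreter version.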

Solution:

Lists:

randint is inclusive at both ends: randint(-10, 10) produces a random integer between -10 and 10, including the two endpoints -10 and 10.

Generate a random list:

    datalist = [randint(-10, 10) for _ in range(10)]

Method 1: iterate over the list

    res = []
    for x in datalist:
        if x >= 0:
            res.append(x)

Method 2: list comprehension

    [expr for iter_var in iterable]

The key is the for clause: it iterates over every item of iterable, expr is applied to each item, and the result is the list of values that expr produces.

    [x for x in datalist if x >= 0]
    [(x ** 2) for x in xrange(6)]    # same result as map(lambda x: x ** 2, xrange(6))

    [expr for iter_var in iterable if cond_expr]    # multiple for clauses and multiple if clauses are supported
    (expr for iter_var in iterable if cond_expr)    # generator expression

A generator expression saves memory because it does not build the whole list:

    f = open('test.txt', 'r')
    len([word for line in f for word in line.split()])       # number of words
    f.seek(0)    # rewind: the previous line consumed the file iterator
    sum(len(word) for line in f for word in line.split())    # total length of all words
    f.close()
    max(len(x.strip()) for x in open('test.txt'))            # length of the longest line

Method 3: the filter function

The function passed to filter is used as a predicate: its return value is interpreted as a bool.

    filter(...)
        filter(function or None, sequence) -> list, tuple, or string

        Return those items of sequence for which function(item) is true.  If
        function is None, return the items that are true.  If sequence is a tuple
        or string, return the same type, else return a list.

    def func(x):
        if x > 't':
            return x

    filter(lambda x: x >= 0, datalist)
    filter(lambda x: x >= 't', 'strxxx')
    filter(None, 'strxxx')
    filter(func, 'strxxx')
    filter(lambda x: x > 4, tuple1)    # tuple1 is any tuple of numbers; the result is a tuple

help(map)

map builds a list quickly: function is applied to each item of the sequence that follows. If several sequences are passed in, map combines them by calling function with one item from each sequence.

    Help on built-in function map in module __builtin__:

    map(...)
        map(function, sequence[, sequence, ...]) -> list

        Return a list of the results of applying the function to the items of
        the argument sequence(s).  If more than one sequence is given, the
        function is called with an argument list consisting of the corresponding
        item of each sequence, substituting None for missing values when not all
        sequences have the same length.  If the function is None, return a list of
        the items of the sequence (or a list of tuples if more than one sequence).

    map(lambda x, y, z: str(x) + str(y) + str(z), ('xxx', 'yyyzzz'), ('123', '456'), ('abc', 'def'))
    # ==> ['xxx123abc', 'yyyzzz456def']

    str1 = ["aa", "bb", "c c", "d d", "e e"]
    str1 = map(lambda h: h.replace(' ', ''), str1)
    print str1
    # ['aa', 'bb', 'cc', 'dd', 'ee']

help(reduce)

reduce takes the first two elements of sequence and applies function to them to get result 1, then applies function to result 1 and the third element to get result 2, and continues this way until every element has been consumed.

    Help on built-in function reduce in module __builtin__:

    reduce(...)
        reduce(function, sequence[, initial]) -> value

        Apply a function of two arguments cumulatively to the items of a sequence,
        from left to right, so as to reduce the sequence to a single value.
        For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates
        ((((1+2)+3)+4)+5).  If initial is present, it is placed before the items
        of the sequence in the calculation, and serves as a default when the
        sequence is empty.

    # Intersection of the keys of three dicts; note that map is used here as well
    >>> reduce(lambda a, b: a & b, map(dict.viewkeys, [dict1, dict2, dict3]))

    # func here must take two arguments, for example func = lambda a, b: a & b
    print reduce(func, map(dict.viewkeys, [dict1, dict2, dict3]))
    print reduce(lambda x, y: x + y, [1, 2, 3, 4, 5])    # calculates ((((1+2)+3)+4)+5)
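The filter, map, and reduce calls above rely on Python 2 behaviour. For readers on Python 3, here is a hedged sketch of roughly equivalent calls. It assumes Python 3, where filter and map return lazy iterators, reduce lives in functools, and dict.keys() replaces dict.viewkeys(). The sample data (datalist, d1, d2, d3) is made up for illustration.

```python
from functools import reduce  # reduce is no longer a builtin in Python 3

datalist = [3, -1, 0, 7, -5, 2]

# filter returns an iterator; list() materialises it
non_negative = list(filter(lambda x: x >= 0, datalist))
print(non_negative)  # [3, 0, 7, 2]

# On a string, filter yields characters; join them to get a string back
kept = "".join(filter(lambda ch: ch >= "t", "strxxx"))
print(kept)  # txxx

# map also returns an iterator
stripped = list(map(lambda h: h.replace(" ", ""), ["aa", "bb", "c c", "d d", "e e"]))
print(stripped)  # ['aa', 'bb', 'cc', 'dd', 'ee']

# Key intersection of several dicts: keys() views support the & operator
d1 = {"a": 1, "b": 2, "c": 3}
d2 = {"b": 5, "c": 6, "d": 7}
d3 = {"c": 8, "b": 9, "e": 0}
common = reduce(lambda a, b: a & b, map(dict.keys, [d1, d2, d3]))
print(sorted(common))  # ['b', 'c']

total = reduce(lambda x, y: x + y, [1, 2, 3, 4, 5])
print(total)  # 15
```

One further difference: with several sequences of unequal length, Python 3's map stops at the shortest one instead of padding with None.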

    from random import randint
    from timeit import timeit

    # Method 1: iterate over the list
    datalist = [randint(-10, 10) for _ in xrange(10)]
    res = []
    for x in datalist:
        if x >= 0:
            res.append(x)
    print res

    # Method 2: the filter function (note: this keeps only x > 0, while the loop above also keeps 0)
    res2 = filter(lambda x: x > 0, datalist)
    print res2

    # Method 3: list comprehension
    print [x for x in datalist if x > 0]

    # filter with a lambda versus a list comprehension: the list comprehension is faster
    # Using timeit
    print timeit('filter(lambda x: x > 0, [1,3,-5,5,3,3,3,3,3,33,1,2,3,4,5,5,5,5,5,2])', number=1)
    print timeit('[x for x in [1,3,-5,5,3,3,3,3,3,33,1,2,3,4,5,5,5,5,5,2] if x > 0]', number=1)
    print timeit('filter(lambda x: x > 0, datalist)', setup='from __main__ import datalist', number=1)
    print timeit('[x for x in datalist if x > 0]', setup='from __main__ import datalist', number=1)

    # Filter a dict
    dict1 = {x: randint(60, 100) for x in xrange(20014540, 20014550)}
    print dict1
    print {k: v for k, v in dict1.iteritems() if v > 80}

    # Filter a set
    set1 = set(datalist)
    print {x for x in set1 if x % 3 == 0}
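Dict and set filtering carry over to Python 3 almost unchanged. The sketch below assumes Python 3, where dict.items() replaces iteritems(). It uses fixed, made-up data (scores, values) instead of random numbers so the results are reproducible.

```python
# Hypothetical student-id -> score mapping, in the spirit of dict1 above
scores = {20014540: 95, 20014541: 72, 20014542: 88, 20014543: 60, 20014544: 81}

# Dict comprehension: keep only the entries whose value exceeds 80
high = {k: v for k, v in scores.items() if v > 80}
print(high)  # {20014540: 95, 20014542: 88, 20014544: 81}

# The same result with filter: each item is a (key, value) tuple
high_via_filter = dict(filter(lambda kv: kv[1] > 80, scores.items()))

# Set comprehension: keep only the multiples of 3
values = {-9, -3, 0, 4, 6, 7}
multiples_of_3 = {x for x in values if x % 3 == 0}
print(sorted(multiples_of_3))  # [-9, -3, 0, 6]
```

The comprehension forms are usually the more readable choice. The filter variant is shown only to connect with the list examples earlier.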

0 0