Python Study Notes (7): list.sort and Binary Search with bisect

Source: the Internet. Editor: 程序博客网. Date: 2024/06/06 15:43

Code and content are taken from Fluent Python by Luciano Ramalho.

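The list.sort method sorts a list in place: it neither copies the original list nor creates a new list object, and it returns None.
In contrast, the built-in sorted() function builds a new list and returns it, leaving its argument unchanged.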

>>> fruits = ['grape', 'raspberry', 'apple', 'banana']
>>> sorted(fruits)
['apple', 'banana', 'grape', 'raspberry']
>>> fruits
['grape', 'raspberry', 'apple', 'banana']
>>> sorted(fruits, reverse=True)
['raspberry', 'grape', 'banana', 'apple']
>>> sorted(fruits, key=len)
['grape', 'apple', 'banana', 'raspberry']
>>> fruits
['grape', 'raspberry', 'apple', 'banana']
>>> fruits.sort()
>>> fruits
['apple', 'banana', 'grape', 'raspberry']
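
Two details are easy to miss in the session above: list.sort() returns None, and sorted() accepts any iterable, not only lists. The short sketch below illustrates both; the variable names are made up for this illustration.

```python
fruits = ['grape', 'raspberry', 'apple', 'banana']

# list.sort() sorts in place and returns None, so its result
# should not be assigned and used as if it were the sorted list.
result = fruits.sort()
print(result)   # None
print(fruits)   # ['apple', 'banana', 'grape', 'raspberry']

# sorted() accepts any iterable (here a tuple) and always returns a new list.
fruit_tuple = ('grape', 'raspberry', 'apple', 'banana')
by_length = sorted(fruit_tuple, key=len)
print(by_length)  # ['grape', 'apple', 'banana', 'raspberry']
```

Because the sort is stable, 'grape' stays ahead of 'apple' in the key=len result: both have length 5 and keep their original relative order.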

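Once a sequence is sorted, it can be searched much more efficiently. Fortunately, the Python standard library already provides a standard binary search algorithm in the bisect module.
bisect(haystack, needle) performs a binary search for needle in haystack, which must be a sorted sequence. It does not report whether needle is present; it returns the position at which needle can be inserted while keeping haystack in ascending order.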

import bisect
import sys

HAYSTACK = [1, 4, 5, 6, 8, 12, 15, 20, 21, 23, 23, 26, 29, 30]
NEEDLES = [0, 1, 2, 5, 8, 10, 22, 23, 29, 30, 31]

ROW_FMT = '{0:2d} @ {1:2d}    {2}{0:2d}'

def demo(bisect_fn):
    for needle in reversed(NEEDLES):
        # insertion point for needle in the sorted HAYSTACK
        position = bisect_fn(HAYSTACK, needle)
        # one vertical bar per haystack item that precedes the insertion point
        offset = position * '   |'
        print(ROW_FMT.format(needle, position, offset))

if __name__ == '__main__':
    # pass "left" as the last command-line argument to use bisect_left
    if sys.argv[-1] == 'left':
        bisect_fn = bisect.bisect_left
    else:
        bisect_fn = bisect.bisect

    print('DEMO:', bisect_fn.__name__)
    # each item is printed 3 wide so it lines up with the bars below
    print('haystack ->', ' '.join('%3d' % n for n in HAYSTACK))
    demo(bisect_fn)

Output when run with no arguments (bisect), followed by the output when the last argument is left (bisect_left):

DEMO: bisect
haystack ->   1   4   5   6   8  12  15  20  21  23  23  26  29  30
31 @ 14       |   |   |   |   |   |   |   |   |   |   |   |   |   |31
30 @ 14       |   |   |   |   |   |   |   |   |   |   |   |   |   |30
29 @ 13       |   |   |   |   |   |   |   |   |   |   |   |   |29
23 @ 11       |   |   |   |   |   |   |   |   |   |   |23
22 @  9       |   |   |   |   |   |   |   |   |22
10 @  5       |   |   |   |   |10
 8 @  5       |   |   |   |   | 8
 5 @  3       |   |   | 5
 2 @  1       | 2
 1 @  1       | 1
 0 @  0     0

DEMO: bisect_left
haystack ->   1   4   5   6   8  12  15  20  21  23  23  26  29  30
31 @ 14       |   |   |   |   |   |   |   |   |   |   |   |   |   |31
30 @ 13       |   |   |   |   |   |   |   |   |   |   |   |   |30
29 @ 12       |   |   |   |   |   |   |   |   |   |   |   |29
23 @  9       |   |   |   |   |   |   |   |   |23
22 @  9       |   |   |   |   |   |   |   |   |22
10 @  5       |   |   |   |   |10
 8 @  4       |   |   |   | 8
 5 @  2       |   | 5
 2 @  1       | 2
 1 @  0     1
 0 @  0     0

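bisect is actually an alias for bisect_right; its counterpart is bisect_left. The two differ only when needle compares equal to items already in the sequence: bisect_right returns an insertion point after the existing equal items, while bisect_left returns the position of the first equal item, so the new item would go before them.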
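
The difference between the two functions is easiest to see with a duplicated value. The sketch below also uses bisect_left for an actual membership lookup; index_of is a hypothetical helper written for this illustration, not part of the bisect module.

```python
import bisect

haystack = [1, 4, 5, 6, 8, 12, 15, 20, 21, 23, 23, 26, 29, 30]

# 23 occurs twice, at indexes 9 and 10.
left = bisect.bisect_left(haystack, 23)    # 9: before the existing 23s
right = bisect.bisect_right(haystack, 23)  # 11: after the existing 23s
occurrences = right - left                 # 2


def index_of(sorted_seq, needle):
    """Hypothetical helper: index of the first needle in sorted_seq, or -1."""
    i = bisect.bisect_left(sorted_seq, needle)
    # bisect_left only gives an insertion point, so presence must be checked.
    if i < len(sorted_seq) and sorted_seq[i] == needle:
        return i
    return -1


print(left, right, occurrences)   # 9 11 2
print(index_of(haystack, 23))     # 9
print(index_of(haystack, 22))     # -1
```

For a needle that is not in the sequence, such as 22, both functions return the same insertion point.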
An interesting application of bisect is table lookup by numeric value, as in the following example, which converts test scores into letter grades.

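import bisect

def grade(score, breakpoints=[60, 70, 80, 90], grades='FDCBA'):
    # bisect returns how many breakpoints are <= score,
    # which is exactly the index of the matching grade
    i = bisect.bisect(breakpoints, score)
    return grades[i]

print([grade(score) for score in [33, 99, 77, 70, 89, 90, 100]])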

['F', 'A', 'C', 'C', 'B', 'A', 'A']
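
The choice of bisect (that is, bisect_right) matters for scores that fall exactly on a breakpoint. The sketch below compares both variants; grade_right and grade_left are names invented here for the comparison.

```python
import bisect

BREAKPOINTS = [60, 70, 80, 90]
GRADES = 'FDCBA'


def grade_right(score):
    # A score equal to a breakpoint lands after it: 70 earns a 'C'.
    return GRADES[bisect.bisect_right(BREAKPOINTS, score)]


def grade_left(score):
    # A score equal to a breakpoint lands before it: 70 only earns a 'D'.
    return GRADES[bisect.bisect_left(BREAKPOINTS, score)]


scores = [59, 60, 70, 90]
print([grade_right(s) for s in scores])  # ['F', 'D', 'C', 'A']
print([grade_left(s) for s in scores])   # ['F', 'F', 'D', 'B']
```

With bisect_left, each breakpoint would behave as "strictly greater than" instead of "greater than or equal to", which is usually not what a grading table intends.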

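Sorting a sequence is expensive, so once you have a sorted sequence it pays to keep it that way. You could call bisect(haystack, needle) to get an index and then haystack.insert(index, needle) to insert without breaking the order, but bisect.insort does both steps in one call and is faster.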

import bisect
import random

SIZE = 7

random.seed(1729)

my_list = []
for i in range(SIZE):
    new_item = random.randrange(SIZE*2)
    # insert new_item so that my_list stays in ascending order
    bisect.insort(my_list, new_item)
    print('%2d ->' % new_item, my_list)

10 -> [10]
 0 -> [0, 10]
 6 -> [0, 6, 10]
 8 -> [0, 6, 8, 10]
 7 -> [0, 6, 7, 8, 10]
 2 -> [0, 2, 6, 7, 8, 10]
10 -> [0, 2, 6, 7, 8, 10, 10]
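
To make the equivalence concrete, the sketch below does the two-step insertion by hand and compares it with insort. insert_manually is a hypothetical helper defined only for this comparison.

```python
import bisect


def insert_manually(sorted_list, item):
    """Hypothetical helper: the two-step version of bisect.insort."""
    index = bisect.bisect(sorted_list, item)  # find the insertion point
    sorted_list.insert(index, item)           # shift later items and insert


manual = [0, 6, 8, 10]
with_insort = [0, 6, 8, 10]

for item in (7, 2, 10):
    insert_manually(manual, item)
    bisect.insort(with_insort, item)

print(manual)       # [0, 2, 6, 7, 8, 10, 10]
print(with_insort)  # [0, 2, 6, 7, 8, 10, 10]
```

Both approaches find the position with a binary search, but the list insertion itself still has to shift the items that follow, so each insert remains linear in the length of the list.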
