[Study Notes: Python] A Brief Tour of the Python Standard Library, Part II

Source: Internet  Published: 安卓运行php  Editor: 程序博客网  Time: 2024/04/20 21:25

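11 A Brief Tour of the Python Standard Library, Part II

1 Output Formatting
2 Templating
3 Working with Binary Data
4 Multi-threading
5 Logging
6 Weak References
7 Tools for Working with Lists
8 Decimal Floating Point Arithmetic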

1 Output Formatting

The repr module provides a version of repr() customized for abbreviated displays of large or deeply nested containers:

>>> import repr
>>> repr.repr(set('helloword'))
"set(['d', 'e', 'h', 'l', 'o', 'r', ...])"

The pprint module offers more sophisticated control over printing both built-in and user-defined objects, and its output is easier to read:

>>> import pprint
>>> t = [[[['black', 'cyan'], 'white', ['green', 'red']]]]
>>> pprint.pprint(t, width=30)
[[[['black', 'cyan'],
   'white',
   ['green', 'red']]]]

The textwrap module formats paragraphs of text to fit a given screen width:

>>> import textwrap
>>> doc = """The wrap() method is just like fill() except that it returns
... a list of strings instead of one big string with newlines to separate
... the wrapped lines."""
>>>
>>> print textwrap.fill(doc, width=30)
The wrap() method is just like
fill() except that it returns
a list of strings instead of
one big string with newlines
to separate the wrapped lines.

The locale module formats data according to cultural and regional conventions:

>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'zh_CN.utf8')
'zh_CN.utf8'
>>> conv = locale.localeconv()
>>> x = 1234567.8
>>> locale.format("%d", x, grouping=True)
'1,234,567'

2 Templating

The string module includes a versatile Template class with a simple syntax that ordinary users can edit easily. This lets users customize their applications.

>>> from string import Template
>>> t = Template('${village}folk send $$10 to $cause.')
>>> t.substitute(village='Nottingham', cause='the ditch fund')
'Nottinghamfolk send $10 to the ditch fund.'

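Note that $ begins a placeholder, which is later replaced with real data. Surrounding the placeholder name with braces, as in ${village}, allows it to be followed directly by more letters. To produce a literal $, write $$. If substitute() is not given a value for a placeholder in the template, it raises KeyError. With safe_substitute(), placeholders that have no supplied value are left unchanged.

>>> t = Template('Return the $item to $owner.')
>>> t.substitute(item='value')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/string.py", line 172, in substitute
    return self.pattern.sub(convert, self.template)
  File "/usr/lib/python2.7/string.py", line 162, in convert
    val = mapping[named]
KeyError: 'owner'
>>> t.safe_substitute(item='value')
'Return the value to $owner.'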

A Template subclass can specify a custom delimiter, that is, a character other than $. The batch renaming program below, for example, uses %:

import time
import os.path
from string import Template

photofiles = ['img_1074.jpg', 'img_1076.jpg', 'img_1077.jpg']

class BatchRename(Template):
    delimiter = '%'

fmt = raw_input('Enter rename style (%d-date %n-seqnum %f-format):  ')

t = BatchRename(fmt)
date = time.strftime('%d%b%y')
for i, filename in enumerate(photofiles):
    base, ext = os.path.splitext(filename)
    newname = t.substitute(d=date, n=i, f=ext)
    print '{0} --> {1}'.format(filename, newname)

>>> import temp
Enter rename style (%d-date %n-seqnum %f-format):  Steve_%n%f
img_1074.jpg --> Steve_0.jpg
img_1076.jpg --> Steve_1.jpg
img_1077.jpg --> Steve_2.jpg

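3 Working with Binary Data

The struct module provides pack() and unpack() functions for working with variable-length binary record formats. The following code shows how to loop through the header information in a ZIP file without using the zipfile module. The pack codes "H" and "I" represent two-byte and four-byte unsigned numbers respectively. The "<" indicates standard sizes and little-endian byte order.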

import struct

with open('tzipfile.zip', 'rb') as f:
    data = f.read()

start = 0
for i in range(3):                      # show the first 3 file headers
    start += 14
    fields = struct.unpack('<IIIHH', data[start:start+16])
    crc32, comp_size, uncomp_size, filenamesize, extra_size = fields

    start += 16
    filename = data[start:start+filenamesize]
    start += filenamesize
    extra = data[start:start+extra_size]
    print filename, hex(crc32), comp_size, uncomp_size

    start += extra_size + comp_size     # skip to the next header

$ zipinfo tzipfile.zip
Archive:  tzipfile.zip
Zip file size: 442 bytes, number of entries: 3
-rw-rw-r--  3.0 unx        0 bx stor 13-Mar-08 14:31 tzipfile.c
-rw-rw-r--  3.0 unx        0 bx stor 13-Mar-08 14:38 f2.c
-rw-rw-r--  3.0 unx        0 bx stor 13-Mar-08 14:38 f3.c
3 files, 0 bytes uncompressed, 0 bytes compressed:  0.0%
$ python tstruct.py
tzipfile.c 0x0 0 0
f2.c 0x0 0 0
f3.c 0x0 0 0
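To make the format codes concrete, here is a minimal sketch that packs and unpacks one record with the same '<IIIHH' layout. It is written for Python 3, whereas the examples in these notes use Python 2, and the field values are made up for illustration rather than taken from a real archive.

```python
import struct

# Hypothetical values standing in for crc32, comp_size, uncomp_size,
# filenamesize and extra_size; they do not come from a real ZIP file.
fields = (0xDEADBEEF, 120, 300, 10, 0)

# '<' = little-endian with standard sizes,
# 'I' = 4-byte unsigned, 'H' = 2-byte unsigned
record = struct.pack('<IIIHH', *fields)

record_size = struct.calcsize('<IIIHH')    # 3 * 4 + 2 * 2 bytes
unpacked = struct.unpack('<IIIHH', record)

print(record_size, unpacked == fields)
```

The 16-byte size is why the loop above reads data[start:start+16] for each header.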

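4 Multi-threading

Threading is a technique for decoupling tasks that are not sequentially dependent. Threads can improve the responsiveness of a program, for example by running other tasks in the background while waiting for user input.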

import threading, zipfile

class AsyncZip(threading.Thread):
    def __init__(self, infile, outfile):
        threading.Thread.__init__(self)
        self.infile = infile
        self.outfile = outfile
    def run(self):
        f = zipfile.ZipFile(self.outfile, 'w', zipfile.ZIP_DEFLATED)
        f.write(self.infile)
        f.close()
        print 'Finished background zip of: ', self.infile

background = AsyncZip('mydata.txt', 'myarchive.zip')
background.start()
print 'The main program continues to run in foreground.'

background.join()    # Wait for the background task to finish
print 'Main program waited until background was done.'

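The main challenge of multi-threaded programs is coordinating threads that share data or other resources. The threading module provides a number of synchronization primitives, including locks, events, condition variables, and semaphores. These tools are powerful, but minor design errors can lead to problems that are hard to reproduce. A better approach is therefore to concentrate all access to a resource in a single thread, and to use the Queue module to feed that thread with requests from the other threads.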
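The single-owner pattern described above can be sketched as follows. This is an illustrative sketch written for Python 3, where the module is named queue (Queue in Python 2); the worker function, the None sentinel, and the squaring task are assumptions made for the example.

```python
import queue        # named Queue in Python 2
import threading

results = []        # only the worker thread modifies this list
tasks = queue.Queue()

def worker():
    # The one thread that owns `results`; other threads only enqueue work.
    while True:
        item = tasks.get()
        if item is None:            # sentinel value: no more work
            tasks.task_done()
            break
        results.append(item * item)
        tasks.task_done()

t = threading.Thread(target=worker)
t.start()

for n in range(5):
    tasks.put(n)                    # requests are handled in FIFO order
tasks.put(None)                     # ask the worker to stop

tasks.join()                        # block until every queued item is processed
t.join()
print(results)
```

Because only one thread ever touches results, no explicit lock is needed; the queue does the synchronization.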

5 Logging

The logging module offers a flexible, full-featured logging system. At its simplest, log messages are sent to a file or to sys.stderr:

>>> import logging
>>> logging.debug("Debugging info")
>>> logging.info("Information mess")
>>> logging.warning("Warning mess")
WARNING:root:Warning mess
>>> logging.error("Error mess")
ERROR:root:Error mess
>>> logging.critical('Critical error')
CRITICAL:root:Critical error

The logging system can be configured directly from Python, or it can be customized through a user-editable configuration file.
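In the session above, the debug and info calls print nothing because the root logger's default level is WARNING. As a rough sketch of configuring logging directly from Python (written for Python 3), the code below lowers the level on a logger and attaches a handler with its own format. The logger name 'demo' is hypothetical, and an in-memory stream stands in for a log file.

```python
import io
import logging

# An in-memory stream stands in for a log file in this sketch.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(levelname)s:%(name)s:%(message)s'))

logger = logging.getLogger('demo')     # hypothetical logger name
logger.setLevel(logging.DEBUG)         # let DEBUG and INFO records through
logger.addHandler(handler)
logger.propagate = False               # keep records away from the root logger

logger.debug('Debugging info')
logger.warning('Warning mess')

output = stream.getvalue()
print(output)
```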

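6 Weak References

Python manages memory automatically: memory is freed once the last reference to an object has been removed. This works well in most cases, but occasionally an object needs to be tracked only for as long as something else is still using it. Unfortunately, tracking an object itself creates a reference to it. The weakref module provides a way to track objects without creating a reference. When the object is no longer needed, it is automatically removed from the weakref table.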

>>> import weakref, gc
>>> class A:
...     def __init__(self, value):
...         self.value = value
...     def __repr__(self):
...         return str(self.value)
...
>>> a = A(10)                   # create a reference
>>> d = weakref.WeakValueDictionary()
>>> d['primary'] = a            # does not create a reference
>>> d['primary']                # fetch the object if it is still alive
10
>>> del a                       # remove the one reference
>>> gc.collect()                # run garbage collection right away
0
>>> d['primary']                # entry was automatically removed
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    d['primary']                # entry was automatically removed
  File "C:/python26/lib/weakref.py", line 46, in __getitem__
    o = self.data[key]()
KeyError: 'primary'

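7 Tools for Working with Lists

The built-in list type meets many data structure needs, but sometimes performance trade-offs call for alternative list-like structures. The array module provides an array() object that stores only homogeneous data and stores it more compactly. The following example stores numbers as two-byte unsigned binary values (typecode "H") rather than the usual 16 bytes per entry for a regular list of Python int objects.

>>> from array import array
>>> a = array('H', [1,2,3])
>>> sum(a)
6
>>> a[1:3]
array('H', [2, 3])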
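The storage difference can be inspected directly. The sketch below is written for Python 3 and uses the itemsize attribute of the array together with sys.getsizeof(); the exact size of an int object varies between Python versions and platforms, so no specific figure is assumed for it.

```python
import sys
from array import array

a = array('H', [1, 2, 3])

# itemsize is the storage per element in bytes; typecode 'H' is at
# least 2 bytes and is 2 bytes on common platforms.
bytes_in_array = a.itemsize * len(a)

# A list holds references to full int objects, each of which is larger.
bytes_per_int_object = sys.getsizeof(1)

print(a.itemsize, bytes_in_array, bytes_per_int_object)
```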

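The collections module provides a deque() object that is much like a list, with faster appends and pops from the left side but slower lookups in the middle. These objects are well suited for implementing queues and breadth-first tree searches.

>>> from collections import deque
>>> d = deque(["task1", "task2", "task3"])
>>> d.append("task4")
>>> print "Handling", d.popleft()
Handling task1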
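As an illustration of the breadth-first use case, here is a minimal sketch written for Python 3. The tree, stored as a mapping from each node to its children, and the breadth_first function are hypothetical names invented for this example.

```python
from collections import deque

# Hypothetical tree stored as a mapping: node -> list of children.
tree = {
    'a': ['b', 'c'],
    'b': ['d', 'e'],
    'c': ['f'],
    'd': [], 'e': [], 'f': [],
}

def breadth_first(root):
    """Return the nodes in breadth-first order, starting from root."""
    order = []
    unsearched = deque([root])
    while unsearched:
        node = unsearched.popleft()      # cheap removal from the left end
        order.append(node)
        unsearched.extend(tree[node])    # children wait behind the current level
    return order

visit_order = breadth_first('a')
print(visit_order)
```

With a plain list, the equivalent pop(0) would shift every remaining element on each step, which is the cost the deque avoids.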

In addition, the standard library provides other tools such as the bisect module, which has functions for manipulating sorted lists:

>>> import bisect
>>> scores = [(100, 'perl'), (200, 'tcl'), (400, 'python')]
>>> bisect.insort(scores, (300, 'ruby'))
>>> scores
[(100, 'perl'), (200, 'tcl'), (300, 'ruby'), (400, 'python')]
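Besides inserting, bisect can also be used for lookups in a sorted list. The following sketch, written for Python 3, maps a score to a letter grade; the cut-off values and the grade function are assumptions chosen for the example.

```python
import bisect

breakpoints = [60, 70, 80, 90]    # hypothetical cut-offs; must stay sorted
grades = 'FDCBA'

def grade(score):
    # bisect() returns the insertion point to the right of equal entries,
    # so a score of exactly 70 falls into the 70-79 band.
    return grades[bisect.bisect(breakpoints, score)]

result = [grade(s) for s in [33, 99, 77, 70, 89, 90, 100]]
print(result)
```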

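The heapq module provides functions for implementing heaps on top of regular lists. The lowest-valued entry is always kept at position zero. This is useful for applications that repeatedly access the smallest element but do not need a full sort of the list.

>>> from heapq import heapify, heappop, heappush
>>> data = [1,3,4,6,2,7]
>>> heapify(data)
>>> heappush(data, -5)
>>> [heappop(data) for i in range(3)]
[-5, 1, 2]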

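8 Decimal Floating Point Arithmetic

The decimal module offers a Decimal datatype for decimal floating point arithmetic. Compared to the built-in float, the class is especially helpful for:

financial applications and other uses that require exact decimal representation
control over precision
control over rounding to meet legal or regulatory requirements
tracking of significant decimal places
applications where the user expects the results to match calculations done by hand

For example, the calculations below give different results in decimal floating point and in binary floating point: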

>>> from decimal import *
>>> x = Decimal('0.70') * Decimal('1.05')
>>> x
Decimal('0.7350')
>>> x.quantize(Decimal('0.01'))
Decimal('0.74')
>>> 0.7 * 1.05
0.735
>>> round(.70 * 1.05, 2)
0.73
>>> 1.999999999+0.0000000000000000000000001
1.999999999
>>> Decimal('1.999999999') + Decimal('0.0000000000000000000000001')
Decimal('1.9999999990000000000000001')
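Two of the points listed above, exact representation and control over precision, can be sketched briefly. The code is written for Python 3, and the variable names are chosen for illustration only.

```python
from decimal import Decimal, getcontext

# Exact representation lets modulo and equality behave as in hand arithmetic.
dec_mod = Decimal('1.00') % Decimal('.10')
float_mod = 1.00 % 0.10                      # leaves a small nonzero remainder

dec_sum_ok = sum([Decimal('0.1')] * 10) == Decimal('1.0')
float_sum_ok = sum([0.1] * 10) == 1.0        # False because of binary rounding

# Precision is set on the context, in significant digits.
getcontext().prec = 36
seventh = Decimal(1) / Decimal(7)

print(dec_mod, dec_sum_ok, float_sum_ok, seventh)
```

Note that changing getcontext().prec affects all later Decimal arithmetic in the same thread, so a real program would usually set it once at startup or use a local context.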

Original article: http://docs.python.org/2/tutorial/stdlib2.html