[python] Python Cookbook study notes

Source: Internet  Editor: 程序博客网  Time: 2024/06/05 04:15

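Strings

  1. Testing whether an object is string-like

    • isinstance(obj, basestring)

    • type(obj) == type('')

    • def isStringLike(obj):
          # an object is string-like if it can be concatenated with a string
          try:
              obj + ''
          except:
              return False
          else:
              return True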
  2. Aligning strings

    • string.ljust(width,fill)

    • string.rjust(width,fill)

    • string.center(width,fill)

  3. Stripping whitespace from the ends of a string

    • string.lstrip()

    • string.rstrip()

    • string.strip()

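  4. Reversing a string

    • By character:
      rewords = words[::-1]

    • By word:
      rewords = words.split()
      rewords.reverse()
      rewords = ' '.join(rewords)

    • By word while keeping the original whitespace: split with re.split(pattern, astring) using a capturing pattern such as r'(\s+)', reverse the pieces, and join them with ''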
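The three reversal approaches above can be sketched as small functions. This is a Python 3 sketch; the function names are made up for illustration and are not from the notes.

```python
import re

def reverse_chars(text):
    # reverse character by character with an extended slice
    return text[::-1]

def reverse_words(text):
    # reverse word order; runs of whitespace collapse to single spaces
    return ' '.join(reversed(text.split()))

def reverse_words_keep_spacing(text):
    # the capturing group makes re.split keep the whitespace runs
    parts = re.split(r'(\s+)', text)
    return ''.join(reversed(parts))

print(reverse_words_keep_spacing('hello  big world'))
```

The capturing parentheses in the pattern are what preserve the separators; without them re.split would discard the whitespace.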

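  5. Filtering characters out of a string with string.maketrans and string.translate
    import string
    keep = 'abcd'
    all_chars = string.maketrans('', '')  # build an identity translate table
    del_chars = all_chars.translate(all_chars, keep)  # every character not in keep
    s.translate(all_chars, del_chars)  # s with everything outside keep removed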
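The recipe above relies on Python 2's string.maketrans. A rough Python 3 counterpart, assuming the data is a bytes object, might look like the following; keep_only is a hypothetical helper name.

```python
def keep_only(data, keep):
    # all 256 byte values, playing the role of the identity table
    all_bytes = bytes(range(256))
    # delete the bytes we want to keep, leaving the ones to remove
    delete = all_bytes.translate(None, keep)
    # passing None as the table means "no mapping, only deletion"
    return data.translate(None, delete)

print(keep_only(b'a1b2-c3 d4e5', b'abcd'))
```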

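  6. Deciding whether a string is binary or text

    from __future__ import division  # must precede the other imports
    import string

    text_chars = ''.join(map(chr, range(32, 127)))
    null_trans = string.maketrans('', '')

    def isText(s, text_chars=text_chars, threshold=0.3):
        # a NUL byte means the string is binary
        if '\0' in s:
            return False
        # an empty string counts as text
        if not s:
            return True
        # the characters of s that are not printable text
        not_text_chars = s.translate(null_trans, text_chars)
        return len(not_text_chars) / len(s) <= threshold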
  7. String methods: upper() lower() title() capitalize() isupper() islower() istitle()

  8. Accessing substrings (splitting a string into fields)

    • struct.unpack('3s 5s 4s', 'asdqwertzxcv')  # returns ('asd', 'qwert', 'zxcv')

    • def cut(L, cuts):
          # print consecutive slices of L whose lengths are given in cuts
          start = 0
          for x in cuts:
              end = start + x
              if end <= len(L):
                  print L[start:end]
                  start = end
              elif start < len(L):
                  # the last field is shorter than requested
                  print L[start:]
                  break
              else:
                  break
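  9. Re-indenting the lines of a string

    def reindent(s, numSpace):
        # replace each line's surrounding whitespace with numSpace leading spaces
        leading_space = numSpace * ' '
        lines = [leading_space + line.strip() for line in s.splitlines()]
        return '\n'.join(lines)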
  10. Substituting substrings in a string
    '%(name)s aegfwg' % {'name': 'replace'}

    import string
    t = string.Template('$name agg')
    t.substitute({'name': 'replace'})
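As a quick illustration of the two substitution styles in item 10, the sketch below (Python 3, with a made-up $missing placeholder) also shows the difference between substitute and safe_substitute when a key is absent.

```python
import string

old_style = '%(name)s aegfwg' % {'name': 'replace'}

template = string.Template('$name agg, $missing stays')
# safe_substitute leaves unknown placeholders untouched
filled = template.safe_substitute({'name': 'replace'})

# substitute is strict and raises KeyError for a missing key
try:
    template.substitute({'name': 'replace'})
    strict_failed = False
except KeyError:
    strict_failed = True

print(old_style)
print(filled)
```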

Files

  1. Getting the contents of a specific line in a file

    for line_num, line in enumerate(open(thefilepath, 'r'), 1):
        if line_num == the_desire_line_num:
            res = line
            break  # no need to read the rest of the file
  2. Counting the lines of a text file

    • len(open(thefilepath, 'r').readlines())

    • count = -1
      for count, line in enumerate(open(thefilepath, 'r')):
          pass
      count += 1
  3. Using tempfile and zipfile: importing a module from a zip file

    import os, sys, tempfile, zipfile

    fd, filename = tempfile.mkstemp(suffix='.zip')
    os.close(fd)
    zf = zipfile.ZipFile(filename, 'w')
    zf.writestr('hello.py', 'def func():return "Hello world from " + __file__\n')
    zf.close()
    sys.path.insert(0, filename)  # a zip archive on sys.path is importable
    import hello
    print hello.func()
  4. Writing binary data to standard output on Windows
    Python normally opens sys.stdout in text mode; to write binary data, switch it to binary mode with the msvcrt module.

    import sys
    if sys.platform == 'win32':
        import msvcrt, os
        msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)
  5. Using C++-style iostream syntax

    class IOStream(object):
        def __init__(self, output=None):
            if output is None:
                import sys
                output = sys.stdout
            self.output = output
            self.format = '%s'

        def __lshift__(self, thing):
            if isinstance(thing, IOManipulator):
                thing.do(self)
            else:
                self.output.write(self.format % thing)
                self.format = '%s'
            return self

    class IOManipulator(object):
        def __init__(self, callback_func):
            self.func = callback_func

        def do(self, stream):
            self.func(stream)

    # handles the end of a line
    def do_endl(stream):
        stream.output.write('\n')
        stream.output.flush()
    endl = IOManipulator(do_endl)

    # handles formatting: %x is hexadecimal
    def format_hex(stream):
        stream.format = '%x'
    fhex = IOManipulator(format_hex)

    # ---- test ----
    def test():
        cout = IOStream()
        cout << 'like C++\'s class IOStream' << endl

    if __name__ == '__main__':
        test()
  6. Given two paths, computing the relative path that leads from path 1 to path 2

    For example, given /a/b/c/d and /a/b/e/f/g, the result is ../../e/f/g

    import os

    def all_equal(elements):
        # True if all elements are the same
        return len(set(elements)) == 1

    def common_prefix(*sequences):
        # returns (common leading items, [the remainder of each sequence])
        common = []
        if not sequences:
            return [], []
        for elements in zip(*sequences):
            if not all_equal(elements):
                break
            common.append(elements[0])
        return common, [sequence[len(common):] for sequence in sequences]

    def relpath(path1, path2, sep=os.path.sep, pardir=os.path.pardir):
        common, [u1, u2] = common_prefix(path1.split(sep), path2.split(sep))
        if not common:
            return path2
        # climb out of what is left of path1, then descend into path2
        return sep.join([pardir] * len(u1) + u2)

    def test(path1, path2, sep=os.path.sep):
        print 'from', '<', path1, '>', 'to', '<', path2, '>', '==>', relpath(path1, path2, sep)

    if __name__ == '__main__':
        test('/root/etc/python27/read.txt', '/root/etc/mysql/config.cfg', '/')
        test('/home/lxy/a/b/c.txt', '/home/lxy/r/n/g.txt', '/')
        test(r'', r'C:\MinGW\include\_mingw.h')
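For comparison, the standard library already offers a relpath function. The sketch below uses posixpath.relpath so that the '/'-separated example behaves the same on any platform; note that its argument order (target first, start second) is the reverse of the relpath in these notes.

```python
import posixpath

# target path first, then the directory to start from
rel = posixpath.relpath('/a/b/e/f/g', start='/a/b/c/d')
print(rel)
```

This is the library alternative rather than the recipe's own method; os.path.relpath is the same function for the current platform's path conventions.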
  7. File versioning: making a backup copy of a file before editing it

    def versionFile(file_spec, vtype='copy'):
        import os, shutil
        if os.path.isfile(file_spec):
            if vtype not in ('copy', 'rename'):
                raise ValueError('Unknown vtype %r' % vtype)
            root, ext = os.path.splitext(file_spec)
            # if the extension is already a 3-digit version, continue after it
            if len(ext) == 4 and ext[1:].isdigit():
                version_num = int(ext[1:]) + 1
            else:
                version_num = 0
            # look for the first unused version number
            for i in xrange(version_num, 100):
                new_file = '%s.%03d' % (root, i)
                if not os.path.exists(new_file):
                    if vtype == 'copy':
                        shutil.copy(file_spec, new_file)
                    else:
                        os.rename(file_spec, new_file)
                    print '%s successful' % vtype
                    return True
            raise RuntimeError('Can\'t %s %r, all names taken' % (vtype, file_spec))
        else:
            print '%s is not a file' % file_spec
            return False

    if __name__ == '__main__':
        versionFile('test', 'rename')

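Time

Overview of the time module

GMT is Greenwich Mean Time, which these notes treat as the same thing as UTC (Coordinated Universal Time).

  • time.gmtime([sec]) -> sec is the number of seconds since the epoch (1970/1/1 0:0:0) and defaults to the current time. Returns the UTC time tuple (tm_year, tm_mon, tm_mday, tm_hour, tm_min,
    tm_sec, tm_wday, tm_yday, tm_isdst).

  • time.localtime([sec]) -> like gmtime(), but the result is converted to the local time zone.

  • time.asctime([tuple]) -> tuple is a time tuple; returns the time as a string. If tuple is omitted, the tuple returned by time.localtime() is used.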

  • time.strftime(format[, tuple]) -> string
    Converts a time tuple to a formatted string; tuple defaults to the tuple returned by localtime().
    format directives:

    Directive  Meaning                                                          Notes
    %a         Locale's abbreviated weekday name.
    %A         Locale's full weekday name.
    %b         Locale's abbreviated month name.
    %B         Locale's full month name.
    %c         Locale's appropriate date and time representation.
    %d         Day of the month as a decimal number [01,31].
    %H         Hour (24-hour clock) as a decimal number [00,23].
    %I         Hour (12-hour clock) as a decimal number [01,12].
    %j         Day of the year as a decimal number [001,366].
    %m         Month as a decimal number [01,12].
    %M         Minute as a decimal number [00,59].
    %p         Locale's equivalent of either AM or PM.                          (1)
    %S         Second as a decimal number [00,61].                              (2)
    %U         Week number of the year (Sunday as the first day of the week)    (3)
               as a decimal number [00,53]. All days in a new year preceding
               the first Sunday are considered to be in week 0.
    %w         Weekday as a decimal number [0(Sunday),6].
    %W         Week number of the year (Monday as the first day of the week)    (3)
               as a decimal number [00,53]. All days in a new year preceding
               the first Monday are considered to be in week 0.
    %x         Locale's appropriate date representation.
    %X         Locale's appropriate time representation.
    %y         Year without century as a decimal number [00,99].
    %Y         Year with century as a decimal number.
    %Z         Time zone name (no characters if no time zone exists).
    %%         A literal '%' character.

  • time.strptime(string[, format])
    Parses a string that was formatted according to format back into a time tuple.
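A short round trip through the functions listed above may make the roles of strftime and strptime clearer. This is a Python 3 sketch; the format strings are only examples.

```python
import time

# the epoch itself, as a UTC time tuple
epoch = time.gmtime(0)
stamp = time.strftime('%Y-%m-%d %H:%M:%S', epoch)

# parse a formatted string back into a time tuple
parsed = time.strptime('2015-04-18 12:57:40', '%Y-%m-%d %H:%M:%S')

# and format it again with a different layout
round_trip = time.strftime('%Y/%m/%d', parsed)

print(stamp)
print(parsed.tm_year, parsed.tm_mon, parsed.tm_mday, parsed.tm_wday)
print(round_trip)
```

Here tm_wday counts from Monday as 0, unlike the %w directive, which counts from Sunday as 0.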

The datetime module

Some simple examples

>>> import datetime
>>> today = datetime.date.today()
>>> today
datetime.date(2015, 4, 18)
>>> today + datetime.timedelta(days=1)
datetime.date(2015, 4, 19)
>>> print today + datetime.timedelta(days=1)
2015-04-19
>>> today = datetime.datetime.today()
>>> today
datetime.datetime(2015, 4, 18, 12, 57, 40, 115000)
  1. Finding the most recent Friday
>>> import datetime, calendar
>>> lastFriday = datetime.date.today()
>>> oneday = datetime.timedelta(days=1)
>>> while lastFriday.weekday() != calendar.FRIDAY:
...     lastFriday -= oneday
...
>>> print lastFriday
2015-04-17
>>> print lastFriday.strftime('%A-%Y/%m/%d')
Friday-2015/04/17
----------------- Second approach --------------------------
>>> today = datetime.date(2014, 3, 5)
>>> this_weekday = today.weekday()
>>> this_weekday
2
>>> delta_weekday = (this_weekday - calendar.FRIDAY) % 7
>>> last_friday = today - datetime.timedelta(days=delta_weekday)
>>> print last_friday
2014-02-28
  2. Computing the total playing time of a list of songs
import datetime

def totalTimes(times):
    # times is a sequence of (minutes, seconds) pairs
    td = datetime.timedelta(0)
    duration = sum((datetime.timedelta(minutes=m, seconds=s) for m, s in times), td)
    return duration
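The modular-arithmetic approach to finding the most recent Friday can be wrapped in a function. This is a Python 3 sketch; last_friday is a hypothetical name, and like both approaches in the notes it returns the given day itself when that day is a Friday.

```python
import calendar
import datetime

def last_friday(today):
    # days elapsed since the most recent Friday (0 if today is Friday)
    delta = (today.weekday() - calendar.FRIDAY) % 7
    return today - datetime.timedelta(days=delta)

print(last_friday(datetime.date(2014, 3, 5)))
```

Taking the difference modulo 7 avoids the day-by-day loop and handles weekdays before Friday (negative differences) without a special case.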

The decimal module

decimal.Decimal(string or int) -> returns a Decimal object

Some examples:

>>> from decimal import *
>>> setcontext(ExtendedContext)
>>> Decimal(0)
Decimal('0')
>>> Decimal('1')
Decimal('1')
>>> Decimal('-.0123')
Decimal('-0.0123')
>>> Decimal(123456)
Decimal('123456')
>>> Decimal('123.45e12345678901234567890')
Decimal('1.2345E+12345678901234567892')
>>> Decimal('1.33') + Decimal('1.27')
Decimal('2.60')
>>> Decimal('12.34') + Decimal('3.87') - Decimal('18.41')
Decimal('-2.20')
>>> dig = Decimal(1)
>>> print dig / Decimal(3)
0.333333333
>>> getcontext().prec = 18
>>> print dig / Decimal(3)
0.333333333333333333
>>> print dig.sqrt()
1
>>> print Decimal(3).sqrt()
1.73205080756887729
>>> print Decimal(3) ** 123
4.85192780976896427E+58
>>> inf = Decimal(1) / Decimal(0)
>>> print inf
Infinity
>>> neginf = Decimal(-1) / Decimal(0)
>>> print neginf
-Infinity
>>> print neginf + inf
NaN
>>> print neginf * inf
-Infinity
>>> print dig / 0
Infinity
>>> getcontext().traps[DivisionByZero] = 1
>>> print dig / 0
Traceback (most recent call last):
  ...
  ...
  ...
DivisionByZero: x / 0
>>> c = Context()
>>> c.traps[InvalidOperation] = 0
>>> print c.flags[InvalidOperation]
0
>>> c.divide(Decimal(0), Decimal(0))
Decimal('NaN')
>>> c.traps[InvalidOperation] = 1
>>> print c.flags[InvalidOperation]
1
>>> c.flags[InvalidOperation] = 0
>>> print c.flags[InvalidOperation]
0
>>> print c.divide(Decimal(0), Decimal(0))
Traceback (most recent call last):
  ...
  ...
  ...
InvalidOperation: 0 / 0
>>> print c.flags[InvalidOperation]
1
>>> c.flags[InvalidOperation] = 0
>>> c.traps[InvalidOperation] = 0
>>> print c.divide(Decimal(0), Decimal(0))
NaN
>>> print c.flags[InvalidOperation]
1
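To see why Decimal is worth the trouble, the sketch below (Python 3) contrasts it with binary floats and uses localcontext to change the precision temporarily instead of modifying the global context; the variable names are illustrative only.

```python
from decimal import Decimal, localcontext

# binary floats cannot represent 0.1 or 0.2 exactly
float_sum = 0.1 + 0.2

# Decimals built from strings are exact
decimal_sum = Decimal('0.1') + Decimal('0.2')

# the precision applies only inside the with block
with localcontext() as ctx:
    ctx.prec = 18
    third = Decimal(1) / Decimal(3)

print(float_sum == 0.3, decimal_sum == Decimal('0.3'))
print(third)
```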

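Python tips

1. Copying objects

In Python, assigning to a variable binds the name to a value (object) by reference. Modifying the object through one variable therefore affects every other variable that refers to the same object. For example:

>>> a = [1, 2, 3, 4]
>>> b = a
>>> b[2] = 4
>>> a,b
([1, 2, 4, 4], [1, 2, 4, 4])

a and b refer to the same list, so a change made through b is also visible through a.
To copy a value (object) instead, consider the copy module.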

  • copy.copy() shallow copy
  • copy.deepcopy() deep copy

Example:

>>> a = [[1,2,3],[12,32],[21,32,45]]
>>> import copy
>>> b = copy.copy(a)
>>> b
[[1, 2, 3], [12, 32], [21, 32, 45]]
>>> b[0] = [2]
>>> a,b
([[1, 2, 3], [12, 32], [21, 32, 45]], [[2], [12, 32], [21, 32, 45]])
>>> b[1][0] = 23
>>> a,b
([[1, 2, 3], [23, 32], [21, 32, 45]], [[2], [23, 32], [21, 32, 45]])

Note that a is also affected after b[1][0] = 23 runs. This is because copy.copy() only makes a shallow copy: the nested values are still shared references. To copy a value (object) together with everything nested inside it, use copy.deepcopy().
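The difference between the two kinds of copy can be checked directly. A minimal Python 3 sketch using the same nested list as the example:

```python
import copy

a = [[1, 2, 3], [12, 32], [21, 32, 45]]
shallow = copy.copy(a)
deep = copy.deepcopy(a)

# the shallow copy shares its inner lists with a
shallow[1][0] = 23

# the deep copy has its own inner lists
deep[2][0] = 99

print(a)
print(deep)
```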

Other ways to make a (shallow) copy:

  • list
>>> a = [1,2,3,4]
>>> b = list(a)
>>> a,b
([1, 2, 3, 4], [1, 2, 3, 4])
>>> a[0] = 2
>>> a,b
([2, 2, 3, 4], [1, 2, 3, 4])
  • dict
  • set
  • slicing: L = LL[:]

Caution

>>> l = [[]] * 10
>>> l
[[], [], [], [], [], [], [], [], [], []]
>>> l[0].append(2)
>>> l
[[2], [2], [2], [2], [2], [2], [2], [2], [2], [2]]

After l[0].append(2), every nested list contains 2. This happens because [[]] * 10 copies shallowly: all ten entries refer to the same inner list, so modifying any one of them affects all the others.
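The usual way around the shared-reference pitfall is to build each inner list separately with a list comprehension. A short Python 3 sketch:

```python
# three references to one and the same inner list
shared = [[]] * 3

# three distinct inner lists
independent = [[] for _ in range(3)]

shared[0].append(2)
independent[0].append(2)

print(shared)
print(independent)
```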

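2. Making good use of list comprehensions ([x for x in L]) and generator expressions ((x for x in L))

A list comprehension builds all of its values at once, whereas a generator expression produces one value at a time and so saves memory.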
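A small comparison illustrates the trade-off. In this Python 3 sketch, sys.getsizeof is used only as a rough indication of the container's own size, and the range of 1000 items is arbitrary.

```python
import sys

squares_list = [x * x for x in range(1000)]   # all values stored at once
squares_gen = (x * x for x in range(1000))    # values produced on demand

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)

total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)

# a generator can be consumed only once
total_again = sum(squares_gen)

print(list_size > gen_size, total_from_list == total_from_gen, total_again)
```

The one-shot nature of generators is the price of the memory saving: use a list when the values are needed more than once.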

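3. Returning L[index], or a given default value if index is out of range

def list_get(L, index, v=None):
    if -len(L) <= index < len(L):
        return L[index]
    else:
        return v

# or
def list_get(L, index, v=None):
    try:
        return L[index]
    except IndexError:
        return v

The first version is generally the faster of the two when out-of-range indices are common, because it avoids raising and catching an exception.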

4. Looping over the items of a sequence together with their indices

# recommended
for index, item in enumerate(seq):
    process(index)
    process(item)

# rather than
for i in range(len(seq)):
    process(seq[i])

5. Flattening a nested sequence

# recursive version
def list_or_tuple(seq):
    return isinstance(seq, (list, tuple))

# an alternative predicate: any iterable that is not a string
def nonstring_iterable(seq):
    try:
        iter(seq)
    except TypeError:
        return False
    else:
        return not isinstance(seq, basestring)

def flatten(seq, to_expand=list_or_tuple):
    for item in seq:
        if to_expand(item):
            # pass the caller's predicate down to the nested levels
            for subitem in flatten(item, to_expand=to_expand):
                yield subitem
        else:
            yield item

# non-recursive version
def flatten(seq, to_expand=list_or_tuple):
    iterators = [iter(seq)]  # an iterator remembers how far the traversal has got
    while iterators:
        for item in iterators[-1]:
            if to_expand(item):
                # descend into the nested sequence
                iterators.append(iter(item))
                break
            else:
                yield item
        else:
            # the innermost iterator is exhausted
            iterators.pop()
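For readers on Python 3, the recursive version can be written a little more compactly with yield from. This is a sketch along the same lines, not the notes' original code; the default predicate is inlined as a lambda.

```python
def flatten(seq, to_expand=lambda item: isinstance(item, (list, tuple))):
    for item in seq:
        if to_expand(item):
            # delegate to the nested generator, keeping the same predicate
            yield from flatten(item, to_expand)
        else:
            yield item

print(list(flatten([1, [2, (3, [4])], 'ab', []])))
```

Strings are not expanded because the predicate only accepts lists and tuples; a broader predicate would need to exclude strings explicitly to avoid infinite recursion on single characters.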

6. Turning the columns of a two-dimensional list into rows

# list comprehension
[[row[col] for row in matrix] for col in range(len(matrix[0]))]

# map and zip
map(list, zip(*matrix))

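7. Dictionary methods

  • get(key, val=None)
    Returns D[key] if key is present; otherwise returns val.
  • setdefault(key, val)
    Returns D[key] if key is present; otherwise sets D[key] = val and then returns D[key].
  • Creating a new dictionary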

    • dict(**kwargs)

    • dict(zip(key_seq, val_seq))

    • dict.fromkeys(S[,v]) -> New dict with keys from S and values equal to v. v defaults to None.
  • update(…)
    D.update([E, ]**F) -> None. Update D from dict/iterable E and F.
    If E present and has a .keys() method, does: for k in E: D[k] = E[k]
    If E present and lacks .keys() method, does: for (k, v) in E: D[k] = v
    In either case, this is followed by: for k in F: D[k] = F[k]
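The dictionary methods above are easiest to tell apart in use. A Python 3 sketch with made-up data:

```python
# setdefault inserts the default only when the key is missing
index = {}
for word in ['apple', 'avocado', 'banana']:
    index.setdefault(word[0], []).append(word)

# get never modifies the dictionary
missing = index.get('c', [])

# fromkeys gives every key the same initial value
counts = dict.fromkeys('ab', 0)

# update accepts a mapping and keyword arguments together
counts.update({'a': 1}, b=2)

print(index)
print(counts)
```

Note that dict.fromkeys shares one value object among all keys, so a mutable default such as a list would be shared in the same way as in the [[]] * 10 example.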

8. Building a dictionary from a list whose items alternate between keys and values

# first approach
dict(zip(keys_vals[::2], keys_vals[1::2]))

# second approach
def pairwise(keys_vals):
    next = iter(keys_vals).next
    while True:
        yield next(), next()

def dictFromSeq(seq):
    return dict(pairwise(seq))
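The second approach depends on Python 2's iterator .next method. In Python 3 a similar effect can be had by zipping one iterator with itself; this is a different, commonly used technique rather than a direct port, and dict_from_seq is a hypothetical name.

```python
def dict_from_seq(seq):
    it = iter(seq)
    # zip pulls a key and then a value from the same iterator
    return dict(zip(it, it))

print(dict_from_seq(['a', 1, 'b', 2]))
```

With an odd number of items the trailing key is silently dropped, since zip stops as soon as the iterator runs out.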

Searching and sorting

A classic sorting idiom in Python is DSU (decorate-sort-undecorate)

1. Sorting a dictionary

key_vals = [(k, v) for k, v in adict.items()]  # decorate
key_vals.sort()  # sort
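A complete decorate-sort-undecorate cycle, here sorting keys by their values, might look like the following Python 3 sketch (the sample dictionary is made up). The key= argument of sorted achieves the same result without the explicit decoration.

```python
adict = {'b': 2, 'c': 1, 'a': 3}

# decorate: put the sort criterion first
decorated = [(value, key) for key, value in adict.items()]
# sort: tuples compare item by item
decorated.sort()
# undecorate: keep only the original items
keys_by_value = [key for value, key in decorated]

# the same ordering using a key function
same_with_key = sorted(adict, key=adict.get)

print(keys_by_value)
```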

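2. The cmp built-in function

cmp can compare two single values (int, float) as well as two sequences. For sequences the comparison works roughly like this:

def cmp_seq(a, b):
    i = 0
    # compare item by item until a difference is found
    while i < len(a) and i < len(b):
        res = cmp(a[i], b[i])
        if res:
            return res
        i += 1
    # all shared positions are equal, so the shorter sequence is smaller
    return cmp(len(a), len(b))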

3. The heapq module

Priority queues
heapq.heappop()
heapq.heappush()
Queue.PriorityQueue can be used instead
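A minimal priority queue built from the two heapq functions above, as a Python 3 sketch; the (priority, task) tuples are an illustrative convention, with lower numbers popped first.

```python
import heapq

queue = []
heapq.heappush(queue, (2, 'write tests'))
heapq.heappush(queue, (1, 'fix bug'))
heapq.heappush(queue, (3, 'refactor'))

# heappop always returns the smallest remaining item
order = []
while queue:
    priority, task = heapq.heappop(queue)
    order.append(task)

print(order)
```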

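4. Getting the smallest few elements of a list

  1. The brute-force way

    alist.sort()
    alist[:n]
  2. Using the heapq module

    • If n is known in advance, use nsmallest directly:
      heapq.nsmallest(n, alist)
    • If n is not known in advance, yield the elements lazily in ascending order:
    def isorted(data):
        data = list(data)  # converts data to a list and also takes a copy of it
        heapq.heapify(data)  # turn data into a heap
        while data:
            yield heapq.heappop(data)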
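Both heapq approaches can be tried side by side. In this Python 3 sketch, itertools.islice stands in for a caller that decides on the fly how many elements it wants; the sample values are arbitrary.

```python
import heapq
from itertools import islice

def isorted(data):
    heap = list(data)      # work on a copy so the input is untouched
    heapq.heapify(heap)
    while heap:
        yield heapq.heappop(heap)

values = [5, 1, 4, 2, 3]
smallest_three = heapq.nsmallest(3, values)
first_two = list(islice(isorted(values), 2))

print(smallest_three, first_two, values)
```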

5. Binary search

The bisect module
bisect.bisect(a, x[, lo[, hi]]) -> index

Return the index where to insert item x in list a, assuming a is sorted.
The return value i is such that all e in a[:i] have e <= x, and all e in a[i:] have e > x. So if x already appears in the list, i points just beyond the rightmost x already there.
See the Python documentation for details.
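The "just beyond the rightmost x" behaviour is easiest to see next to bisect_left, which points at the leftmost x instead. A Python 3 sketch with made-up scores:

```python
import bisect

scores = [10, 20, 20, 30]

right = bisect.bisect(scores, 20)       # after the last 20
left = bisect.bisect_left(scores, 20)   # before the first 20

# insort inserts while keeping the list sorted
bisect.insort(scores, 25)

print(left, right, scores)
```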

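6. Getting the n-th smallest element of a list

When the list is short and its elements are cheap to compare (int, float, string), sorting first and then indexing works well:

data.sort()
data[n]

When the list is very large and comparing elements is expensive, sorting first and then indexing is inefficient. A selection algorithm avoids the full sort:

import random

def select(data, n):
    # returns the n-th smallest element (0-based) of data
    data = list(data)
    while True:
        pivot = random.choice(data)
        # the partitions must start empty on every pass
        pivot_count = 0
        under = []
        over = []
        uappend = under.append  # optimization: look the methods up only once
        oappend = over.append
        for item in data:
            if item < pivot:
                uappend(item)
            elif item > pivot:
                oappend(item)
            else:
                pivot_count += 1
        if n < len(under):
            data = under
        elif n < len(under) + pivot_count:
            return pivot
        else:
            data = over
            n -= len(under) + pivot_count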

7. Finding substrings

def finditer(text, pattern):
    # yield the index of every (possibly overlapping) occurrence of pattern
    pos = -1
    while True:
        pos = text.find(pattern, pos + 1)
        if pos < 0:
            break
        yield pos
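A short usage example for the generator above, written for Python 3 (the function body is repeated so that the block runs on its own). Because the search restarts one character after each hit, overlapping matches are reported too.

```python
def finditer(text, pattern):
    pos = -1
    while True:
        # continue one character past the previous match
        pos = text.find(pattern, pos + 1)
        if pos < 0:
            break
        yield pos

plain = list(finditer('abcabcabc', 'abc'))
overlapping = list(finditer('aaaa', 'aa'))

print(plain, overlapping)
```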
