day5: Common Modules

Source: Internet  Editor: 程序博客网  Time: 2024/05/26 20:20

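Modules

Definition: a module organizes Python code logically (variables, functions, classes, and the logic that implements a feature). In essence it is a Python file ending in .py (file name: test.py, module name: test).
Definition of a "package": a package organizes modules logically. In essence it is a directory (a regular package must contain an __init__.py file).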

Ways to import a module

import module_name
import module1_name, module2_name
from module_alex import *
from module_alex import m1, m2, m3
from module_alex import logger as logger_alex

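What importing does
Importing a module means running that Python file once, from top to bottom.
Importing a package means running the __init__.py file inside that package.
Kinds of modules:
1. Standard library
2. Open-source (third-party) modules
3. Your own modules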
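
The claim that importing a module runs its file can be checked with a small experiment. The sketch below is only an illustration: the module name demo_mod and the temporary directory are made up for the demo and do not come from the original article.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Write a throwaway module (hypothetical name: demo_mod) to a temp directory.
tmp_dir = tempfile.mkdtemp()
Path(tmp_dir, "demo_mod.py").write_text(
    "print('demo_mod is being executed')\n"
    "VALUE = 42\n"
)

# Make the directory importable, then import the module by name.
sys.path.insert(0, tmp_dir)
demo_mod = importlib.import_module("demo_mod")  # the file's print() runs here

# A second import reuses the cached module object; the file is not run again.
again = importlib.import_module("demo_mod")

print(demo_mod.VALUE)     # 42
print(again is demo_mod)  # True
```

The message appears only once because Python caches imported modules in sys.modules.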

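The time module

Timestamp: generally, a timestamp is the offset in seconds from 00:00:00 on January 1, 1970 (UTC). Running "type(time.time())" returns float.
Format string: a formatted time string.
struct_time: a struct_time tuple has nine elements: (year, month, day, hour, minute, second, day of the week, day of the year, DST flag)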

time.time returns the current time as a timestamp (floating-point seconds since the 1970 epoch)

import time
print(time.time())
# Output: 1507362268.5010645

time.localtime returns the local time as a struct_time object

x = time.localtime()
print(x)
# Output: time.struct_time(tm_year=2017, tm_mon=10, tm_mday=7, tm_hour=15, tm_min=49, tm_sec=4, tm_wday=5, tm_yday=280, tm_isdst=0)

time.gmtime returns the current UTC time as a struct_time object

x = time.gmtime()
print(x)
# Output: time.struct_time(tm_year=2017, tm_mon=10, tm_mday=7, tm_hour=7, tm_min=51, tm_sec=37, tm_wday=5, tm_yday=280, tm_isdst=0)

time.asctime returns the time as a readable string

print(time.asctime())
# Output: Sat Oct  7 15:54:05 2017

time.strptime parses a formatted time string into a struct_time object

print(time.strptime("2017-10-10 18:18", "%Y-%m-%d %H:%M"))
# Output: time.struct_time(tm_year=2017, tm_mon=10, tm_mday=10, tm_hour=18, tm_min=18, tm_sec=0, tm_wday=1, tm_yday=283, tm_isdst=-1)

time.mktime converts a struct_time object (read as local time) into a timestamp

x = time.strptime("2017-10-10 18:18", "%Y-%m-%d %H:%M")
print(time.mktime(x))
# Output (in a UTC+8 time zone): 1507630680.0

time.strftime formats a time object as a string (the current local time is used when no time object is passed)

print(time.strftime("%Y-%m-%d %H:%M.log"))
# Output: 2017-10-07 16:03.log

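Converting between time formats

(Figure omitted: diagram of the conversions between timestamp, struct_time, and formatted string.)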
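
The conversions between the three representations can be sketched as one round trip using only the functions described above. The format string and the starting timestamp are arbitrary choices for illustration.

```python
import time

fmt = "%Y-%m-%d %H:%M:%S"

ts = 1507630680                      # timestamp (seconds since the epoch)
st = time.localtime(ts)              # timestamp   -> struct_time (local time)
text = time.strftime(fmt, st)        # struct_time -> formatted string

st_back = time.strptime(text, fmt)   # formatted string -> struct_time
ts_back = time.mktime(st_back)       # struct_time      -> timestamp

print(text)           # depends on the local time zone
print(ts_back == ts)  # expected True away from daylight-saving transitions
```

Going through time.gmtime instead of time.localtime would not round-trip with time.mktime, because mktime always interprets its argument as local time.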

The random module

random.random() returns a random float in the range [0.0, 1.0)

>>> random.random()
0.44309536784055825
>>> random.random()
0.9474594527425027

random.randint(a, b) returns a random integer between a and b, including both a and b

>>> random.randint(1,5)
1
>>> random.randint(1,5)
2
>>> random.randint(1,5)
5
>>> random.randint(1,5)
5

random.randrange(1, 5) returns an integer greater than or equal to 1 and less than 5

>>> random.randrange(1,5)
1
>>> random.randrange(1,5)
4

random.choice(seq) returns one random element of a sequence; here 1, '23', or [4, 5]

>>> print(random.choice([1,'23',[4,5]]))
[4, 5]
>>> print(random.choice([1,'23',[4,5]]))
1
>>> print(random.choice([1,'23',[4,5]]))
23

random.sample(a, b) picks b distinct values at random from a and returns them as a list

>>> random.sample(range(10),3)
[5, 3, 7]
>>> random.sample(range(10),3)
[7, 8, 6]
>>> random.sample(range(10),3)
[4, 7, 5]

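Generating a random verification code

#!/usr/bin/env python
# -*- coding:utf-8 -*-
# Author: liyanan
import random

checkcode = ''
for i in range(4):
    current = random.randrange(0, 4)
    if current == i:
        # Roughly one time in four: an uppercase letter (ASCII 65-90)
        tmp = chr(random.randint(65, 90))
    else:
        # Otherwise: a single digit
        tmp = random.randint(0, 9)
    checkcode += str(tmp)
print(checkcode)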
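
The same kind of code can also be produced by drawing characters directly from a pool. This is a different technique from the loop above, shown only as a sketch; the helper name make_code and the alphabet variable are invented for the example.

```python
import random
import string

# Pool of candidate characters: uppercase letters plus digits.
alphabet = string.ascii_uppercase + string.digits

def make_code(length=4):
    """Return a random code of the given length drawn from alphabet."""
    return "".join(random.choices(alphabet, k=length))

code = make_code()
print(code)
print(make_code(6))
```

Unlike the loop above, every position here has the same chance of being a letter or a digit. The random module is not meant for security-sensitive tokens; the standard library's secrets module exists for that purpose.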

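The os module

The os module is an interface for interacting with the operating system.

os.getcwd()  Get the current working directory, i.e. the directory the Python script is working in
os.chdir("dirname")  Change the current working directory; like cd in a shell
os.curdir  The current-directory string: '.'
os.pardir  The parent-directory string: '..'
os.makedirs('dirname1/dirname2')  Create nested directories recursively
os.removedirs('dirname1')  Remove the directory if it is empty, then move up to the parent and remove it too if it is empty, and so on
os.mkdir('dirname')  Create a single directory; like mkdir dirname in a shell
os.rmdir('dirname')  Remove a single empty directory; raises an error if the directory is not empty; like rmdir dirname in a shell
os.listdir('dirname')  Return a list of all files and subdirectories in the directory, including hidden files
os.remove()  Delete a file
os.rename("oldname","newname")  Rename a file or directory
os.stat('path/filename')  Get information about a file or directory
os.sep  The OS-specific path separator: "\\" on Windows, "/" on Linux
os.linesep  The line terminator of the current platform: "\r\n" on Windows, "\n" on Linux
os.pathsep  The string that separates entries in a list of paths: ; on Windows, : on Linux
os.name  A string naming the current platform: 'nt' on Windows, 'posix' on Linux
os.system("bash command")  Run a shell command; its output goes straight to the screen
os.environ  The system environment variables
os.path.abspath(path)  Return the normalized absolute version of path
os.path.split(path)  Split path into a (directory, file name) tuple
os.path.dirname(path)  Return the directory part of path, i.e. the first element of os.path.split(path)
os.path.basename(path)  Return the final file name in path, i.e. the second element of os.path.split(path). If path ends with / (or \ on Windows), the result is an empty string
os.path.exists(path)  Return True if path exists, False otherwise
os.path.isabs(path)  Return True if path is an absolute path
os.path.isfile(path)  Return True if path is an existing file, False otherwise
os.path.isdir(path)  Return True if path is an existing directory, False otherwise
os.path.join(path1[, path2[, ...]])  Join several path components; components before the last absolute path are discarded
os.path.getatime(path)  Return the last access time of the file or directory path points to
os.path.getmtime(path)  Return the last modification time of the file or directory path points to
os.path.getsize(path)  Return the size of path in bytes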
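
A few of the functions listed above can be tried safely inside a scratch directory. The sketch below is only an illustration; the file names notes.txt and keep.txt are invented for the demo.

```python
import os
import tempfile

base = tempfile.mkdtemp()                      # scratch directory
keep = os.path.join(base, "keep.txt")          # keeps base non-empty for now
open(keep, "w").close()

nested = os.path.join(base, "a", "b")          # build a path portably
os.makedirs(nested)                            # creates a and a/b in one call

file_path = os.path.join(nested, "notes.txt")
with open(file_path, "w", encoding="utf-8") as f:
    f.write("hello")

directory, filename = os.path.split(file_path)
size = os.path.getsize(file_path)
print(filename)                                 # notes.txt
print(os.path.dirname(file_path) == directory)  # True
print(os.path.isfile(file_path), os.path.isdir(nested))  # True True
print(size)                                     # 5
print(os.listdir(nested))                       # ['notes.txt']

os.remove(file_path)
os.removedirs(nested)     # removes b, then a; stops at base, which is not empty
a_exists = os.path.exists(os.path.join(base, "a"))
print(a_exists)           # False

os.remove(keep)
os.rmdir(base)            # base is empty now, so rmdir succeeds
```

The keep.txt file is there on purpose: without it, os.removedirs would continue upward and also remove the scratch directory itself.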

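The sys module

sys.argv  List of command-line arguments; the first element is the path of the program itself
sys.exit(n)  Exit the program; use exit(0) for a normal exit
sys.version  Version information of the Python interpreter
sys.maxint  The largest int value (Python 2 only; Python 3 provides sys.maxsize instead)
sys.path  The module search path; initialized from the PYTHONPATH environment variable, among other sources
sys.platform  The name of the operating system platform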
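
As a rough sketch of how these attributes are typically used together, the script below prints a few of them. The function name main is just a convention chosen for the example.

```python
import sys

def main(argv):
    """Print a few interpreter facts and return an exit status."""
    print("script:", argv[0])
    print("arguments:", argv[1:])
    print("version:", sys.version.split()[0])
    print("platform:", sys.platform)
    print("maxsize:", sys.maxsize)
    print("first search path entries:", sys.path[:2])
    return 0

status = main(sys.argv)
# In a real script the last line would usually be: sys.exit(main(sys.argv))
```

Returning the status from main and passing it to sys.exit keeps the function easy to call from other code without terminating the interpreter.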

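The shutil module

A module for high-level operations on files, directories, and archives.

shutil.copyfileobj(fsrc, fdst)
Copy the contents of one file object into another.

import shutil

with open("f_old", 'r', encoding="utf-8") as f1, \
        open("f_new", "w", encoding="utf-8") as f2:
    shutil.copyfileobj(f1, f2)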

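shutil.copyfile(src, dst)
Copy a file; the target file does not need to exist.

shutil.copyfile('f1.log', 'f2.log')

shutil.copymode(src, dst)
Copy only the permission bits. Content, group, and owner are left unchanged; the target file must exist.

shutil.copymode('f1.log', 'f2.log')

shutil.copystat(src, dst)
Copy only the status information, including mode bits, atime, mtime, and flags; the target file must exist.

shutil.copystat('f1.log', 'f2.log')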

shutil.copy(src, dst)
Copy a file's content together with its permission bits.

shutil.copy('f1.log', 'f2.log')

shutil.copy2(src, dst)
Copy a file's content together with its status information.

shutil.copy2('f1.log', 'f2.log')

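shutil.copytree(src, dst)
Recursively copy a directory tree, like cp -r. The target directory must not exist, and you need write permission on the parent directory of folder2. The ignore argument excludes names that match the given patterns.

shutil.copytree('folder1', 'folder2', ignore=shutil.ignore_patterns('*.pyc', 'tmp*'))

shutil.rmtree(path[, ignore_errors[, onerror]])
Recursively delete a directory tree, like rm -fr.

shutil.rmtree('folder1')

shutil.move(src, dst)
Recursively move a file or directory, like the mv command. Within one filesystem this is simply a rename.

shutil.move('folder1', 'folder3')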
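
The calls above can be tried end to end in a scratch directory without touching real files. This is only a sketch: the file names keep.log and skip.pyc are invented, while folder1, folder2, and folder3 follow the names used above.

```python
import os
import shutil
import tempfile

work = tempfile.mkdtemp()
src_dir = os.path.join(work, "folder1")
os.makedirs(src_dir)

# Two files: one to keep, one that the ignore pattern should skip.
for name in ("keep.log", "skip.pyc"):
    with open(os.path.join(src_dir, name), "w", encoding="utf-8") as f:
        f.write(name)

# Copy a single file (content plus permission bits).
shutil.copy(os.path.join(src_dir, "keep.log"), os.path.join(work, "copy.log"))

# Copy the tree, excluding *.pyc; folder2 must not exist beforehand.
dst_dir = os.path.join(work, "folder2")
shutil.copytree(src_dir, dst_dir, ignore=shutil.ignore_patterns("*.pyc"))
copied_names = sorted(os.listdir(dst_dir))
print(copied_names)                       # ['keep.log']

# Move (rename) the copied tree, then delete everything.
moved_dir = os.path.join(work, "folder3")
shutil.move(dst_dir, moved_dir)
moved_ok = os.path.isdir(moved_dir) and not os.path.exists(dst_dir)
print(moved_ok)                           # True

shutil.rmtree(work)
print(os.path.exists(work))               # False
```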

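The shelve module

With json and pickle, keeping several objects in one file is awkward: a JSON file holds a single document, and objects written with repeated pickle.dump calls have to be loaded back one at a time in the same order. What if we want to store and fetch many objects freely and still persist the data? The shelve module is simpler than pickle: it has a single open function that returns a dictionary-like object that can be read and written. Keys must be strings; values can be any Python object that pickle supports.

#!/usr/bin/env python
# -*- coding:utf-8 -*-
# Author: liyanan
import shelve

# Assumes info.txt was written earlier with the keys 'test', 'info', and 'func'
with shelve.open("info.txt") as f:
    print(f['test'])
    print(f['info'])
    print(f["func"]("li", 24))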
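
The read-only snippet above depends on a shelf that already exists. A complete write-then-read round trip might look like the sketch below; the file name info_db and the stored values are made up for the example.

```python
import os
import shelve
import tempfile

path = os.path.join(tempfile.mkdtemp(), "info_db")  # hypothetical file name

# Write several objects under string keys.
with shelve.open(path) as db:
    db["test"] = [1, 2, 3]
    db["info"] = {"name": "li", "age": 24}

# Reopen later and read the values back individually, in any order.
with shelve.open(path) as db:
    keys = sorted(db.keys())
    info = db["info"]

print(keys)   # ['info', 'test']
print(info)   # {'name': 'li', 'age': 24}
```

Depending on the dbm backend available on the system, shelve may create one or more files whose names start with the given path.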

XML processing module

XML is a format for exchanging data between different languages or programs, much like JSON, although JSON is simpler to use. Before JSON existed, XML was the usual choice.
The XML format:

<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>

Working with XML in Python

import xml.etree.ElementTree as ET

tree = ET.parse("xmltest.xml")
root = tree.getroot()
print(root.tag)

# Walk the whole XML document
for child in root:
    print(child.tag, child.attrib)
    for i in child:
        print(i.tag, i.text)

# Output:
# data
# country {'name': 'Liechtenstein'}
# rank 2
# year 2008
# gdppc 141100
# neighbor None
# neighbor None
# country {'name': 'Singapore'}
# rank 5
# year 2011
# gdppc 59900
# neighbor None
# country {'name': 'Panama'}
# rank 69
# year 2011
# gdppc 13600
# neighbor None
# neighbor None

# Iterate over the year nodes only
for node in root.iter('year'):
    print(node.tag, node.text)

# Output:
# year 2008
# year 2011
# year 2011

Modifying and deleting XML content

import xml.etree.ElementTree as ET

tree = ET.parse("xmltest.xml")
root = tree.getroot()

# Modify: add one to every year and mark the node as updated
for node in root.iter('year'):
    new_year = int(node.text) + 1
    node.text = str(new_year)
    node.set("updated", "yes")
tree.write("xmltest.xml")

# Delete: remove every country whose rank is above 50
for country in root.findall('country'):
    rank = int(country.find('rank').text)
    if rank > 50:
        root.remove(country)
tree.write('output.xml')
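
To try the modify and delete steps without an xmltest.xml file on disk, the same logic can run on a string. The sketch below uses a shortened version of the sample document and ET.fromstring; the variable names are chosen for the example.

```python
import xml.etree.ElementTree as ET

xml_text = """
<data>
    <country name="Liechtenstein"><rank>2</rank><year>2008</year></country>
    <country name="Panama"><rank>69</rank><year>2011</year></country>
</data>
"""

root = ET.fromstring(xml_text.strip())

# Modify: bump every year by one and mark it as updated.
for node in root.iter("year"):
    node.text = str(int(node.text) + 1)
    node.set("updated", "yes")

# Delete: findall() returns a list, so removing while looping over it is safe.
for country in root.findall("country"):
    if int(country.find("rank").text) > 50:
        root.remove(country)

names = [c.get("name") for c in root]
years = [y.text for y in root.iter("year")]
print(names)   # ['Liechtenstein']
print(years)   # ['2009']
print(ET.tostring(root, encoding="unicode"))
```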

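The configparser module

We often need to modify configuration files, for example the configuration file of a MySQL database. The configparser module creates and modifies files in this common configuration format.
Many configuration files, such as php.ini, use this format:

[DEFAULT]
ServerAliveInterval = 45
Compression = yes
CompressionLevel = 9
ForwardX11 = yes

[bitbucket.org]
User = hg

[topsecret.server.com]
Port = 50022
ForwardX11 = no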

Generating it with Python:

import configparser

config = configparser.ConfigParser()
config["DEFAULT"] = {'ServerAliveInterval': '45',
                     'Compression': 'yes',
                     'CompressionLevel': '9'}
config['bitbucket.org'] = {}
config['bitbucket.org']['User'] = 'hg'
config['topsecret.server.com'] = {}
topsecret = config['topsecret.server.com']
topsecret['Port'] = '50022'     # mutates the parser
topsecret['ForwardX11'] = 'no'  # same here
config['DEFAULT']['ForwardX11'] = 'yes'
with open('example.ini', 'w') as configfile:
    config.write(configfile)
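
Reading such a file back is the other half of the workflow. The sketch below parses the same settings from a string with read_string so that it runs without a file on disk; config.read('example.ini') would be the file-based equivalent.

```python
import configparser

ini_text = """
[DEFAULT]
ServerAliveInterval = 45
Compression = yes
ForwardX11 = yes

[bitbucket.org]
User = hg

[topsecret.server.com]
Port = 50022
ForwardX11 = no
"""

config = configparser.ConfigParser()
config.read_string(ini_text)

sections = config.sections()            # DEFAULT is not listed as a section
user = config["bitbucket.org"]["User"]
port = config["topsecret.server.com"].getint("Port")

# Keys missing from a section fall back to the DEFAULT section.
interval = config["bitbucket.org"]["ServerAliveInterval"]
x11_default = config["bitbucket.org"].getboolean("ForwardX11")
x11_override = config["topsecret.server.com"].getboolean("ForwardX11")

print(sections)   # ['bitbucket.org', 'topsecret.server.com']
print(user, port, interval, x11_default, x11_override)  # hg 50022 45 True False
```

All values are stored as strings; getint and getboolean convert them on request.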

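The hashlib module

hash: a kind of algorithm. hashlib replaces the older md5 and sha modules and mainly provides the SHA1, SHA224, SHA256, SHA384, SHA512, and MD5 algorithms.
Three properties:
1. The same content always produces the same hash; a small change in the content changes the hash.
2. The hash cannot be reversed to recover the content.
3. For a given algorithm, the hash has a fixed length no matter how long the input is.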
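
The first and third properties are easy to check directly. The sketch below uses SHA-256; the helper name sha256_hex is invented for the example.

```python
import hashlib

def sha256_hex(text):
    """Return the SHA-256 hex digest of a str."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

same_a = sha256_hex("hello")
same_b = sha256_hex("hello")
changed = sha256_hex("hellp")            # one character differs
long_input = sha256_hex("hello" * 10000)

print(same_a == same_b)                  # True: same input, same digest
print(same_a == changed)                 # False: small change, different digest
print(len(same_a), len(long_input))      # 64 64: fixed length per algorithm
```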

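#!/usr/bin/env python
# -*- coding:utf-8 -*-
# Author: liyanan
import hashlib

m = hashlib.md5()
m.update('hello'.encode('utf-8'))
print(m.hexdigest())
m.update('liyanan'.encode('utf-8'))
print(m.hexdigest())

m2 = hashlib.md5()
m2.update('helloliyanan'.encode('utf-8'))
print(m2.hexdigest())

# Output:
# 5d41402abc4b2a76b9719d911017c592
# 63ea30488d421ef4aa783ba96a3f3322
# 63ea30488d421ef4aa783ba96a3f3322

As the code above shows, the MD5 value after the last update is the same as the MD5 value of the whole content hashed in one go. Why? Because successive update calls behave as if the data had been concatenated. These hash algorithms are strong, but they have a weakness: the digests of common inputs can be looked up in precomputed tables. It is therefore worth mixing a custom key (a salt) into the hash.

salted = hashlib.sha256('898oaFs09f'.encode('utf-8'))  # the custom key comes first
salted.update('liyanan'.encode('utf-8'))
print(salted.hexdigest())
# Output: 0fe2abe645f87b442fa1a58508efd5368e8cc526653bfdb9ff7cc2bcaac06752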

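Python also has an hmac module, which processes the key and the message further internally before hashing:

import hmac

# digestmod is required since Python 3.8; MD5 was the default before that
h = hmac.new('liyanan'.encode('utf-8'), digestmod='md5')  # key
h.update('hello'.encode('utf-8'))  # message
print(h.hexdigest())
# Output: 03d7e06a679fafa86fc10d36b5fbfbd0

Note:
For the final hmac result to be the same, both of the following must hold:
1. The initial key passed to hmac.new is the same.
2. However many times update is called, the content adds up to the same bytes.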
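
Both conditions can be checked with a short experiment. The sketch below uses SHA-256 as the digest (a choice made for this example, not taken from the original) and hmac.compare_digest for the comparison.

```python
import hashlib
import hmac

key = "liyanan".encode("utf-8")

# One update with the whole message...
h1 = hmac.new(key, digestmod=hashlib.sha256)
h1.update(b"hello world")

# ...versus several updates that add up to the same bytes.
h2 = hmac.new(key, digestmod=hashlib.sha256)
h2.update(b"hello ")
h2.update(b"world")

# A different key gives a different result for the same message.
h3 = hmac.new(b"another-key", b"hello world", digestmod=hashlib.sha256)

same_content = hmac.compare_digest(h1.hexdigest(), h2.hexdigest())
other_key = hmac.compare_digest(h1.hexdigest(), h3.hexdigest())
print(same_content)  # True
print(other_key)     # False
```

hmac.compare_digest is used instead of == because it is designed to avoid leaking timing information when comparing digests.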

The logging module

Many programs need to record logs, and a log may contain not only normal access records but also errors and warnings. Python's logging module provides a standard logging interface through which you can store logs in various formats.
Basic usage:
There are five log levels: debug, info, warning, error, and critical. debug is the lowest and critical the highest. The lower the configured level, the more messages are output.

#!/usr/bin/env python
# -*- coding:utf-8 -*-
# Author: liyanan
import logging

logging.debug("logging debug")
logging.info("logging info")
logging.warning("logging warning")
logging.error("logging error")
logging.critical("logging critical")

# Output:
# WARNING:root:logging warning
# ERROR:root:logging error
# CRITICAL:root:logging critical

Note: as the output shows, the default log level is warning.

Writing logs to a file

#!/usr/bin/env python
# -*- coding:utf-8 -*-
# Author: liyanan
import logging

logging.basicConfig(filename="info.txt", level=logging.INFO)  # file name and log level
logging.debug("logging debug")
logging.info("logging info")
logging.warning("logging warning")

# Written to the file:
# INFO:root:logging info
# WARNING:root:logging warning

Here level=logging.INFO sets the logging level to INFO, which means only messages at INFO level or higher are written to the file, so the debug message is not recorded. To record it as well, set the level to DEBUG, that is, level=logging.DEBUG.

Adding a date format

import logging

logging.basicConfig(filename="info.txt",
                    level=logging.INFO,
                    format='%(asctime)s %(module)s:%(levelname)s %(message)s',
                    datefmt='%m/%d/%Y %I:%M:%S %p')  # %I is the 12-hour clock that goes with %p
logging.debug("logging debug")
logging.info("logging info")
logging.warning("logging warning")

# Written to the file (the first two lines are left over from the earlier run;
# 日志写入文件 is the name of the module that made the logging calls):
# INFO:root:logging info
# WARNING:root:logging warning
# 10/08/2017 11:01:04 AM 日志写入文件:INFO logging info
# 10/08/2017 11:01:04 AM 日志写入文件:WARNING logging warning
# 10/08/2017 11:02:04 AM 日志写入文件:INFO logging info
# 10/08/2017 11:02:04 AM 日志写入文件:WARNING logging warning

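Attributes available in the format string

%(name)s  Name of the logger
%(levelno)s  Log level as a number
%(levelname)s  Log level as text
%(pathname)s  Full path of the module that made the logging call (may be unavailable)
%(filename)s  File name of the module that made the logging call
%(module)s  Name of the module that made the logging call
%(funcName)s  Name of the function that made the logging call
%(lineno)d  Source line number of the logging call
%(created)f  Time the record was created, as a UNIX timestamp (float)
%(relativeCreated)d  Milliseconds between the loading of the logging module and the creation of the record
%(asctime)s  Time the record was created, as a string. The default format is "2003-07-08 16:49:45,896"; the digits after the comma are milliseconds
%(thread)d  Thread ID (may be unavailable)
%(threadName)s  Thread name (may be unavailable)
%(process)d  Process ID (may be unavailable)
%(message)s  The message supplied by the user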

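More complex log output
With the approaches so far, log output goes either to the screen or to a log file. Can it go to both the screen and a log file at the same time? Of course it can. Let's look at how to set up more complex log output.

Recording logs with Python's logging module involves four main classes:
logger: provides the interface that application code uses directly.
handler: sends the log records (created by a logger) to the appropriate destination.
filter: provides a finer-grained way to decide which log records are output.
formatter: determines the final output format of a log record.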
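
One possible way to combine these classes so that a single logger writes to two destinations is sketched below. The logger name demo_app and the file name app.log are invented for the example, and an io.StringIO object stands in for the screen so that the result can be inspected; in a real program sys.stderr would take its place.

```python
import io
import logging
import os
import tempfile

log_path = os.path.join(tempfile.mkdtemp(), "app.log")  # hypothetical path

logger = logging.getLogger("demo_app")   # hypothetical logger name
logger.setLevel(logging.DEBUG)           # let each handler filter by level
logger.propagate = False                 # avoid duplicate output via the root logger

formatter = logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")

# Handler 1: screen-like stream, WARNING and above.
stream = io.StringIO()
console = logging.StreamHandler(stream)
console.setLevel(logging.WARNING)
console.setFormatter(formatter)

# Handler 2: log file, DEBUG and above.
file_handler = logging.FileHandler(log_path, encoding="utf-8")
file_handler.setLevel(logging.DEBUG)
file_handler.setFormatter(formatter)

logger.addHandler(console)
logger.addHandler(file_handler)

logger.debug("debug message")      # reaches the file only
logger.warning("warning message")  # reaches both destinations

# Detach and close the file handler before reading the file back.
logger.removeHandler(file_handler)
file_handler.close()
with open(log_path, encoding="utf-8") as f:
    file_lines = f.read().splitlines()
console_lines = stream.getvalue().splitlines()

print(len(file_lines))     # 2
print(len(console_lines))  # 1
```

The logger's own level is set to DEBUG so that the decision about what to keep is left to each handler.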

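The re module

In essence, regular expressions (or REs) are a small, highly specialized programming language embedded in Python and made available through the re module. You specify rules for the set of strings you want to match; that set might contain English sentences, e-mail addresses, TeX commands, or anything else you like. You can then ask questions such as "Does this string match the pattern?" or "Is there a match for the pattern anywhere in this string?". You can also use REs to modify a string or split it apart in various ways.

Commonly used regular expression symbols

'.'  By default matches any single character except \n; with the DOTALL flag it matches any character, including a newline
'^'  Matches the start of the string; with the MULTILINE flag it also matches at the start of each line, e.g. re.search(r"^a", "\nabc\neee", flags=re.MULTILINE)
'$'  Matches the end of the string; with MULTILINE, re.search("foo$", "bfoo\nsdfsf", flags=re.MULTILINE).group() also matches
'*'  Matches the character before it 0 or more times; re.findall("ab*", "cabb3abcbbac") gives ['abb', 'ab', 'a']
'+'  Matches the character before it 1 or more times; re.findall("ab+", "ab+cd+abb+bba") gives ['ab', 'abb']
'?'  Matches the character before it 1 or 0 times
'{m}'  Matches the character before it exactly m times
'{n,m}'  Matches the character before it n to m times; re.findall("ab{1,3}", "abb abc abbcbbb") gives ['abb', 'ab', 'abb']
'|'  Matches the pattern on the left or on the right of |; re.search("abc|ABC", "ABCBabcCD").group() gives 'ABC'
'(...)'  Group; re.search("(abc){2}a(123|456)c", "abcabca456c").group() gives 'abcabca456c'
'\'  Escape character
[a-z]  Matches a lowercase letter a-z
[A-Z]  Matches an uppercase letter A-Z
[0-9]  Matches a digit 0-9
'\A'  Matches only at the start of the string; re.search(r"\Aabc", "alexabc") finds no match
'\Z'  Matches at the end of the string, similar to $
'\d'  Matches a digit 0-9
'\D'  Matches a non-digit
'\w'  Matches [A-Za-z0-9_]
'\W'  Matches anything not in [A-Za-z0-9_]
'\s'  Matches whitespace such as a space, \t, \n, \r; re.search(r"\s+", "ab\tc1\n3").group() gives '\t'
'(?P<name>...)'  Named group; re.search("(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})", "371481199306143242").groupdict() gives {'province': '3714', 'city': '81', 'birthday': '1993'}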

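Commonly used functions

re.match  Match from the start of the string
re.search  Find the first match anywhere in the string
re.findall  Return all matches as the elements of a list
re.split  Split the string, using the matched text as the separator
re.sub  Find matches and replace them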
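
The differences between these functions show up most clearly when they are all applied to the same string. The sample text in this sketch is made up for the illustration.

```python
import re

text = "tel: 010-1234, fax: 021-5678"

m = re.match(r"\d+", text)                 # None: the string does not start with a digit
first = re.search(r"\d+", text).group()    # first match anywhere in the string
numbers = re.findall(r"\d+", text)         # every match, as a list
parts = re.split(r"[,:]\s*", text)         # split on , or : plus following spaces
masked = re.sub(r"\d", "#", text)          # replace every digit

print(m)        # None
print(first)    # 010
print(numbers)  # ['010', '1234', '021', '5678']
print(parts)    # ['tel', '010-1234', 'fax', '021-5678']
print(masked)   # tel: ###-####, fax: ###-####
```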

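Backslashes
As in most programming languages, regular expressions use "\" as the escape character, which can lead to backslash trouble. To match a literal "\" in text, a regular expression written as an ordinary string literal needs four backslashes, "\\\\": the language turns each pair into one backslash, and the regex engine then reads the resulting two backslashes as one escaped backslash. Python's raw strings solve this problem neatly: the regular expression in this example can be written r"\\". Likewise, "\\d", which matches a digit, can be written r"\d". With raw strings you no longer need to worry about a missing backslash, and the expressions are easier to read.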
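
The equivalence of the two spellings can be confirmed in a few lines. The sample path is an assumption made for the example.

```python
import re

path = "C:\\temp"      # the actual text is C:\temp, with a single backslash

normal = "\\\\"        # ordinary literal: four characters typed, two in the pattern
raw = r"\\"            # raw literal: two characters typed, two in the pattern

same = normal == raw
found = re.search(raw, path).group()
digits = re.findall(r"\d", "a1b22")

print(same)        # True: both spell the regex \\
print(len(raw))    # 2
print(found)       # \
print(digits)      # ['1', '2', '2']
```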
