Python for Beginners - File Handling - Functions of the re Module

Source: Internet  Editor: 程序博客网  Time: 2024/05/16 05:45

Some functions of the re module

1: compile(pattern[, flags])
Builds a regular expression object from the regular expression string pattern and the optional flags.

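flags can take the following values:
I  ignore case
L  make some special character classes depend on the current locale
M  multi-line mode: ^ and $ match at the start and end of each line, in addition to the start and end of the string
S  "." matches any character, including '\n'; without this flag, "." does not match '\n'
U  make \w, \W, \b, \B, \d, \D, \s and \S dependent on the Unicode character properties database
X  verbose mode: to make a regular expression more readable, some whitespace in the pattern and any comment after # are ignored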
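
To make the flag descriptions above more concrete, here is a minimal sketch of I, M and X in action. The sample text and the variable names are made up for illustration and are not part of the original notes.

```python
import re

text = "first line\nSecond line\nthird line"

# re.I: ignore case, so the pattern "second" also matches "Second"
case_hits = re.findall(r"second", text, re.I)

# re.M: ^ matches at the start of every line, not only at the start of the string
line_starts = re.findall(r"^\w+", text, re.M)

# re.X: whitespace and #-comments inside the pattern are ignored
number = re.compile(r"""
    \d+      # integer part
    \.       # decimal point
    \d+      # fractional part
""", re.X)
decimals = number.findall("pi is 3.14, e is 2.718")

print(case_hits)    # ['Second']
print(line_starts)  # ['first', 'Second', 'third']
print(decimals)     # ['3.14', '2.718']
```

Several flags can be combined with the | operator, for example re.I | re.M.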

Of these, S is used fairly often.
It is applied like this:
import re
re.compile(……, re.S)
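
As an illustration of why S matters, the following sketch compiles the same pattern with and without re.S and applies it to text that spans two lines. The HTML-like sample string and the names used here are assumptions chosen for the example.

```python
import re

text = "<p>line one\nline two</p>"

# Without re.S, "." stops at '\n', so the pattern cannot span both lines
without_s = re.compile(r"<p>(.*)</p>")
# With re.S, "." also matches '\n'
with_s = re.compile(r"<p>(.*)</p>", re.S)

no_match = without_s.search(text)
body = with_s.search(text).group(1)

print(no_match)    # None
print(repr(body))  # 'line one\nline two'
```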

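2: match(pattern, string[, flags])
Tries to match pattern at the beginning of string; flags has the same meaning as in compile.
Returns a MatchObject, or None if the beginning of string does not match.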
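
A short sketch of how the returned object might be used. The address-like sample string and the variable names are invented for illustration.

```python
import re

m = re.match(r"(\w+)@(\w+)\.com", "alice@example.com is the contact")
whole = m.group(0)    # the entire matched text
user = m.group(1)     # first parenthesized group
domain = m.group(2)   # second parenthesized group

# match only looks at the beginning of the string, so this returns None
miss = re.match(r"\d+", "abc 123")

# flags work the same way as in compile
relaxed = re.match(r"hello", "Hello world", re.I)

print(whole, user, domain)  # alice@example.com alice example
print(miss)                 # None
print(relaxed.group(0))     # Hello
```

Because the result may be None, it is usually worth checking it before calling group().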

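3: split(pattern, string[, maxsplit=0])
Splits string at each occurrence of pattern.
>>> re.split(r'\W+', 'Words, words, words.')
['Words', 'words', 'words', '']
Parentheses "()" have a special effect inside pattern: the text matched by a capturing group is also returned in the result list. See the manual for details.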
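
The sketch below shows the effect of the parentheses and of maxsplit on the same input string as above. The variable names are only illustrative.

```python
import re

text = "Words, words, words."

# Plain pattern: the separators are dropped
plain = re.split(r"\W+", text)

# Capturing group: the separators are kept in the result list
grouped = re.split(r"(\W+)", text)

# maxsplit=1: split only once, the rest of the string is returned as is
limited = re.split(r"\W+", text, maxsplit=1)

print(plain)    # ['Words', 'words', 'words', '']
print(grouped)  # ['Words', ', ', 'words', ', ', 'words', '.', '']
print(limited)  # ['Words', 'words, words.']
```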

4: findall(pattern, string[, flags])
Used fairly often.
Finds all non-overlapping matches of pattern in string and returns them as a list.
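
A small sketch of findall, assuming a made-up key=value string. It also shows that when the pattern contains several groups, each list item is a tuple of the group contents.

```python
import re

text = "apple=3, banana=12, cherry=7"

# No groups: each item is the full matched text
numbers = re.findall(r"\d+", text)

# Two groups: each item is a tuple (group 1, group 2)
pairs = re.findall(r"(\w+)=(\d+)", text)

# Matches do not overlap: "aaaa" contains "aa" only twice
non_overlapping = re.findall(r"aa", "aaaa")

print(numbers)          # ['3', '12', '7']
print(pairs)            # [('apple', '3'), ('banana', '12'), ('cherry', '7')]
print(non_overlapping)  # ['aa', 'aa']
```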

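5: sub(pattern, repl, string[, count])
repl can be either a string or a function.
When repl is a string, every substring of string that matches pattern is replaced with repl.

When repl is a function, it is called for each non-overlapping match of pattern in string. It receives the match object as its only argument, and its return value replaces the matched substring.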

>>> re.sub(r'def\s+([a-zA-Z_][a-zA-Z_0-9]*)\s*\(\s*\):',
...        r'static PyObject*\npy_\1(void)\n{',
...        'def myfunc():')
'static PyObject*\npy_myfunc(void)\n{'

>>> def dashrepl(matchobj):
...     if matchobj.group(0) == '-': return ' '
...     else: return '-'
>>> re.sub('-{1,2}', dashrepl, 'pro----gram-files')
'pro--gram files'
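
The optional count argument is not demonstrated above, so here is a minimal sketch of it together with one more function-style repl. The sample string and the helper name double are hypothetical.

```python
import re

text = "a1 b22 c333"

# count=1: replace only the first match
first_only = re.sub(r"\d+", "#", text, count=1)

def double(matchobj):
    # repl receives the match object; group(0) is the matched text
    return str(int(matchobj.group(0)) * 2)

# Function repl: every number is replaced by twice its value
doubled = re.sub(r"\d+", double, text)

print(first_only)  # a# b22 c333
print(doubled)     # a2 b44 c666
```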