Learning QRegExp in Qt (Regular Expressions)

Source: Internet. Published by: 零基础学算法pdf. Editor: 程序博客网. Date: 2024/06/05 06:24

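Some examples:
[ABCD] or [A-D] matches exactly one of the characters A, B, C, or D.
x{1,1} matches x exactly once; x{1,5} matches x at least once and at most five times.
[0-9]{1,1} matches a single digit from 0 to 9. [0-9]{1,2} matches one or two digits, i.e. an integer from 0 to 99, but it also matches digits in the middle of a longer string. To require that the whole string is such a number, write ^[0-9]{1,2}$. ^ anchors the match to the start of the string and $ to the end, so the pattern no longer matches digits embedded in other text.
? means the preceding expression occurs 0 or 1 times.
[0-9] = \d
x{1,1} = x
An integer from 0 to 99: ^\d{1,2}$ = ^\d\d{0,1}$ = ^\d\d?$
Matching "mail" the long way: m{1,1}a{1,1}i{1,1}l{1,1}, which is equivalent to simply mail.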
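The difference between the anchored and unanchored forms can be sketched as follows. This is a minimal illustration that uses Python's re module as a stand-in for QRegExp (an assumption made so the snippet runs without Qt); the helper name is_0_to_99 is hypothetical.

```python
import re

# Unanchored: finds a run of one or two digits anywhere inside the text.
unanchored = re.compile(r"[0-9]{1,2}")
# Anchored: the whole string must consist of one or two digits.
anchored = re.compile(r"^\d\d?$")

def is_0_to_99(text):
    """Return True when text is exactly one or two digits (hypothetical helper)."""
    return anchored.search(text) is not None

# The unanchored pattern happily matches digits embedded in other text.
found_inside = unanchored.search("abc123def").group(0)  # "12"
```

Note that the anchored pattern accepts "07" as well, because it checks the number of digits rather than the numeric value.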
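| means "or" (alternation).
mail|letter|correspondence matches 'mail', 'letter', or 'correspondence', but it also matches the 'mail' inside 'email'. The fix takes two steps. First, wrap the alternatives in parentheses: (mail|letter|correspondence). The parentheses group the alternatives into one unit and capture the matched text for later use.
Second, add \b, which asserts a word boundary, i.e. the start or end of a word. Spaces, newlines, and other non-word characters separate words. The result is \b(mail|letter|correspondence)\b.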
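A short sketch of why the word boundaries matter, written with Python's re module rather than QRegExp (an assumption; both engines support | and \b in this form). The sample sentences are made up for illustration.

```python
import re

# Without boundaries the alternation matches inside longer words.
loose = re.compile(r"mail|letter|correspondence")
# With \b on both sides only whole words match.
strict = re.compile(r"\b(mail|letter|correspondence)\b")

loose_hit = loose.search("send an email today")    # matches the "mail" in "email"
strict_hit = strict.search("send an email today")  # None: no whole-word match
whole_word = strict.search("a letter arrived").group(1)  # "letter"
```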
(?!__) is a negative lookahead. &(?!amp;) matches an & that is not followed by amp;.
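One plausible use of this pattern is escaping bare ampersands without double-escaping existing ones. The sketch below uses Python's re module as a stand-in for QRegExp, and escape_ampersands is a hypothetical helper, not part of any library.

```python
import re

# An ampersand that is NOT immediately followed by "amp;".
bare_amp = re.compile(r"&(?!amp;)")

def escape_ampersands(text):
    """Replace each bare & with &amp; and leave existing &amp; alone."""
    return bare_amp.sub("&amp;", text)

escaped = escape_ampersands("Tom & Jerry &amp; friends")
# escaped == "Tom &amp; Jerry &amp; friends"
```

The lookahead only inspects the text after the &, so the replacement touches the ampersand itself and nothing else.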
To count the occurrences of Eric and Eirik in a string, use \b(Eric|Eirik)\b or \bEi?ri[ck]\b (the latter matches 'Eric', 'Erik', 'Eiric', and 'Eirik', so it also counts two extra spellings).
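The two counting patterns can be compared on a sample sentence. This sketch uses Python's re module in place of QRegExp (an assumption), and the sample text is invented for the example.

```python
import re

text = "Eric met Eirik; Erik and Eiric watched, but Derick did not."

# Exactly the two spellings of interest.
strict = re.compile(r"\b(Eric|Eirik)\b")
# Shorter pattern that also admits 'Erik' and 'Eiric'.
loose = re.compile(r"\bEi?ri[ck]\b")

strict_count = len(strict.findall(text))  # 2
loose_count = len(loose.findall(text))    # 4
```

"Derick" is counted by neither pattern, because matching is case-sensitive and the capital E must sit at a word boundary.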
Character matching table:
c A character represents itself unless it has a special regexp meaning, e.g. c matches the character c.
\c A character that follows a backslash matches the character itself, except as specified below. e.g., To match a literal caret at the beginning of a string, write \^.
\a Matches the ASCII bell (BEL, 0x07).
\f Matches the ASCII form feed (FF, 0x0C).
\n Matches the ASCII line feed (LF, 0x0A, Unix newline).
\r Matches the ASCII carriage return (CR, 0x0D).
\t Matches the ASCII horizontal tab (HT, 0x09).
\v Matches the ASCII vertical tab (VT, 0x0B).
\xhhhh Matches the Unicode character corresponding to the hexadecimal number hhhh (between 0x0000 and 0xFFFF).
\0ooo (i.e., \zero ooo) matches the ASCII/Latin1 character for the octal number ooo (between 0 and 0377).
. (dot) Matches any character (including newline).
\d Matches a digit (QChar::isDigit()).
\D Matches a non-digit.
\s Matches a whitespace character (QChar::isSpace()).
\S Matches a non-whitespace character.
\w Matches a word character (QChar::isLetterOrNumber(), QChar::isMark(), or '_').
\W Matches a non-word character.
\n (with n a digit) The n-th backreference, e.g. \1, \2, etc.
The C++ compiler transforms backslashes in string literals, so to include a \ in a regexp, enter it twice, i.e. \\. To match the backslash character itself, enter it four times, i.e. \\\\.
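The backslash doubling and a few of the shorthand classes can be tried out in a small sketch. Python is used here instead of C++ with QRegExp (an assumption); its ordinary string literals need the same doubling, while its raw strings (the r"..." form) avoid it.

```python
import re

# In an ordinary string literal each backslash is doubled, as in C++.
doubled = "\\d+"
# A raw string literal lets the pattern be written the way it reads.
raw = r"\d+"
same_pattern = doubled == raw  # True: both hold the three characters \d+

numbers = re.findall(raw, "a1 b22")          # ["1", "22"]
words = re.findall(r"\w+", "foo_bar baz-9")  # ["foo_bar", "baz", "9"]
fields = re.split(r"\s+", "a \t b\nc")       # ["a", "b", "c"]

# Four backslashes in source code become a two-character pattern
# that matches one literal backslash in the subject text.
backslash_pattern = "\\\\"
has_backslash = re.search(backslash_pattern, "C:\\temp") is not None
```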
[abc] matches 'a', 'b', or 'c'.
[^abc] matches any character other than 'a', 'b', or 'c'.
- denotes a range inside a set: [W-Z] matches 'W', 'X', 'Y', or 'Z'.
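A quick check of sets, negated sets, and ranges, sketched with Python's re module as a stand-in for QRegExp (an assumption; the set syntax shown here is the same in both). The helper matches_char is hypothetical.

```python
import re

def matches_char(pattern, ch):
    """Return True when the single character ch matches the set pattern."""
    return re.fullmatch(pattern, ch) is not None

in_set = matches_char(r"[abc]", "b")       # True
negated_hit = matches_char(r"[^abc]", "d")  # True: 'd' is outside the set
negated_miss = matches_char(r"[^abc]", "b")  # False
in_range = re.findall(r"[W-Z]", "UVWXYZ")   # ["W", "X", "Y", "Z"]
```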
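Quantifiers (E stands for a regular expression):
E? matches E 0 or 1 times
E+ or E{1,} matches E 1 or more times
E* matches E 0 or more times
E{n} matches E exactly n times
E{n,} matches E at least n times
E{,m} matches E at most m times
E{n,m} matches E at least n and at most m times
tag+ matches tag, tagg, taggg, tagggg, and so on.
(tag)+ matches tag, tagtag, tagtagtag, tagtagtagtag, and so on.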
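The quantifier forms, and the difference between tag+ and (tag)+, can be checked with a few whole-string matches. The sketch uses Python's re module instead of QRegExp (an assumption); the helper name matches is hypothetical.

```python
import re

def matches(pattern, text):
    """Return True when the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

# tag+ repeats only the final 'g'; (tag)+ repeats the whole group.
g_repeated = matches(r"tag+", "tagggg")        # True
g_not_group = matches(r"tag+", "tagtag")       # False
group_repeated = matches(r"(tag)+", "tagtag")  # True
group_not_g = matches(r"(tag)+", "tagg")       # False

# Bounded repetition: at least 2 and at most 4 occurrences of x.
bounded = [matches(r"x{2,4}", "x" * n) for n in range(1, 6)]
# bounded == [False, True, True, True, False]
```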
Quantifiers are greedy by default; to make them non-greedy, call setMinimal(true).
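The effect of greedy versus minimal matching can be seen on a small example. This is a sketch in Python, not QRegExp: Python has no setMinimal() and instead marks an individual quantifier as non-greedy with a trailing ?, which is used here as an analogue. The sample text is invented.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy: .* runs as far as it can, so the match ends at the last '>'.
greedy = re.search(r"<.*>", text).group(0)
# Minimal: .*? stops at the first '>' that completes the match.
minimal = re.search(r"<.*?>", text).group(0)

# greedy == "<b>bold</b> and <i>italic</i>"
# minimal == "<b>"
```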
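\b(\w+)\W+\1\b finds a repeated word: \1 is a backreference that must match the same text that (\w+) captured, so this pattern relies on a capturing group.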
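A minimal sketch of the repeated-word pattern in action, using Python's re module as a stand-in for QRegExp (an assumption; the pattern text is unchanged). The sample sentences are made up.

```python
import re

# \1 must repeat exactly what the first group (\w+) captured.
repeated = re.compile(r"\b(\w+)\W+\1\b")

hit = repeated.search("Paris in the the spring")
whole_match = hit.group(0)    # "the the"
captured_word = hit.group(1)  # "the"

# The trailing \b prevents "the theory" from counting as a repeat.
no_hit = repeated.search("the theory")  # None
```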
(?:green|blue) is a non-capturing group: it begins with '(?:' and ends with ')'. Use it when you only care whether the group matches, not about the matched text or its position.
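The practical difference shows up in which groups are reported after a match. The sketch below uses Python's re module rather than QRegExp (an assumption), and the (car|bike) part of the pattern is added purely for illustration.

```python
import re

# Both groups capture: colour is group 1, vehicle is group 2.
capturing = re.compile(r"(green|blue) (car|bike)")
# The colour group only groups the alternatives; vehicle becomes group 1.
non_capturing = re.compile(r"(?:green|blue) (car|bike)")

with_colour = capturing.search("a blue bike").groups()         # ("blue", "bike")
without_colour = non_capturing.search("a blue bike").groups()  # ("bike",)
```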
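* matches as much as possible.
Matched against aaa, a*(a*) matches the whole string, and cap(1) returns "aaa", because in the default syntax quantifiers applied to capturing parentheses are greedier than other quantifiers. Passing QRegExp::RegExp2 to the constructor or calling setPatternSyntax(QRegExp::RegExp2) switches to the more intuitive capturing behavior.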
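For comparison, here is how a Perl-style engine treats the same pattern. The sketch uses Python's re module, which is assumed to behave like the RegExp2 mode in this respect; it does not reproduce the default QRegExp behavior.

```python
import re

m = re.fullmatch(r"a*(a*)", "aaa")

# The first a* is greedy and consumes all three characters,
# so nothing is left for the capturing group.
whole_match = m.group(0)  # "aaa"
captured = m.group(1)     # ""
```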
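Assertions state a condition at a position in the text without consuming any characters. Below, E stands for a regular expression.
^ Start of the string. To match a literal ^, write \\^.
$ End of the string. To match a literal $, write \\$.
\b A word boundary.
\B A non-word boundary.
(?=E) Positive lookahead: const(?=\s+char) matches only the 'const' in 'static const char *', whereas const\s+char matches 'const char'.
(?!E) Negative lookahead: const(?!\s+char) matches a 'const' that is not followed by whitespace and 'char'.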
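The two lookaheads can be contrasted on a line containing both kinds of const. This is a sketch using Python's re module as a stand-in for QRegExp (an assumption; the lookahead syntax is the same), and the declaration string is invented.

```python
import re

decl = "static const char *name; const int size;"

# Positive lookahead: 'const' only where whitespace and 'char' follow.
before_char = [m.start() for m in re.finditer(r"const(?=\s+char)", decl)]
# Negative lookahead: 'const' only where they do not follow.
not_before_char = [m.start() for m in re.finditer(r"const(?!\s+char)", decl)]

# Lookahead consumes nothing, so the matched text is just 'const'.
asserted = re.search(r"const(?=\s+char)", decl).group(0)  # "const"
consumed = re.search(r"const\s+char", decl).group(0)      # "const char"
```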
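Wildcards
setPatternSyntax() switches between regular-expression and wildcard matching. Wildcard mode is simpler and has only four features:
Any ordinary character represents itself.
? matches any single character.
* matches zero or more of any characters.
[...] matches a set of characters in square brackets, as in a full regexp.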
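Shell-style wildcard matching of this kind can be tried with Python's fnmatch module. This is only an analogue of QRegExp's wildcard mode (an assumption; edge cases may differ between the two), and the file names are made up.

```python
import fnmatch

# fnmatchcase compares case-sensitively against the whole name.
star = fnmatch.fnmatchcase("report.txt", "*.txt")           # True
one_char = fnmatch.fnmatchcase("a.txt", "?.txt")            # True
two_chars = fnmatch.fnmatchcase("ab.txt", "?.txt")          # False
in_set = fnmatch.fnmatchcase("file3.log", "file[0-9].log")  # True
out_of_set = fnmatch.fnmatchcase("fileA.log", "file[0-9].log")  # False
```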

1 0