Regular Expressions


Find Chinese characters: [\u4e00-\u9fa5]
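
As a minimal sketch of how this character class can be used outside the editor, the Java snippet below collects every run of Chinese characters in a string. The class and method names (ChineseFinder, findChinese) are made up for illustration. Note that the range U+4E00 to U+9FA5 covers the common CJK ideographs only; it does not include Chinese punctuation or the CJK extension blocks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ChineseFinder {
    // Same character class as in the text, with + added to grab whole runs.
    private static final Pattern CHINESE = Pattern.compile("[\\u4e00-\\u9fa5]+");

    /** Returns every run of consecutive Chinese characters found in text. */
    static List<String> findChinese(String text) {
        List<String> runs = new ArrayList<>();
        Matcher m = CHINESE.matcher(text);
        while (m.find()) {
            runs.add(m.group());
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(findChinese("abc中文def汉字")); // prints [中文, 汉字]
    }
}
```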



I recently used regular expression search and replace while developing in Eclipse, so here is a summary.


Search: ^(.*)<h:outputText.*value=.*"#\{(.*)\}"(.*)$
Replace: $1<h:outputText value= "#{strings.trim($2,30)}"$3


This finds lines in the code such as:

<h:outputText value="#{appl.departmentName}" />

and replaces them with:

<h:outputText value= "#{strings.trim(appl.departmentName,30)}" />
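
The same search and replace can be tried outside the editor. The sketch below feeds the two strings from the dialog to java.util.regex; the class name OutputTextTrim and the method rewrite are hypothetical, and Pattern.MULTILINE is set on the assumption that the editor anchors ^ and $ at line boundaries.

```java
import java.util.regex.Pattern;

class OutputTextTrim {
    // Search expression copied from the text; backslashes and quotes are
    // doubled/escaped only because this is a Java string literal.
    private static final Pattern SEARCH = Pattern.compile(
            "^(.*)<h:outputText.*value=.*\"#\\{(.*)\\}\"(.*)$", Pattern.MULTILINE);

    // Replacement copied from the text: $1, $2, $3 refer to the captured groups.
    private static final String REPLACEMENT =
            "$1<h:outputText value= \"#{strings.trim($2,30)}\"$3";

    /** Applies the replacement to every matching line of source. */
    static String rewrite(String source) {
        return SEARCH.matcher(source).replaceAll(REPLACEMENT);
    }

    public static void main(String[] args) {
        System.out.println(rewrite("<h:outputText value=\"#{appl.departmentName}\" />"));
        // prints <h:outputText value= "#{strings.trim(appl.departmentName,30)}" />
    }
}
```

One thing this makes visible: the text between <h:outputText and value= is matched by an uncaptured .*, so any attributes sitting there are dropped from the result.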

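The following pair improves on the search expression above:

Search: ^(.*)<h:outputText(.*\R??.*)value=\p{Space}*"#\{(.*)\}"(.*)$
Replace: $1<h:outputText$2value="#{strings.trim($3,30)}"$4

This version also handles an optional line break between the tag name and the value attribute. Because group $2 captures everything between them, that text is carried over into the replacement unchanged.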
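
A sketch of the improved pair in Java, under the same assumptions as before (hypothetical class and method names, Pattern.MULTILINE standing in for the editor's line anchoring). The \R construct needs Java 8 or later.

```java
import java.util.regex.Pattern;

class OutputTextTrimMultiline {
    // Improved search expression from the text. Group 2 spans an optional
    // line break (\R??) between the tag name and the value attribute.
    private static final Pattern SEARCH = Pattern.compile(
            "^(.*)<h:outputText(.*\\R??.*)value=\\p{Space}*\"#\\{(.*)\\}\"(.*)$",
            Pattern.MULTILINE);

    private static final String REPLACEMENT =
            "$1<h:outputText$2value=\"#{strings.trim($3,30)}\"$4";

    /** Rewrites matching tags, including ones split across two lines. */
    static String rewrite(String source) {
        return SEARCH.matcher(source).replaceAll(REPLACEMENT);
    }

    public static void main(String[] args) {
        String tag = "<h:outputText id=\"dept\"\n    value=\"#{appl.departmentName}\" />";
        // The id attribute and the line break are kept via $2.
        System.out.println(rewrite(tag));
    }
}
```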


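There are 3 things to watch for when writing regular expressions like these:

1. Some special characters must be escaped. For example, {  and  (  are given a special meaning by regular expression syntax, so write \{  and  \(  to match the literal characters.

2. Use ( ) pairs to capture the variable parts of the match that need to be kept, then reuse them in the replacement field as $1 , $2 , $3 ... $n.


3. Note that a line break in Eclipse is matched with  \R , not with the  \r  from the reference below (\r matches only a carriage return).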
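
The three points above can be checked with a few lines of Java. This is only an illustrative sketch: the class RegexNotes and its method names are invented here, and \R requires Java 8 or later.

```java
import java.util.regex.Pattern;

class RegexNotes {
    // Point 1: \{ matches a literal brace (written \\{ in a Java string).
    static boolean containsElOpener(String s) {
        return Pattern.compile("#\\{").matcher(s).find();
    }

    // Point 2: ( ) captures parts of the match; $1 and $2 reuse them.
    static String swapPair(String s) {
        return s.replaceAll("(\\w+)=(\\w+)", "$2=$1");
    }

    // Point 3: \R matches any line break sequence, including \n and \r\n.
    static boolean isLineBreak(String s) {
        return s.matches("\\R");
    }

    // \r matches a single carriage-return character and nothing else.
    static boolean isCarriageReturn(String s) {
        return s.matches("\\r");
    }

    public static void main(String[] args) {
        System.out.println(containsElOpener("value=\"#{a.b}\"")); // prints true
        System.out.println(swapPair("key=value"));                // prints value=key
        System.out.println(isLineBreak("\r\n"));                  // prints true
        System.out.println(isCarriageReturn("\n"));               // prints false
    }
}
```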



The regular expression syntax is listed in detail below:


Construct             Matches

Characters
x                     The character x
\\                    The backslash character
\0n                   The character with octal value 0n (0 <= n <= 7)
\0nn                  The character with octal value 0nn (0 <= n <= 7)
\0mnn                 The character with octal value 0mnn (0 <= m <= 3, 0 <= n <= 7)
\xhh                  The character with hexadecimal value 0xhh
\uhhhh                The character with hexadecimal value 0xhhhh
\t                    The tab character ('\u0009')
\n                    The newline (line feed) character ('\u000A')
\r                    The carriage-return character ('\u000D')
\f                    The form-feed character ('\u000C')
\a                    The alert (bell) character ('\u0007')
\e                    The escape character ('\u001B')
\cx                   The control character corresponding to x

Character classes
[abc]                 a, b, or c (simple class)
[^abc]                Any character except a, b, or c (negation)
[a-zA-Z]              a through z or A through Z, inclusive (range)
[a-d[m-p]]            a through d, or m through p: [a-dm-p] (union)
[a-z&&[def]]          d, e, or f (intersection)
[a-z&&[^bc]]          a through z, except for b and c: [ad-z] (subtraction)
[a-z&&[^m-p]]         a through z, and not m through p: [a-lq-z] (subtraction)

Predefined character classes
.                     Any character (may or may not match line terminators)
\d                    A digit: [0-9]
\D                    A non-digit: [^0-9]
\s                    A whitespace character: [ \t\n\x0B\f\r]
\S                    A non-whitespace character: [^\s]
\w                    A word character: [a-zA-Z_0-9]
\W                    A non-word character: [^\w]

POSIX character classes (US-ASCII only)
\p{Lower}             A lower-case alphabetic character: [a-z]
\p{Upper}             An upper-case alphabetic character: [A-Z]
\p{ASCII}             All ASCII: [\x00-\x7F]
\p{Alpha}             An alphabetic character: [\p{Lower}\p{Upper}]
\p{Digit}             A decimal digit: [0-9]
\p{Alnum}             An alphanumeric character: [\p{Alpha}\p{Digit}]
\p{Punct}             Punctuation: One of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
\p{Graph}             A visible character: [\p{Alnum}\p{Punct}]
\p{Print}             A printable character: [\p{Graph}]
\p{Blank}             A space or a tab: [ \t]
\p{Cntrl}             A control character: [\x00-\x1F\x7F]
\p{XDigit}            A hexadecimal digit: [0-9a-fA-F]
\p{Space}             A whitespace character: [ \t\n\x0B\f\r]

Classes for Unicode blocks and categories
\p{InGreek}           A character in the Greek block (simple block)
\p{Lu}                An uppercase letter (simple category)
\p{Sc}                A currency symbol
\P{InGreek}           Any character except one in the Greek block (negation)
[\p{L}&&[^\p{Lu}]]    Any letter except an uppercase letter (subtraction)

Boundary matchers
^                     The beginning of a line
$                     The end of a line
\b                    A word boundary
\B                    A non-word boundary
\A                    The beginning of the input
\G                    The end of the previous match
\Z                    The end of the input but for the final terminator, if any
\z                    The end of the input

Greedy quantifiers
X?                    X, once or not at all
X*                    X, zero or more times
X+                    X, one or more times
X{n}                  X, exactly n times
X{n,}                 X, at least n times
X{n,m}                X, at least n but not more than m times

Reluctant quantifiers
X??                   X, once or not at all
X*?                   X, zero or more times
X+?                   X, one or more times
X{n}?                 X, exactly n times
X{n,}?                X, at least n times
X{n,m}?               X, at least n but not more than m times

Possessive quantifiers
X?+                   X, once or not at all
X*+                   X, zero or more times
X++                   X, one or more times
X{n}+                 X, exactly n times
X{n,}+                X, at least n times
X{n,m}+               X, at least n but not more than m times

Logical operators
XY                    X followed by Y
X|Y                   Either X or Y
(X)                   X, as a capturing group

Back references
\n                    Whatever the nth capturing group matched

Quotation
\                     Nothing, but quotes the following character
\Q                    Nothing, but quotes all characters until \E
\E                    Nothing, but ends quoting started by \Q

Special constructs (non-capturing)
(?:X)                 X, as a non-capturing group
(?idmsux-idmsux)      Nothing, but turns match flags on - off
(?idmsux-idmsux:X)    X, as a non-capturing group with the given flags on - off
(?=X)                 X, via zero-width positive lookahead
(?!X)                 X, via zero-width negative lookahead
(?<=X)                X, via zero-width positive lookbehind
(?<!X)                X, via zero-width negative lookbehind
(?>X)                 X, as an independent, non-capturing group
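
The reference gives greedy, reluctant, and possessive quantifiers identical descriptions, so a small comparison may help. The sketch below (class and method names are hypothetical) runs the three forms of * over the same input: greedy takes as much as it can and then backs off, reluctant takes as little as it can, and possessive takes as much as it can and never gives anything back.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierModes {
    /** Returns the first match of regex in input, or null if there is none. */
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "\"a\" and \"b\"";

        // Greedy: .* runs to the end, then backs off to the last quote.
        System.out.println(firstMatch("\".*\"", input));  // prints "a" and "b"

        // Reluctant: .*? stops at the first closing quote it can reach.
        System.out.println(firstMatch("\".*?\"", input)); // prints "a"

        // Possessive: .*+ swallows the closing quote and never releases it,
        // so the final quote in the pattern can never match.
        System.out.println(firstMatch("\".*+\"", input)); // prints null
    }
}
```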