JDK 7 Regular Expressions: Named Capture Groups
Source: Internet  Publisher: 网络手游  Editor: 程序博客网  Time: 2024/05/22 17:29
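Regular expressions in JDK 6 and earlier do not support named capture groups; captured text can only be accessed by group index. When a pattern is complex and contains many capturing and non-capturing groups, working out a group's number by counting parentheses from left to right is tedious. Code written this way is also hard to read, and modifying the pattern can change the numbering of the groups inside it. The fix is to give capture groups names, as the regex flavors of Python, PHP, .NET, and Perl already allow.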
JDK 7 adds the following support for named capture groups:
(1) (?<NAME>X) defines a named group "NAME"
(2) \k<NAME> is a back reference to the named group "NAME"
(3) ${NAME} refers to the captured group in a Matcher replacement string
(4) group(String NAME) returns the input subsequence captured by the named group "NAME"
Here are two examples:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Up to JDK 6: groups can only be read by index
    public static void indexedCaptureTest() {
        String names = "fred or barney";
        Matcher m = Pattern.compile("(\\w+) or (\\w+)").matcher(names);
        if (m.find()) {
            System.out.println(m.group(1) + "," + m.group(2));
        }
    }

    // JDK 7: groups can be named and read by name
    public static void namedCaptureTest() {
        String names = "fred or barney";
        Matcher m = Pattern.compile("(?<name1>\\w+) or (?<name2>\\w+)").matcher(names);
        if (m.find()) {
            System.out.println(m.group("name1") + "," + m.group("name2"));
        }
    }
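The renumbering problem mentioned at the start can be made concrete with a small sketch. The class name, the patterns, and the sample inputs below are hypothetical and chosen only for illustration: a new leading group is added to a pattern, which shifts every index after it, while lookups by name keep working.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupIndexShiftDemo {
    // Original pattern: the year is group 1.
    static final Pattern V1 = Pattern.compile("(\\d{4})-(\\d{2})");
    // Revised pattern: a new leading group pushes the year to group 2.
    static final Pattern V2 = Pattern.compile("(\\w+):(\\d{4})-(\\d{2})");

    // The same two patterns with named groups.
    static final Pattern V1_NAMED = Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})");
    static final Pattern V2_NAMED = Pattern.compile("(?<tag>\\w+):(?<year>\\d{4})-(?<month>\\d{2})");

    // Returns the group at the given index, or null if the pattern does not match.
    static String byIndex(Pattern p, String input, int index) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group(index) : null;
    }

    // Returns the group with the given name, or null if the pattern does not match.
    static String byName(Pattern p, String input, String name) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group(name) : null;
    }

    public static void main(String[] args) {
        System.out.println(byIndex(V1, "2012-07", 1));                // 2012
        System.out.println(byIndex(V2, "log:2012-07", 1));            // log (index 1 no longer means "year")
        System.out.println(byName(V1_NAMED, "2012-07", "year"));      // 2012
        System.out.println(byName(V2_NAMED, "log:2012-07", "year"));  // 2012
    }
}
```

With indexes, every caller that used group(1) for the year has to be updated after the change; with names, only the pattern itself changes.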
Next, an example that uses back references and replacement strings:
String input = "aabbbccdddef";
How can this string be split into an array like [aa, bbb, cc, ddd, e, f]?

    // Indexed groups: \2 is the last character matched, $1 is the whole run
    public static void indexedCaptureReplace() {
        String input = "aabbbccdddef";
        String regex = "((.)+?)(?!\\2)";
        String temp = input.replaceAll(regex, "$1,");
        String[] arr = temp.split(",");
        System.out.println(java.util.Arrays.toString(arr));
    }

    // Named groups: the same logic with \k<name1> and ${name2}
    public static void namedCaptureReplace() {
        String input = "aabbbccdddef";
        String regex = "(?<name2>(?<name1>.)+?)(?!\\k<name1>)"; // an ugly implementation!
        String temp = input.replaceAll(regex, "${name2},");
        String[] arr = temp.split(",");
        System.out.println(java.util.Arrays.toString(arr));
    }
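As an alternative to the replace-then-split approach, the runs can be collected directly with a find loop. This is only a sketch, not the original author's method: the class name RunSplitter and the group name ch are made up for illustration. The pattern matches one character and then zero or more repetitions of it through a named back reference. Note that "." does not match line terminators by default, so those characters are skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RunSplitter {
    // One character, then as many copies of that same character as possible.
    private static final Pattern RUN = Pattern.compile("(?<ch>.)\\k<ch>*");

    // Collects each maximal run of identical characters, in order.
    static List<String> splitRuns(String input) {
        List<String> runs = new ArrayList<String>();
        Matcher m = RUN.matcher(input);
        while (m.find()) {
            runs.add(m.group()); // the whole run, not just the first character
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(splitRuns("aabbbccdddef")); // [aa, bbb, cc, ddd, e, f]
    }
}
```

Because no separator is inserted into the text, this version also behaves sensibly when the input itself contains commas.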
Reference: http://www.iteye.com/news/6195. Note, however, that the actual JDK implementation differs from what is described there, mainly in the ${} syntax.
From the documentation of the Pattern class:
Back references
\n Whatever the nth capturing group matched
\k<name> Whatever the named-capturing group "name" matched
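A common use of \k<name> is spotting a word that is immediately repeated. The following is a minimal sketch; the class name, the group name word, and the sample sentences are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedWordFinder {
    // A whole word, some whitespace, then the same word again.
    private static final Pattern REPEAT =
            Pattern.compile("\\b(?<word>\\w+)\\s+\\k<word>\\b");

    // Returns the first immediately repeated word, or null if there is none.
    static String firstRepeatedWord(String text) {
        Matcher m = REPEAT.matcher(text);
        return m.find() ? m.group("word") : null;
    }

    public static void main(String[] args) {
        System.out.println(firstRepeatedWord("this is is a test")); // is
        System.out.println(firstRepeatedWord("no repeats here"));   // null
    }
}
```

The word boundaries on both sides keep the pattern from matching the "is" inside "this" followed by the standalone "is".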
From the documentation of the Matcher class, on replacement strings:
The replacement string may contain references to subsequences captured during the previous match:
Each occurrence of ${name} or $g will be replaced by the result of evaluating the corresponding group(name) or group(g) respectively.
For $g, the first number after the $ is always treated as part of the group reference. Subsequent numbers are incorporated into g if they would form a legal group reference. Only the numerals '0' through '9' are considered as potential components of the group reference.
If the second group matched the string "foo", for example, then passing the replacement string "$2bar" would cause "foobar" to be appended to the string buffer. A dollar sign ($) may be included as a literal in the replacement string by preceding it with a backslash (\$).
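The replacement-string rules quoted above can be tried out with String.replaceAll, which follows the same syntax. The snippet below is a small sketch; the class name, method names, group names, and inputs are all hypothetical.

```java
class ReplacementSyntaxDemo {
    // "$2bar": group 2 followed by the literal text "bar".
    static String numbered() {
        return "x=foo".replaceAll("(\\w)=(\\w+)", "$2bar");
    }

    // "${value}=${key}": named references, here used to swap the two sides.
    static String named() {
        return "x=foo".replaceAll("(?<key>\\w)=(?<value>\\w+)", "${value}=${key}");
    }

    // "\$" in the replacement is a literal dollar sign (written "\\$" in Java source).
    static String literalDollar() {
        return "price: 5".replaceAll("(?<amount>\\d+)", "\\$${amount}");
    }

    // Only one group exists, so "$10" is read as group 1 followed by a literal "0".
    static String digitParsing() {
        return "a".replaceAll("(a)", "$10");
    }

    public static void main(String[] args) {
        System.out.println(numbered());      // foobar
        System.out.println(named());         // foo=x
        System.out.println(literalDollar()); // price: $5
        System.out.println(digitParsing());  // a0
    }
}
```

When the replacement text comes from user input and should be taken literally, Matcher.quoteReplacement(String) escapes the dollar signs and backslashes in it.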