Using Regular Expression Capture Groups in Java

Source: Internet  Editor: 程序博客网  Time: 2024/06/05 15:39

Types of capture groups

  1. Plain capture group: (Expression)
  2. Named capture group: (?<name>Expression)

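Plain capture groups

Scanning the regular expression from left to right, each opening parenthesis "(" that starts a capture group marks one group. Group numbers start at 1, and 0 refers to the whole expression.

For the date string 2017-04-25, the expression is: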

(\\d{4})-((\\d{2})-(\\d{2}))

There are 4 opening parentheses, so there are 4 groups:

Number  Capture group               Match
0       (\d{4})-((\d{2})-(\d{2}))   2017-04-25
1       (\d{4})                     2017
2       ((\d{2})-(\d{2}))           04-25
3       (\d{2})                     04
4       (\d{2})                     25


import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlainGroupDemo {
    public static final String DATE_STRING = "2017-04-25";
    public static final String P_COMM = "(\\d{4})-((\\d{2})-(\\d{2}))";

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile(P_COMM);
        Matcher matcher = pattern.matcher(DATE_STRING);
        // This call is required: group() throws IllegalStateException until a match has been attempted.
        matcher.find();
        System.out.printf("\nmatcher.group(0) value:%s", matcher.group(0));
        System.out.printf("\nmatcher.group(1) value:%s", matcher.group(1));
        System.out.printf("\nmatcher.group(2) value:%s", matcher.group(2));
        System.out.printf("\nmatcher.group(3) value:%s", matcher.group(3));
        System.out.printf("\nmatcher.group(4) value:%s", matcher.group(4));
    }
}
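The numbering rule can also be checked programmatically rather than by counting parentheses by hand. The following is a minimal sketch, not part of the original example: the class name `GroupLister` and the helper `listGroups` are hypothetical, while `Matcher.groupCount()` and `Matcher.group(int)` are standard `java.util.regex` methods. Note that `groupCount()` does not include group 0, so the loop runs up to and including that value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupLister {
    // Hypothetical helper: returns groups 0..groupCount() of the first match,
    // or an empty list when the input contains no match.
    public static List<String> listGroups(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        List<String> groups = new ArrayList<>();
        if (!matcher.find()) {
            return groups;
        }
        // groupCount() excludes group 0, so the upper bound is inclusive.
        for (int i = 0; i <= matcher.groupCount(); i++) {
            groups.add(matcher.group(i));
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> groups = listGroups("(\\d{4})-((\\d{2})-(\\d{2}))", "2017-04-25");
        for (int i = 0; i < groups.size(); i++) {
            System.out.printf("group(%d) = %s%n", i, groups.get(i));
        }
    }
}
```

For the date pattern above, the list should contain five entries: the whole match followed by the four numbered groups.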

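Named capture groups

In a named capture group, the opening parenthesis is immediately followed by "?<name>", and only then by the regular expression.

For the date string 2017-04-25, the expression is: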

(?<year>\\d{4})-(?<md>(?<month>\\d{2})-(?<date>\\d{2}))

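There are 4 named capture groups:

Number  Name    Capture group                                          Match
0       (none)  (?<year>\d{4})-(?<md>(?<month>\d{2})-(?<date>\d{2}))   2017-04-25
1       year    (?<year>\d{4})                                         2017
2       md      (?<md>(?<month>\d{2})-(?<date>\d{2}))                  04-25
3       month   (?<month>\d{2})                                        04
4       date    (?<date>\d{2})                                         25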


The values of named capture groups can also be retrieved by number.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedGroupDemo {
    public static final String P_NAMED = "(?<year>\\d{4})-(?<md>(?<month>\\d{2})-(?<date>\\d{2}))";
    public static final String DATE_STRING = "2017-04-25";

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile(P_NAMED);
        Matcher matcher = pattern.matcher(DATE_STRING);
        matcher.find();
        System.out.printf("\n===========Access by name=============");
        System.out.printf("\nmatcher.group(0) value:%s", matcher.group(0));
        System.out.printf("\nmatcher.group('year') value:%s", matcher.group("year"));
        System.out.printf("\nmatcher.group('md') value:%s", matcher.group("md"));
        System.out.printf("\nmatcher.group('month') value:%s", matcher.group("month"));
        System.out.printf("\nmatcher.group('date') value:%s", matcher.group("date"));
        // reset() discards the match state, so find() has to be called again.
        matcher.reset();
        System.out.printf("\n===========Access by number=============");
        matcher.find();
        System.out.printf("\nmatcher.group(0) value:%s", matcher.group(0));
        System.out.printf("\nmatcher.group(1) value:%s", matcher.group(1));
        System.out.printf("\nmatcher.group(2) value:%s", matcher.group(2));
        System.out.printf("\nmatcher.group(3) value:%s", matcher.group(3));
        System.out.printf("\nmatcher.group(4) value:%s", matcher.group(4));
    }
}
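One place where names pay off is when the captured text is converted into values, because the code no longer depends on the position of each group. The following is a rough sketch under a few assumptions: the class `NamedDateParser` and its `parse` method are hypothetical, and it uses `matches()` instead of `find()` so that the whole input must be a date, which is a design choice of this sketch rather than something the example above requires.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedDateParser {
    private static final Pattern P_NAMED =
            Pattern.compile("(?<year>\\d{4})-(?<md>(?<month>\\d{2})-(?<date>\\d{2}))");

    // Hypothetical helper: returns {year, month, day} as ints,
    // or null when the whole input does not match the pattern.
    public static int[] parse(String input) {
        Matcher matcher = P_NAMED.matcher(input);
        if (!matcher.matches()) {
            return null;
        }
        return new int[] {
            Integer.parseInt(matcher.group("year")),
            Integer.parseInt(matcher.group("month")),
            Integer.parseInt(matcher.group("date"))
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("2017-04-25")));
    }
}
```

If another group were later added to the front of the pattern, the numbered indexes would shift, while the lookups by name in this sketch would stay valid.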

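PS: Non-capturing groups

Placing "?:" immediately after the opening parenthesis, followed by the regular expression, forms a non-capturing group: (?:Expression).

For the date string 2017-04-25, the expression is: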

(?:\\d{4})-((\\d{2})-(\\d{2}))

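This regular expression has four opening parentheses, so at first glance it has 4 capture groups. However, the first group, (?:\d{4}), is not captured and receives no number, which leaves only 3 capture groups. Calling matcher.group(4) therefore throws an exception.

Number  Capture group                 Match
0       (?:\d{4})-((\d{2})-(\d{2}))   2017-04-25
1       ((\d{2})-(\d{2}))             04-25
2       (\d{2})                       04
3       (\d{2})                       25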


import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NonCapturingGroupDemo {
    public static final String P_UNCAP = "(?:\\d{4})-((\\d{2})-(\\d{2}))";
    public static final String DATE_STRING = "2017-04-25";

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile(P_UNCAP);
        Matcher matcher = pattern.matcher(DATE_STRING);
        matcher.find();
        System.out.printf("\nmatcher.group(0) value:%s", matcher.group(0));
        System.out.printf("\nmatcher.group(1) value:%s", matcher.group(1));
        System.out.printf("\nmatcher.group(2) value:%s", matcher.group(2));
        System.out.printf("\nmatcher.group(3) value:%s", matcher.group(3));
        // The next call fails with:
        // Exception in thread "main" java.lang.IndexOutOfBoundsException: No group 4
        System.out.printf("\nmatcher.group(4) value:%s", matcher.group(4));
    }
}
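The IndexOutOfBoundsException can be avoided by comparing the requested index with `groupCount()` first. The following is a small sketch, assuming two hypothetical helpers, `countGroups` and `groupOrNull`, in a hypothetical class `SafeGroupAccess`. It also shows that replacing the first "(" with "(?:" lowers the group count from 4 to 3.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SafeGroupAccess {
    // Hypothetical helper: number of capture groups in a pattern (group 0 not counted).
    public static int countGroups(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // Hypothetical helper: returns group n of the current match, or null when the
    // pattern has no group n. The matcher must already hold a successful match.
    public static String groupOrNull(Matcher matcher, int n) {
        if (n < 0 || n > matcher.groupCount()) {
            return null;
        }
        return matcher.group(n);
    }

    public static void main(String[] args) {
        System.out.println(countGroups("(\\d{4})-((\\d{2})-(\\d{2}))"));
        System.out.println(countGroups("(?:\\d{4})-((\\d{2})-(\\d{2}))"));

        Matcher matcher = Pattern.compile("(?:\\d{4})-((\\d{2})-(\\d{2}))").matcher("2017-04-25");
        if (matcher.find()) {
            System.out.println(groupOrNull(matcher, 3));
            System.out.println(groupOrNull(matcher, 4));
        }
    }
}
```

Returning null for a missing group is only one possible convention; code that treats a missing group as a programming error may prefer to let the exception propagate.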

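Summary

  1. Plain capture groups are convenient to use;
  2. Named capture groups make the code clearer;
  3. Non-capturing groups have not yet found a use in my projects.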