Using Regular Expressions in Java

Source: Internet  Editor: 程序博客网  Time: 2024/06/05 19:06

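Regular expressions are a practical tool for matching, searching, and replacing text, and you are likely to run into them in many settings.

Languages such as C++, C#, PHP, Perl, and JavaScript all provide regex APIs or built-in classes; this article covers how to use regular expressions in Java.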

Regular expressions: examples from the API documentation

Pattern class

A compiled representation of a regular expression.

Pattern p = Pattern.compile("a*b");
Matcher m = p.matcher("aaaaab");
boolean b = m.matches();

When the expression is used only once

boolean b = Pattern.matches("a*b", "aaaaab");
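
To make the difference between the two forms concrete, here is a minimal sketch. The class name PatternReuseDemo and the helper countFullMatches are made up for illustration; only Pattern.compile, Pattern.matcher, Matcher.matches and Pattern.matches come from the JDK. It assumes the same pattern is applied to many inputs, which is the case where compiling once pays off.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class PatternReuseDemo {
    // Compiled once and shared; Pattern objects are immutable and safe to reuse.
    private static final Pattern A_STAR_B = Pattern.compile("a*b");

    /** Counts the inputs that match a*b in full (hypothetical helper). */
    static int countFullMatches(List<String> inputs) {
        int count = 0;
        for (String input : inputs) {
            // A new Matcher is created per input; the Pattern is not recompiled.
            if (A_STAR_B.matcher(input).matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> inputs = Arrays.asList("aaaaab", "b", "ab", "abc", "ba");
        System.out.println(countFullMatches(inputs));          // prints 3
        // One-off check: the pattern is compiled internally on every call.
        System.out.println(Pattern.matches("a*b", "aaaaab"));  // prints true
    }
}
```

Pattern.matches is convenient for a single check, while a stored Pattern avoids repeating the compilation step.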

Matcher class

An engine that performs match operations on a character sequence by interpreting a Pattern.
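
A Matcher offers three ways to apply a pattern: matches() tests the whole input, lookingAt() tests only the beginning, and find() scans for the next occurrence anywhere. The sketch below compares them on one string; the class and method names are illustrative assumptions, not part of the article.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MatcherMethodsDemo {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    /** True only if the entire input consists of digits. */
    static boolean matchesWhole(String s) {
        return DIGITS.matcher(s).matches();
    }

    /** True if the input starts with digits. */
    static boolean matchesPrefix(String s) {
        return DIGITS.matcher(s).lookingAt();
    }

    /** Counts the runs of digits found anywhere in the input. */
    static int countRuns(String s) {
        Matcher m = DIGITS.matcher(s);
        int runs = 0;
        while (m.find()) {   // each call resumes after the previous match
            runs++;
        }
        return runs;
    }

    public static void main(String[] args) {
        String s = "12 apples and 7 pears";
        System.out.println(matchesWhole(s));   // prints false
        System.out.println(matchesPrefix(s));  // prints true
        System.out.println(countRuns(s));      // prints 2
    }
}
```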

The basic regex syntax is skipped here...

Alternation

| For example, to match either country or countries, use country|countries
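
As a quick illustration of alternation, the following sketch collects every occurrence of either word in a sentence. AlternationDemo and findAll are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AlternationDemo {
    private static final Pattern COUNTRY = Pattern.compile("country|countries");

    /** Returns every substring matching either alternative, in order. */
    static List<String> findAll(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = COUNTRY.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // "county" matches neither alternative, so it is skipped.
        System.out.println(findAll("one country, two countries, no county"));
        // prints [country, countries]
    }
}
```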

Repetition

? + * {n} {n,m} {n,}
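
The quantifiers mean: ? zero or one, + one or more, * zero or more, {n} exactly n, {n,m} between n and m, {n,} at least n. A small sketch checks each one against a full-string match; the helper name full is an assumption made for brevity.

```java
import java.util.regex.Pattern;

final class QuantifierDemo {
    /** True if the whole input matches the regex (thin wrapper for readability). */
    static boolean full(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(full("colou?r", "color"));     // ?     prints true
        System.out.println(full("colou?r", "colour"));    // ?     prints true
        System.out.println(full("a+", ""));               // +     prints false
        System.out.println(full("a*", ""));               // *     prints true
        System.out.println(full("\\d{3}", "12"));         // {n}   prints false
        System.out.println(full("\\d{2,4}", "1234"));     // {n,m} prints true
        System.out.println(full("\\d{2,4}", "12345"));    // {n,m} prints false
        System.out.println(full("\\d{2,}", "12345"));     // {n,}  prints true
    }
}
```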

Non-greedy repetition

?? +? *? {n,m}? {n,}?
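
Appending ? to a quantifier makes it take as little text as possible instead of as much as possible. The sketch below shows the effect on the first match found; GreedyVsReluctantDemo and firstMatch are illustrative names, and the sample markup string is invented for the demonstration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GreedyVsReluctantDemo {
    /** Returns the first substring matching the regex, or null if there is none. */
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String text = "<b>one</b><b>two</b>";
        // Greedy: .* runs to the last </b> it can reach.
        System.out.println(firstMatch("<b>.*</b>", text));   // prints <b>one</b><b>two</b>
        // Non-greedy: .*? stops at the first </b>.
        System.out.println(firstMatch("<b>.*?</b>", text));  // prints <b>one</b>
    }
}
```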

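△ Zero-width assertions

Zero-width assertions are non-capturing: they test a condition at the current position without consuming any characters, so the text they inspect is neither part of the match nor stored.

(?=) zero-width positive lookahead: for example, windows(?=95|98|NT|2000) matches the windows in windows2000 but not the windows in windows3.1
(?!) zero-width negative lookahead: for example, windows(?!95|98|NT|2000) matches the windows in windows3.1 but not the windows in windows2000
(?<=) zero-width positive lookbehind: for example, (?<=95|98|NT|2000)windows matches the windows in 2000windows but not the windows in 3.1windows
(?<!) zero-width negative lookbehind: for example, (?<!95|98|NT|2000)windows matches the windows in 3.1windows but not the windows in 2000windows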
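
The four assertions can be tried out with a short sketch. A lookahead is written after the text it constrains and a lookbehind before it, so the lookbehind cases here use inputs where the version string precedes windows. LookaroundDemo and firstMatch are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LookaroundDemo {
    /** Returns the first substring matching the regex, or null if there is none. */
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Positive lookahead: the version is checked but not included in the match.
        System.out.println(firstMatch("windows(?=95|98|NT|2000)", "windows2000"));   // prints windows
        System.out.println(firstMatch("windows(?=95|98|NT|2000)", "windows3.1"));    // prints null
        // Negative lookahead.
        System.out.println(firstMatch("windows(?!95|98|NT|2000)", "windows3.1"));    // prints windows
        System.out.println(firstMatch("windows(?!95|98|NT|2000)", "windows2000"));   // prints null
        // Positive lookbehind: inspects the text just before the current position.
        System.out.println(firstMatch("(?<=95|98|NT|2000)windows", "2000windows"));  // prints windows
        System.out.println(firstMatch("(?<=95|98|NT|2000)windows", "3.1windows"));   // prints null
        // Negative lookbehind.
        System.out.println(firstMatch("(?<!95|98|NT|2000)windows", "3.1windows"));   // prints windows
        System.out.println(firstMatch("(?<!95|98|NT|2000)windows", "2000windows"));  // prints null
    }
}
```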

△ (pattern) Capturing groups

Wrap a pattern in a pair of parentheses to turn it into a group.

(As you can see, a group is also a subexpression.)

 String regex="\\{([\\d\\.-]+),([\\d\\.-]+)\\}";

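This expression matches coordinate text such as {3,5.8}. It uses two pairs of parentheses, which gives the expression two groups (group 1 captures the x coordinate, group 2 the y coordinate).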

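    // Needs java.util.regex.Pattern, java.util.regex.Matcher and org.junit.Test
    @Test
    public void test5() {
        String regex = "\\{([\\d\\.-]+),([\\d\\.-]+)\\}";
        String s = "{3,10} ,sdf {4,5}";
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(s);
        // Number of capturing groups in the pattern (group 0 is not counted)
        System.out.println("Group count: " + m.groupCount());
        // Each find() call moves on to the next coordinate pair in s
        while (m.find()) {
            System.out.println("Group 1: " + m.group(1));
            System.out.println("Group 2: " + m.group(2));
            System.out.println("==========");
        }
    }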

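Note that in the Java code it is m.group(1) that returns the text matched by group 1; m.group(0) returns the entire match instead.

1 and 2 are group numbers. Every group gets a number by default: the first pair of parentheses is group 1, the second pair is group 2, and so on.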
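
To see group 0 next to the numbered groups, here is a minimal sketch using the coordinate regex from above. GroupNumberDemo and its groups method are names invented for this example.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupNumberDemo {
    private static final Pattern POINT =
            Pattern.compile("\\{([\\d\\.-]+),([\\d\\.-]+)\\}");

    /** Returns {whole match, group 1, group 2}, or null if the input is not a point. */
    static String[] groups(String input) {
        Matcher m = POINT.matcher(input);
        if (!m.matches()) {
            return null;
        }
        // group(0) is the entire match; group(1) and group(2) are the two captures.
        return new String[] { m.group(0), m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(groups("{3,5.8}")));  // prints [{3,5.8}, 3, 5.8]
    }
}
```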

The regex above defines two groups but gives them no names. Regex syntax lets you name groups as well as define them.

  String regex="\\{(?<Word1>[\\d\\.-]+),(?<Word2>[\\d\\.-]+)\\}";

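With this change the two groups have names. The syntax for naming a group is (?<Word>pattern), where Word is a user-chosen group name and pattern is the sub-pattern to match. Named groups require Java 7 or later, and a name must start with a letter and contain only letters and digits.

The two groups above are named Word1 and Word2, so the Java code can then fetch the text a group matched by name as well as by number.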

    @Test
    public void test5() {
        String regex = "\\{(?<Word1>[\\d\\.-]+),(?<Word2>[\\d\\.-]+)\\}";
        String s = "{3,10} ,sdf {4,5}";
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(s);
        // Named groups still count as numbered groups, so this is 2
        System.out.println("Group count: " + m.groupCount());
        while (m.find()) {
            System.out.println("Group 1: " + m.group(1));
            System.out.println("Group 2: " + m.group(2));
            // The same captures, looked up by name
            System.out.println("Group named Word1: " + m.group("Word1"));
            System.out.println("Group named Word2: " + m.group("Word2"));
            System.out.println("==========");
        }
    }
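
Group names are also usable on the replacement side: in a replacement string, ${name} inserts the text captured by that named group. The sketch below swaps the two coordinates. NamedGroupReplaceDemo and swapCoordinates are hypothetical names, and swapping is just an example task, not something the article prescribes.

```java
import java.util.regex.Pattern;

final class NamedGroupReplaceDemo {
    private static final Pattern POINT =
            Pattern.compile("\\{(?<Word1>[\\d\\.-]+),(?<Word2>[\\d\\.-]+)\\}");

    /** Rewrites every {x,y} in the text as {y,x}. */
    static String swapCoordinates(String text) {
        // ${Word2} and ${Word1} refer to the named groups; the braces around them are literal.
        return POINT.matcher(text).replaceAll("{${Word2},${Word1}}");
    }

    public static void main(String[] args) {
        System.out.println(swapCoordinates("{3,10} ,sdf {4,5}"));  // prints {10,3} ,sdf {5,4}
    }
}
```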

△ Backreferences to groups

  String regex="\\b(?<Word>\\w+)\\b\\s+\\k<Word>\\b";

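\k<Word> refers back to the text matched by the group with that name; the group numbers \1 \2 \3 ... \n do the same by number.

This can be used to match repeated words such as go go or run run.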

    @Test
    public void test6() {
        // \\1 must repeat exactly what group 1 captured
        String regex = "\\b(\\w+)\\b\\s+\\1\\b";
        String str = "go go";
        System.out.println("Match? " + str.matches(regex)); // true
    }
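
String.matches tests the whole string, so to locate repeated words inside a longer sentence the same regex can be driven with find(). This is a sketch under that assumption; RepeatedWordDemo and repeatedWords are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RepeatedWordDemo {
    private static final Pattern REPEATED =
            Pattern.compile("\\b(\\w+)\\b\\s+\\1\\b");

    /** Returns each word that is immediately followed by itself. */
    static List<String> repeatedWords(String text) {
        List<String> words = new ArrayList<>();
        Matcher m = REPEATED.matcher(text);
        while (m.find()) {
            words.add(m.group(1));  // the word captured by group 1
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(repeatedWords("it is is a go go test"));  // prints [is, go]
        // The \b anchors keep "this is" from counting as a repeated "is".
        System.out.println(repeatedWords("this is fine"));           // prints []
    }
}
```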

......

0 0