10 Java Regular Expressions


- String matching (match)

- String searching (find)

- String replacement (replace)


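Related classes:

java.lang.String

- matches() tests whether the entire string matches the pattern, not just a substring

- replaceAll() replaces every substring that matches the regular expression

- split() splits the string, using the regular expression to match the delimiters

java.util.regex.Pattern: the compiled pattern to match against

java.util.regex.Matcher: applies a pattern to an input and holds the result of the match

- matches() tests whether the entire input matches the pattern

- find() searches for the next substring that matches the pattern; each later call resumes at the first character after the previous match

- lookingAt() always starts at the beginning of the input; unlike matches(), it only has to match a prefix, not the whole input

java.util.regex.PatternSyntaxException: thrown when a pattern contains a syntax error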
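The difference between matches(), lookingAt() and find() is easiest to see side by side. The following is a minimal sketch; the class name MatcherModes and the helper findAll are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherModes {
    // Hypothetical helper: collects every substring that find() reports, in order.
    static List<String> findAll(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> hits = new ArrayList<>();
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        Pattern digits = Pattern.compile("\\d+");
        // matches(): the whole input has to match.
        System.out.println(digits.matcher("123abc").matches());   // false
        // lookingAt(): a matching prefix is enough.
        System.out.println(digits.matcher("123abc").lookingAt()); // true
        // find(): scans for matches anywhere in the input.
        System.out.println(findAll("\\d+", "a1b22c333"));         // [1, 22, 333]
        // String.split() treats its argument as a regular expression.
        System.out.println(String.join("|", "a1b22c".split("\\d+"))); // a|b|c
    }
}
```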


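MetaCharacters:

- . any single character except a line terminator

- + one or more

- ? zero or one

- * zero or more

- {n} exactly n times, {n,} at least n times, {n,m} between n and m times

- \d a single digit, equivalent to [0-9]

- [] matches one character from the set or range inside the brackets; inside the brackets a leading ^ negates the set and && intersects two sets, while | is an ordinary literal character (alternation with | only works outside the brackets)

[a-zA-Z] matches the same characters as [a-z]|[A-Z]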
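A few quick checks of the quantifiers and character-class operators listed above. This is only a sketch; the class name CharClassDemo and the helper full are invented for the example.

```java
class CharClassDemo {
    // True when the whole input matches the regex (String.matches semantics).
    static boolean full(String regex, String input) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        System.out.println(full("ab{2,3}c", "abbc"));           // true: two b's
        System.out.println(full("ab{2,3}c", "abc"));            // false: only one b
        System.out.println(full("colou?r", "color"));           // true: the u is optional
        System.out.println(full("[^0-9]+", "abc"));             // true: negated class
        System.out.println(full("[a-z&&[^aeiou]]+", "rhythm")); // true: lower-case consonants only
        System.out.println(full("[a-z&&[^aeiou]]+", "regex"));  // false: contains vowels
        System.out.println(full("[a|b]", "|"));                 // true: | is literal inside []
        System.out.println(full("[a-z]|[A-Z]", "Q"));           // true: alternation outside []
    }
}
```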


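Predefined character classes:

\d = [0-9], a digit

\D = [^0-9], a non-digit

\s = [ \t\n\x0B\f\r], a whitespace character

\S = [^\s], a non-whitespace character

\w = [0-9a-zA-Z_], a word character

\W = [^\w], a non-word character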
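The predefined classes are handy in replaceAll() and matches(). A small sketch follows; the class PredefinedDemo and its three helper methods are hypothetical names, not part of any library.

```java
class PredefinedDemo {
    // Removes every character that is not a digit (\D).
    static String digitsOnly(String s) {
        return s.replaceAll("\\D", "");
    }

    // Trims the ends and collapses each run of whitespace (\s+) into one space.
    static String collapseWhitespace(String s) {
        return s.trim().replaceAll("\\s+", " ");
    }

    // True when the string consists only of word characters (\w).
    static boolean isWord(String s) {
        return s.matches("\\w+");
    }

    public static void main(String[] args) {
        System.out.println(digitsOnly("+86 (10) 1234-5678"));
        System.out.println(collapseWhitespace("  a \t b\n c  "));
        System.out.println(isWord("snake_case_1"));
        System.out.println(isWord("not ok"));
    }
}
```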


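Boundaries:

^ start of a line (by default the start of the input; the start of every line when Pattern.MULTILINE is set)

$ end of a line (by default the end of the input, allowing for a final line terminator; the end of every line when Pattern.MULTILINE is set)

\b a word boundary: a position where the character on one side is \w and the character on the other side is not \w (or is missing)

\B a non-word boundary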
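To see the boundary matchers at work, the sketch below counts matches in a two-line sample string. The class BoundaryDemo, the helper count and the sample text are assumptions made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundaryDemo {
    // Hypothetical helper: counts how many times find() succeeds.
    static int count(Pattern p, String input) {
        Matcher m = p.matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String text = "cat concat cat's\ncatalog";
        // Whole word "cat": matches "cat" and the "cat" in "cat's".
        System.out.println(count(Pattern.compile("\\bcat\\b"), text));
        // "cat" that does not start a word: only the one inside "concat".
        System.out.println(count(Pattern.compile("\\Bcat"), text));
        // Without MULTILINE, ^ matches only at the start of the input.
        System.out.println(count(Pattern.compile("^cat"), text));
        // With MULTILINE, ^ also matches after the line break.
        System.out.println(count(Pattern.compile("^cat", Pattern.MULTILINE), text));
    }
}
```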


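Groups:

Matching an IP address - ((2[0-4]\d|25[0-5]|[01]?\d\d?)\.){3}(2[0-4]\d|25[0-5]|[01]?\d\d?)

Every capturing group automatically gets a group number. Group 0 is the whole expression; the other groups are numbered from 1 by counting their opening parentheses from left to right. In Java, named groups are numbered in this same pass together with the unnamed ones (some other regex engines number the unnamed groups first and the named groups in a second pass). The (?:exp) syntax excludes a group from the numbering.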
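The numbering rules can be checked directly. In this sketch the IP pattern is the one from the text; the date pattern, the class GroupDemo and the method isIpv4 are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // The IP address pattern from the text, written as a Java string literal.
    static final Pattern IPV4 = Pattern.compile(
        "((2[0-4]\\d|25[0-5]|[01]?\\d\\d?)\\.){3}(2[0-4]\\d|25[0-5]|[01]?\\d\\d?)");

    // Hypothetical date pattern: two named groups, one unnamed group,
    // and two non-capturing (?:...) groups for the separators.
    static final Pattern DATE = Pattern.compile(
        "(?<year>\\d{4})(?:-|/)(\\d{2})(?:-|/)(?<day>\\d{2})");

    static boolean isIpv4(String s) {
        return IPV4.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isIpv4("192.168.0.224")); // true
        System.out.println(isIpv4("256.1.1.1"));     // false: 256 is out of range

        Matcher m = DATE.matcher("2024-06-15");
        if (m.matches()) {
            System.out.println(m.groupCount());      // 3: (?:...) groups are not counted
            System.out.println(m.group(0));          // the whole match
            System.out.println(m.group(1) + " " + m.group("year")); // same group
            System.out.println(m.group(2));          // the unnamed month group
            System.out.println(m.group(3) + " " + m.group("day"));  // same group
        }
    }
}
```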

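- (?=exp)

Positive lookahead: asserts that exp matches immediately after this position.

- (?<=exp)

Positive lookbehind: asserts that exp matches immediately before this position.

- (?!exp)

Negative lookahead: asserts that exp does not match after this position. It matches only a position and does not consume any characters.

- (?<!exp)

Negative lookbehind: asserts that exp does not match before this position.

Example:

Matching the content of a tag - (?<=<(\w+)>).*(?=<\/\1>)

Note: Java expects a lookbehind to have a bounded maximum length, and some Java versions reject the unbounded \w+ inside (?<=...) with a PatternSyntaxException. Bounding the quantifier, for example \w{1,20}, is the usual workaround.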
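The four assertions can be tried out as follows. This is a sketch, not a reference implementation: the class LookaroundDemo and its helpers are invented, and tagContent deliberately swaps the lookbehind for a plain capturing group, which avoids any lookbehind length restriction.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundDemo {
    // Hypothetical helper: returns the first match, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    // Hypothetical helper: start index of the first match, or -1.
    static int firstStart(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.start() : -1;
    }

    // Tag content via a capturing group and back-reference instead of lookbehind.
    static String tagContent(String html) {
        Matcher m = Pattern.compile("<(\\w+)>(.*?)</\\1>").matcher(html);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String prices = "cost: $42 or 17 EUR";
        System.out.println(firstMatch("\\w+(?=ing)", "singing"));  // lookahead
        System.out.println(firstMatch("(?<=\\$)\\d+", prices));    // lookbehind
        System.out.println(firstStart("foo(?!bar)", "foobar foobaz")); // negative lookahead
        System.out.println(firstMatch("(?<!\\$)\\b\\d+", prices)); // negative lookbehind
        System.out.println(tagContent("<b>bold</b>"));
    }
}
```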


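Greedy and lazy matching:

aabab

Against the string above, a.*b matches greedily and consumes the whole string, whereas a lazy quantifier matches as few characters as possible: a.*?b returns aab.

- greedy (the default): takes as much as it can, then gives characters back (backtracks) if the rest of the pattern fails to match

- reluctant (append ?): lazy matching, takes as little as it can

- possessive (append +): takes as much as it can and never gives characters back, even if the rest of the pattern then fails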
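Running the three quantifier styles against aabab shows the difference. A minimal sketch; the class QuantifierDemo and the helper firstMatch are names chosen for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Hypothetical helper: returns the first match, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String s = "aabab";
        // Greedy: .* grabs everything, then backs off to the last b.
        System.out.println(firstMatch("a.*b", s));   // aabab
        // Reluctant: .*? grows one character at a time until a b follows.
        System.out.println(firstMatch("a.*?b", s));  // aab
        // Possessive: .*+ grabs everything and never gives the final b back.
        System.out.println(firstMatch("a.*+b", s));  // null (no match)
    }
}
```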


Practices:

1. Basic review

package TestExpression;

import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TestRegex {
    public static void main(String[] args) {
        // An unescaped $ is the end-of-input anchor, so this appends "~" instead of replacing the dollar signs
        System.out.println("lalala$$lala".replaceAll("$", "~"));
        System.out.println("lalala$$lala".replaceAll("\\$", "~"));
        System.out.println(Arrays.asList("192.168.1.1".split("\\.")));
        // Printing the array directly shows its type and hash code, not its elements
        System.out.println("192.168.1.1".split("\\."));
        System.out.println("curry2".matches(".+\\d")); // the double backslash escapes \d inside a Java string literal

        Pattern p = Pattern.compile("[a-z]{3}");
        Matcher m = p.matcher("ab3");
        System.out.println(m.matches()); // returns a boolean telling whether the whole input matches
        "ab3".matches("[a-z]{3}"); // shorthand for the three lines above (result unused here)

        System.out.println("192.168.0.224".matches("(\\d{1,3}\\.){3}\\d{1,3}"));
        System.out.println(" \n\r\t".matches("\\s{4}"));
        System.out.println("|".matches("\\|"));
        System.out.println("&&".matches("&\\&"));
        // false: outside a character class, && is just two literal ampersands;
        // it means intersection only inside [], as in [a-z&&[^b]]
        System.out.println("b".matches("a&&b"));
        System.out.println("\\".matches("\\\\"));

        Pattern pattern = Pattern.compile("\\d{3,5}");
        Matcher matcher = pattern.matcher("127-123-1981-092-98461-00");
        //System.out.println(matcher.matches());
        //matcher.reset();
        if (matcher.lookingAt())
            System.out.println(matcher.start() + "," + matcher.end());
        if (matcher.find()) // resumes after the previous match and looks for the next matching substring
            System.out.println(matcher.start() + "," + matcher.end());
        if (matcher.find())
            System.out.println(matcher.start() + "," + matcher.end());
        // Calling start()/end() without a successful match throws
        // java.lang.IllegalStateException: No match available
        /*System.out.println(matcher.find());
        System.out.println(matcher.start()+","+matcher.end());
        */
        matcher.reset();
        System.out.println(matcher.replaceAll("*")); // String.replaceAll delegates to Matcher.replaceAll
        /*public String replaceAll(String replacement) {
            reset();
            boolean result = find();
            if (result) {
                StringBuffer sb = new StringBuffer();
                do {
                    appendReplacement(sb, replacement);
                    result = find();
                } while (result);
                appendTail(sb);
                return sb.toString();
            }
            return text.toString();
        }*/
        /*public String replaceString(String source, String regex, String replacement, int flags){
             Pattern pattern = Pattern.compile(regex, flags);
             Matcher matcher = pattern.matcher(source);
             StringBuffer buffer = new StringBuffer();
             while(matcher.find()){
                 matcher.appendReplacement(buffer, replacement);
                 // other operations
             }
             matcher.appendTail(buffer);
             return buffer.toString();
        }*/

        // Back-references: \1 by number, \k<P1> by name
        System.out.println("hello hello".matches("(\\b\\w+\\b).\\1"));
        System.out.println("hello world".matches("(\\b\\w+\\b).\\1"));
        System.out.println("hello hello".matches("(?<P1>\\b\\w+\\b).\\k<P1>")); // Java accepts only the <P1> form, not 'P1'

        //Pattern p2 = Pattern.compile("\\w+(?=ing)");
        Pattern p2 = Pattern.compile("(?=ing)\\w+");
        Matcher m2 = p2.matcher("singing");
        if (m2.find())
            System.out.println(m2.start() + "," + m2.end());

        //Pattern p3 = Pattern.compile("(?<=ad)\\w+");
        Pattern p3 = Pattern.compile("\\w+(?<=ad)");
        Matcher m3 = p3.matcher("reading");
        if (m3.find())
            System.out.println(m3.start() + "," + m3.end());

        Pattern p4 = Pattern.compile("(\\d{3,5})(\\w{3})");
        Matcher m4 = p4.matcher("215vsdv6346cas534sdd");
        while (m4.find()) {
            System.out.println("group : " + m4.group());
            System.out.println("group1 : " + m4.group(1) + ";" + " group2 : " + m4.group(2));
        }
        System.out.println(m4.groupCount());
        // compile(regex, flags): flags are constants such as Pattern.CASE_INSENSITIVE
    }
}
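The find()/appendReplacement()/appendTail() loop and the flags argument of compile() can be combined into a replacement that is computed per match. A sketch only; the class ReplaceDemo and the method shout are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceDemo {
    // Replaces each match with its upper-case form.
    // quoteReplacement keeps $ and \ in the replacement text literal.
    static String shout(String regex, int flags, String source) {
        Matcher m = Pattern.compile(regex, flags).matcher(source);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, Matcher.quoteReplacement(m.group().toUpperCase()));
        }
        m.appendTail(sb); // copy whatever follows the last match
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "Java and jAvA and cava";
        // Case-sensitive: "java" in lower case does not occur, nothing changes.
        System.out.println(shout("java", 0, text));
        // Case-insensitive: both spellings are found and upper-cased.
        System.out.println(shout("java", Pattern.CASE_INSENSITIVE, text));
    }
}
```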

2. Extracting e-mail addresses

"[\\w.-]+@[\\w.-]+\\.\\w+"

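A sketch of how the e-mail pattern above might be used with find() to pull addresses out of free text. The class EmailDemo, the method extract and the sample addresses are assumptions for illustration; the pattern is a rough filter, not a full address validator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmailDemo {
    // The pattern from this section, compiled once and reused.
    static final Pattern EMAIL = Pattern.compile("[\\w.-]+@[\\w.-]+\\.\\w+");

    // Collects every substring of the text that matches the e-mail pattern.
    static List<String> extract(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = EMAIL.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "Contact alice@example.com or bob.smith@mail.example.org, thanks.";
        for (String address : extract(text)) {
            System.out.println(address);
        }
    }
}
```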
3. Counting lines of code

import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

public class CodeCounter {
    private static final String path = "D:\\Eclipse\\Review\\src";
    private static int whiteSpaces = 0;
    private static int comments = 0;
    private static int normalLines = 0;

    public static void main(String[] args) {
        File f = new File(path);
        if (f.exists() && f.isDirectory()) {
            listFiles(f);
        } else if (f.exists() && f.isFile() && f.getName().matches(".+\\.java$")) {
            count(f);
        } else {
            System.out.println("file doesn't exist!");
            System.exit(-1);
        }
        System.out.println("whitespace lines: " + whiteSpaces);
        System.out.println("comment lines: " + comments);
        System.out.println("normal lines: " + normalLines);
    }

    // Classifies every line of one file as blank, comment, or normal code
    private static void count(File f) {
        boolean comment = false; // true while inside a multi-line /* ... */ comment
        try (BufferedReader br = new BufferedReader(new FileReader(f))) {
            String line;
            while ((line = br.readLine()) != null) {
                line = line.trim();
                // readLine already strips the line terminator
                if (line.matches("[\\s&&[^\\n]]*$")) {
                    whiteSpaces++;
                    continue;
                }
                // Single-line comment: /* ... */ on one line, or a line starting with //
                if ((line.startsWith("/*") && line.endsWith("*/")) || line.startsWith("//")) {
                    comments++;
                    continue;
                }
                // First line of a multi-line comment
                if (line.startsWith("/*") && !line.endsWith("*/")) {
                    comment = true;
                    comments++;
                    continue;
                }
                // Inside a multi-line comment, up to and including the closing line
                if (comment) {
                    comments++;
                    if (line.endsWith("*/"))
                        comment = false;
                    continue;
                }
                normalLines++;
            }
        } catch (FileNotFoundException e) {
            System.out.println("file not found!");
            e.printStackTrace();
        } catch (IOException e1) {
            System.out.println("something wrong when handling the file");
            e1.printStackTrace();
        }
    }

    // Recursively visits a directory and counts every .java file in it
    private static void listFiles(File f) {
        File[] files = f.listFiles();
        if (files == null) // not a directory, or an I/O error occurred
            return;
        for (File file : files) {
            if (file.isDirectory())
                listFiles(file);
            else if (file.getName().matches(".+\\.java$"))
                count(file);
        }
    }
}



