The Android Pattern class
Source: Internet  Editor: 程序博客网  Date: 2024/06/05 17:57
- Patterns are compiled regular expressions. In many cases, convenience methods such as String.matches, String.replaceAll, and String.split will be preferable, but if you need to do a lot of work with the same regular expression, it may be more efficient to compile it once and reuse it. The Pattern class and its companion, Matcher, also offer more functionality than the small amount exposed by String.
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // String convenience methods:
    boolean sawFailures = s.matches("Failures: \\d+");
    String farewell = s.replaceAll("Hello, (\\S+)", "Goodbye, $1");
    String[] fields = s.split(":");

    // Direct use of Pattern:
    Pattern p = Pattern.compile("Hello, (\\S+)");
    Matcher m = p.matcher(inputString);
    while (m.find()) { // Find each match in turn; String can't do this.
        String name = m.group(1); // Access a submatch group; String can't do this.
    }
Regular expression syntax
- Java supports a subset of Perl 5 regular expression syntax. An important gotcha is that Java has no regular expression literals, and uses plain old string literals instead. This means that you need an extra level of escaping. For example, the regular expression \s+ has to be represented as the string "\\s+".
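The extra level of escaping can be illustrated with a minimal sketch. The class and method names here (EscapingSketch, splitOnWhitespace) are hypothetical and chosen only for illustration; the point is that the regular expression \s+ appears as "\\s+" in Java source.

```java
import java.util.regex.Pattern;

class EscapingSketch {
    // The regex \s+ must be written "\\s+" because the Java compiler
    // consumes one level of backslashes in a string literal.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Splits the input on runs of whitespace (hypothetical helper).
    static String[] splitOnWhitespace(String input) {
        return WHITESPACE.split(input);
    }

    public static void main(String[] args) {
        for (String word : splitOnWhitespace("one  two\tthree")) {
            System.out.println(word);
        }
    }
}
```

Writing "\s+" with a single backslash would not reach the regex engine as \s+, since the compiler interprets the backslash first.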
Escape sequences
- \ Quote the following metacharacter (so \. matches a literal .).
- \Q Quote all following metacharacters until \E.
- \E Stop quoting metacharacters (started by \Q).
- \\ A literal backslash.
- \uhhhh The Unicode character U+hhhh (in hex).
- \xhh The Unicode character U+00hh (in hex).
- \cx The ASCII control character ^x (so \cH would be ^H, U+0008).
- \a The ASCII bell character (U+0007).
- \e The ASCII ESC character (U+001b).
- \f The ASCII form feed character (U+000c).
- \n The ASCII newline character (U+000a).
- \r The ASCII carriage return character (U+000d).
- \t The ASCII tab character (U+0009).
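As a rough sketch of how quoting behaves, the following compares an unquoted pattern with one wrapped in \Q...\E. The class and helper names (QuotingSketch, matchesLiterally, matchesQuoted) are hypothetical. Pattern.quote is a standard method that produces a quoted literal and also copes with input that itself contains \E, which the naive concatenation below does not.

```java
import java.util.regex.Pattern;

class QuotingSketch {
    // Wraps the literal in \Q...\E so metacharacters such as + lose their
    // special meaning. Assumes the literal does not itself contain \E.
    static boolean matchesLiterally(String literal, String input) {
        return Pattern.compile("\\Q" + literal + "\\E").matcher(input).matches();
    }

    // Same idea using Pattern.quote, which also handles an embedded \E.
    static boolean matchesQuoted(String literal, String input) {
        return Pattern.compile(Pattern.quote(literal)).matcher(input).matches();
    }

    // No quoting at all: the text is interpreted as a regular expression.
    static boolean matchesAsRegex(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matchesAsRegex("1+1=2", "1+1=2"));   // false: + is a quantifier
        System.out.println(matchesLiterally("1+1=2", "1+1=2")); // true
        System.out.println(matchesQuoted("1+1=2", "1+1=2"));    // true
        System.out.println(matchesAsRegex("a\\.c", "abc"));     // false: \. is a literal dot
    }
}
```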
Character classes
It's possible to construct arbitrary character classes using set operations:
- [abc] Any one of a, b, or c. (Enumeration.)
- [a-c] Any one of a, b, or c. (Range.)
- [^abc] Any character except a, b, or c. (Negation.)
- [[a-f][0-9]] Any character in either range. (Union.)
- [[a-z]&&[jkl]] Any character in both ranges. (Intersection.)
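The set operations listed above can be tried out with a small sketch. The class and method names (CharClassSketch, inUnion, inIntersection, notAbc) are hypothetical; each helper tests a single-character string against one of the classes from the list.

```java
class CharClassSketch {
    // Union: a-f or 0-9.
    static boolean inUnion(String ch) {
        return ch.matches("[[a-f][0-9]]");
    }

    // Intersection: characters that are both in a-z and in {j, k, l}.
    static boolean inIntersection(String ch) {
        return ch.matches("[[a-z]&&[jkl]]");
    }

    // Negation: anything except a, b, or c.
    static boolean notAbc(String ch) {
        return ch.matches("[^abc]");
    }

    public static void main(String[] args) {
        System.out.println(inUnion("c"));        // true
        System.out.println(inUnion("g"));        // false
        System.out.println(inIntersection("k")); // true
        System.out.println(inIntersection("a")); // false
        System.out.println(notAbc("d"));         // true
    }
}
```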
Most of the time, the built-in character classes are more useful:
- \d Any digit character (see note below).
- \D Any non-digit character (see note below).
- \s Any whitespace character (see note below).
- \S Any non-whitespace character (see note below).
- \w Any word character (see note below).
- \W Any non-word character (see note below).
- \p{NAME} Any character in the class with the given NAME.
- \P{NAME} Any character not in the named class.
Note that these built-in classes don't just cover the traditional ASCII range. For example, \w is equivalent to the character class [\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}]. For more details see Unicode TR-18, and bear in mind that the set of characters in each class can vary between Unicode releases. If you actually want to match only ASCII characters, specify the explicit characters you want; if you mean 0-9 use [0-9] rather than \d, which would also include Gurmukhi digits and so forth.
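The difference between ASCII digits and Unicode digits can be sketched as follows. One caveat, stated as an assumption about the environment: on a desktop JVM, \d is ASCII-only unless the UNICODE_CHARACTER_CLASS flag is set, whereas the Android documentation quoted here describes a Unicode-aware \d. To behave the same way on both, this sketch uses \p{Nd} (the Unicode decimal-digit category) in place of \d. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class DigitClassSketch {
    // Explicit ASCII digits only.
    private static final Pattern ASCII_DIGITS = Pattern.compile("[0-9]+");
    // Any Unicode decimal digit; stands in for a Unicode-aware \d.
    private static final Pattern UNICODE_DIGITS = Pattern.compile("\\p{Nd}+");

    static boolean isAsciiNumber(String s) {
        return ASCII_DIGITS.matcher(s).matches();
    }

    static boolean isUnicodeNumber(String s) {
        return UNICODE_DIGITS.matcher(s).matches();
    }

    public static void main(String[] args) {
        String gurmukhi = "\u0A67\u0A68\u0A69"; // Gurmukhi digits one, two, three
        System.out.println(isAsciiNumber("123"));      // true
        System.out.println(isAsciiNumber(gurmukhi));   // false
        System.out.println(isUnicodeNumber(gurmukhi)); // true
    }
}
```

If the input is later passed to something like Integer.parseInt or written into a protocol that expects ASCII, validating with [0-9] avoids surprises.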
There are also a variety of named classes:
- Unicode category names, prefixed by Is. For example \p{IsLu} for all uppercase letters.
- POSIX class names. These are 'Alnum', 'ASCII', 'Blank', 'Cntrl', 'Digit', 'Graph', 'Lower', 'Print', 'Punct', 'Upper', 'XDigit'.
- Unicode block names, as used by Character.UnicodeBlock.forName(java.lang.String) prefixed by In. For example \p{InHebrew} for all characters in the Hebrew block.
- Character method names. These are all non-deprecated methods from Character whose name starts with is, but with the is replaced by java. For example, \p{javaLowerCase}.
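A brief sketch of the four kinds of named class, using one example of each. The helper matchesClass and the class name NamedClassSketch are hypothetical; the class names passed in (IsLu, Punct, InHebrew, javaLowerCase) come from the list above.

```java
import java.util.regex.Pattern;

class NamedClassSketch {
    // Tests whether a single-character string belongs to \p{className}.
    static boolean matchesClass(String className, String ch) {
        return Pattern.matches("\\p{" + className + "}", ch);
    }

    public static void main(String[] args) {
        System.out.println(matchesClass("IsLu", "A"));          // Unicode category: uppercase letter
        System.out.println(matchesClass("Punct", "!"));         // POSIX class
        System.out.println(matchesClass("InHebrew", "\u05D0")); // Unicode block: Hebrew letter alef
        System.out.println(matchesClass("javaLowerCase", "a")); // Character.isLowerCase
    }
}
```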
Quantifiers
Quantifiers match some number of instances of the preceding regular expression.
- * Zero or more.
- ? Zero or one.
- + One or more.
- {n} Exactly n.
- {n,} At least n.
- {n,m} At least n but not more than m.
Quantifiers are "greedy" by default, meaning that they will match the longest possible input sequence. There are also non-greedy quantifiers that match the shortest possible input sequence. They're the same as the greedy ones but with a trailing ?:
- *? Zero or more (non-greedy).
- ?? Zero or one (non-greedy).
- +? One or more (non-greedy).
- {n}? Exactly n (non-greedy).
- {n,}? At least n (non-greedy).
- {n,m}? At least n but not more than m (non-greedy).
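The difference between greedy and non-greedy matching is easiest to see on a concrete input. In this minimal sketch, the class GreedySketch and the helper firstMatch are hypothetical names; the helper returns the text of the first match or null.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedySketch {
    // Returns the text of the first match of regex in input, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String html = "<b>bold</b>";
        // Greedy: .+ grabs as much as possible, so the match runs to the last '>'.
        System.out.println(firstMatch("<.+>", html));  // <b>bold</b>
        // Non-greedy: .+? stops at the first '>' that lets the match succeed.
        System.out.println(firstMatch("<.+?>", html)); // <b>
    }
}
```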
Quantifiers allow backtracking by default. There are also possessive quantifiers to prevent backtracking. They're the same as the greedy ones but with a trailing +:
- *+ Zero or more (possessive).
- ?+ Zero or one (possessive).
- ++ One or more (possessive).
- {n}+ Exactly n (possessive).
- {n,}+ At least n (possessive).
- {n,m}+ At least n but not more than m (possessive).
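A small sketch of what "prevent backtracking" means in practice. With a greedy quantifier, a*a can match "aaa" because a* gives back one character for the final a. With the possessive form, a*+ keeps everything it consumed, so the trailing a can never match. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class PossessiveSketch {
    // True if the whole input matches the regex.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Greedy: a* first takes "aaa", then backtracks one step so the last 'a' can match.
        System.out.println(matchesWhole("a*a", "aaa"));  // true
        // Possessive: a*+ takes "aaa" and never gives any of it back.
        System.out.println(matchesWhole("a*+a", "aaa")); // false
        // Possessive quantifiers still match when no backtracking is needed.
        System.out.println(matchesWhole("a*+b", "aaab")); // true
    }
}
```

Possessive quantifiers are mainly useful for making a pattern fail quickly on input that cannot match, rather than trying every way of dividing the input.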
Zero-width assertions
- ^ At beginning of line.
- $ At end of line.
- \A At beginning of input.
- \b At word boundary.
- \B At non-word boundary.
- \G At end of previous match.
- \z At end of input.
- \Z At end of input, or before newline at end.
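A few of these assertions can be sketched with a counting helper. The names AssertionSketch and countMatches are hypothetical. One assumption worth stating: in java.util.regex, ^ and $ match at line boundaries only when MULTILINE mode is on (here enabled with the inline flag (?m)); without it they anchor to the whole input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AssertionSketch {
    // Counts how many times regex matches within input.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // \b: "cat" as a whole word only (matches "cat" and the "cat" in "cat's").
        System.out.println(countMatches("\\bcat\\b", "cat concat cat's catalog")); // 2
        // (?m) makes ^ and $ match at each line boundary.
        System.out.println(countMatches("(?m)^\\w+$", "a1\nb2\nc3")); // 3
        // \Z tolerates a trailing newline; \z requires the true end of input.
        System.out.println(countMatches("foo\\Z", "foo\n")); // 1
        System.out.println(countMatches("foo\\z", "foo\n")); // 0
    }
}
```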
There are some other constructs as well; I'll add them later when I need them.