Regular Expressions

Source: Internet; Posted by: 南风知我意书包网; Editor: 程序博客网; Date: 2024/06/08 05:50

Basic Patterns

The power of regular expressions is that they can specify patterns, not just fixed characters. Here are the most basic patterns, each of which matches a single character:

a, X, 9, < -- ordinary characters just match themselves exactly. The meta-characters which do not match themselves because they have special meanings are: . ^ $ * + ? { [ ] \ | ( ) (details below)

. (a period) -- matches any single character except newline '\n'

\w -- (lowercase w) matches a "word" character: a letter, digit, or underscore [a-zA-Z0-9_]. Note that although "word" is the mnemonic for this, it only matches a single word character, not a whole word. \W (uppercase W) matches any non-word character.

\b -- boundary between a word character and a non-word character (matches a position, not a character)

\s -- (lowercase s) matches a single whitespace character -- space, newline, return, tab, form feed [ \n\r\t\f]. \S (uppercase S) matches any non-whitespace character.

\t, \n, \r -- tab, newline, return

\d -- decimal digit [0-9] (some older regex utilities do not support \d, but they all support \w and \s)

^ = start, $ = end -- match the start or end of the string

\ -- inhibit the "specialness" of a character. So, for example, use \. to match a period or \\ to match a backslash. If you are unsure whether a character such as '@' has a special meaning, you can put a backslash in front of it, \@, to make sure it is treated as an ordinary character.
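The single-character patterns above can be tried out with a short sketch. The text does not name a language, so this assumes Python and its standard re module; first_match is a hypothetical helper introduced only for this illustration.

```python
import re

def first_match(pattern, text):
    """Hypothetical helper: return the first matched substring, or None."""
    m = re.search(pattern, text)
    return m.group() if m else None

# . matches any single character except newline
print(first_match(r'..g', 'piiig'))        # iig

# \w matches one word character, \d one digit, \s one whitespace character
print(first_match(r'\w\w\w', '@@abcd!!'))  # abc
print(first_match(r'\d\d\d', 'p123g'))     # 123
print(first_match(r'\d\s\d', 'a1 2b'))     # 1 2

# ^ and $ anchor the match to the start or end of the string
print(first_match(r'^bar', 'foobar'))      # None
print(first_match(r'bar$', 'foobar'))      # bar

# \b matches a position: 'cat' inside 'concat' is skipped
print(re.search(r'\bcat\b', 'concat cat').start())  # 7

# A backslash removes the special meaning of the next character
print(first_match(r'a.b', 'axb a.b'))      # axb  (unescaped . matches 'x')
print(first_match(r'a\.b', 'axb a.b'))     # a.b  (escaped . matches only '.')
print(first_match(r'\@', 'user@host'))     # @
```

Raw strings (the r'...' prefix) are used so that Python passes the backslashes through to the regex engine unchanged.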

Repetition

Things get more interesting when you use +, *, and ? to specify repetition in the pattern:

+ -- 1 or more occurrences of the pattern to its left, e.g. 'i+' = one or more i's
* -- 0 or more occurrences of the pattern to its left
? -- match 0 or 1 occurrences of the pattern to its left
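As a minimal sketch of the three repetition operators, again assuming Python's re module (the sample strings and the find helper are made up for illustration):

```python
import re

def find(pattern, text):
    """Hypothetical helper: return the first matched substring, or None."""
    m = re.search(pattern, text)
    return m.group() if m else None

# + : one or more of the pattern to its left
print(find(r'pi+', 'piiig'))      # piii
# The search returns the leftmost match, even if a longer run comes later
print(find(r'i+', 'piigiiii'))    # ii

# * : zero or more, so the whitespace between digits is optional
print(find(r'\d\s*\d\s*\d', 'xx1 2   3xx'))  # 1 2   3
print(find(r'\d\s*\d\s*\d', 'xx123xx'))      # 123

# ? : zero or one, so the 'u' may be present or absent
print(find(r'colou?r', 'my color'))   # color
print(find(r'colou?r', 'my colour'))  # colour
```

Note that + and * take as much of the string as they can from the leftmost position where a match is possible, which is why 'pi+' returns all three i's rather than just one.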

0 0