Python Package >> re Regular Expression

Syntax>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Special characters>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>> .  dot, any character except newline (any character at all with re.S)
>> ^  caret, string start (with re.M, also right after each newline)
>> $  string end, or just before the newline at the string end (with re.M, also before each newline)
>> *  0 or more repetitions, greedy
>> +  1 or more repetitions, greedy
>> ?  0 or 1 repetition, greedy
>> *?, +?, ??  non-greedy
>> {m}  exactly m repetitions
>> {m,n}  m to n repetitions, both ends included, greedy
>> {m,n}?  m to n repetitions, both ends included, non-greedy
>> \  backslash, escapes a special character or starts a special sequence
>> []  character set
   >> -  separates a range, e.g. [a-z]
   >> ^  as the first character, complements the set (not string start)
   >> $  literal $ (not string end)
   >> \]  escape ]
   >> \-  escape -
   >> \w  character classes such as \w are acceptable
   >> other special characters become normal
>> |  or
>> ()  match a group
>> (?...)  extensions, usually do not create a group
   >> (?iLmsux)  set flags for the whole pattern
      >> i, re.I, ignore case
      >> L, re.L, locale dependent
      >> m, re.M, multi-line
      >> s, re.S, dot matches all
      >> u, re.U, unicode dependent
      >> x, re.X, verbose
   >> (?:...)  non-capturing group, matched text cannot be retrieved after matching
   >> (?P<name>...)  named group, accessible within the rest of the re via <name>
      >> (?P<id>[a-zA-Z]\w*)
      >> m.group('id'), m.end('id')  access by match objects
      >> (?P=id)  access by name in the re
      >> \g<id>  access in replacement text given to .sub()
   >> (?P=name)  match the text matched by the <name> group
   >> (?#...)  comment
   >> (?=...)  match if followed by ..., positive lookahead assertion
      Isaac (?=Asimov) will match 'Isaac ' only if followed by 'Asimov'
   >> (?!...)  match if not followed by ..., negative lookahead assertion
   >> (?<=...)  match if preceded by ..., positive lookbehind assertion
      (?<=abc)def will find 'def' in 'abcdef'
      ... must be fixed length, .* not allowed
   >> (?<!...)  match if not preceded by ..., negative lookbehind assertion
   >> (?(id/name)yes-pattern|no-pattern)
      match yes-pattern if the <id> or <name> group exists, otherwise match no-pattern
      no-pattern can be omitted
      (<)?(\w+@\w+(?:\.\w+)+)(?(1)>) used with re.match will match <user@host.com> or user@host.com, not <user@host.com
>> \number  match the text of the <number> group, number = [1,99]
   if number starts with 0, or is 3 octal digits long, it is treated as an octal value instead
>> \A  match only at string start
>> \b  match at a word boundary
>> \B  match if not at a word boundary
>> \d  match [0-9] / Unicode decimal digits
>> \D  match if not [0-9] / Unicode decimal digits
>> \s  match [ \t\n\r\f\v] / Unicode whitespace character
>> \S  match if not [ \t\n\r\f\v] / Unicode whitespace character
>> \w  match [a-zA-Z0-9_] / Unicode alphanumeric character or underscore
>> \W  match if not [a-zA-Z0-9_] / Unicode alphanumeric character or underscore
>> \Z  match only at string end
>> octal value  first digit is 0, or three octal digits
>> escape string literals  \a, \b, \f, \n, \r, \t, \v, \x, \\ (\b means backspace only inside [])

Matching vs Searching>>>>>>>>>>>>>>>>>>>>>>>>>>
>> match  check for a match only at the beginning of the string
   re.match('c','abcdef')   # no match
>> search  check for a match anywhere in the string
   re.search('c','abcdef')  # match

Reference>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Module Contents<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
>> re.compile(pattern, flags=0)  return RegexObject
   >> re.DEBUG, debug information
   >> re.I, re.IGNORECASE
   >> re.L, re.LOCALE  make \w, \W, \b, \B, \s, \S dependent on current locale
   >> re.M, re.MULTILINE
   >> re.S, re.DOTALL
   >> re.U, re.UNICODE
   >> re.X, re.VERBOSE  ignore whitespace and # comments in the pattern
>> re.search(pattern, string, flags=0)  return MatchObject|None
>> re.match(pattern, string, flags=0)  return MatchObject|None
>> re.split(pattern, string, maxsplit=0, flags=0)  return list of strings
>> re.findall(pattern, string, flags=0)  return list of strings (list of tuples if the pattern has more than one group)
>> re.finditer(pattern, string, flags=0)  return iterator yielding MatchObject instances
>> re.sub(pattern, repl, string, count=0, flags=0)  return new string
>> re.subn(pattern, repl, string, count=0, flags=0)  return tuple(new_string, number_of_subs_made)
>> re.escape(string)  return string with special characters backslash-escaped
>> re.purge()  clear regular expression cache
>> re.error  exception raised for an invalid pattern

Regular expression objects<<<<<<<<<<<<<<<<<<<<<
>> class re.RegexObject
   >> search(string[,pos[,endpos]])  return MatchObject|None
   >> match(string[,pos[,endpos]])  return MatchObject|None
   >> split(string, maxsplit=0)
   >> findall(string[,pos[,endpos]])
   >> finditer(string[,pos[,endpos]])
   >> sub(repl, string, count=0)
   >> subn(repl, string, count=0)
   >> flags
   >> groups
   >> groupindex
   >> pattern

Match Objects<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
>> class re.MatchObject
   >> expand(template)  do backslash substitution on the template string
   >> group([group1,...])
   >> groups([default])  return tuple with all subgroups
   >> groupdict([default])  return dictionary with all named subgroups
   >> start([group])
   >> end([group])
   >> span([group])
   >> pos
   >> endpos
   >> lastindex
   >> lastgroup
   >> re
   >> string
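
A few of the entries above, tried out in Python. This is only a minimal sketch using the standard re module: the sample strings ('foo_1 = 42', 'the the cat', and so on) and the variable names are made up for illustration and are not part of the reference.

```python
import re

# match() only checks the beginning of the string, search() scans all of it.
no_match = re.match('c', 'abcdef')
found = re.search('c', 'abcdef')
print(no_match, found.start())

# Greedy vs non-greedy repetition.
greedy = re.match(r'<.*>', '<a><b>').group()
lazy = re.match(r'<.*?>', '<a><b>').group()
print(greedy, lazy)

# Named group: read it from the match object by name.
m = re.match(r'(?P<id>[a-zA-Z]\w*)', 'foo_1 = 42')
print(m.group('id'), m.end('id'))

# (?P=id) refers back to the group inside the pattern,
# \g<id> refers to it in the replacement text given to sub().
doubled = re.search(r'(?P<id>\w+) (?P=id)', 'the the cat').group()
wrapped = re.sub(r'(?P<id>[a-zA-Z]\w*)', r'<\g<id>>', 'a b')
print(doubled, wrapped)

# Lookahead does not consume what follows; lookbehind checks what precedes.
ahead = re.search(r'Isaac (?=Asimov)', 'Isaac Asimov').group()
not_ahead = re.search(r'Isaac (?=Asimov)', 'Isaac Newton')
behind = re.search(r'(?<=abc)def', 'abcdef')
print(repr(ahead), not_ahead, behind.span())

# Conditional pattern: the closing > is required only if group 1 (the <) matched.
email = re.compile(r'(<)?(\w+@\w+(?:\.\w+)+)(?(1)>)')
both = email.match('<user@host.com>').group()
bare = email.match('user@host.com').group()
unbalanced = email.match('<user@host.com')
print(both, bare, unbalanced)

# Module-level helpers from the Reference section.
parts = re.split(r',\s*', 'a, b,c')
numbers = re.findall(r'\d+', 'a1b22')
masked = re.subn(r'\d', '#', 'a1b2')
print(parts, numbers, masked)
```

Note that the conditional example relies on match() anchoring at the start of the string; search() would skip the leading < and still find user@host.com inside '<user@host.com'.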