http://www.regular-expressions.info/gnu.html

Source: Internet  Published: SEO optimization  Editor: 程序博客网  Time: 2024/05/01 07:32

GNU Regular Expression Extensions

GNU, which is an acronym for "GNU's Not Unix", is a project that strives to provide the world with free and open implementations of all the tools that are commonly available on Unix systems. Most Linux systems come with the full suite of GNU applications. This obviously includes traditional regular expression utilities like grep, sed and awk.

GNU's implementation of these tools follows the POSIX standard, with added GNU extensions. The effect of the GNU extensions is that both the Basic Regular Expressions flavor and the Extended Regular Expressions flavor provide exactly the same functionality. The only difference is that BREs use backslashes to give various characters a special meaning, while EREs use backslashes to take away the special meaning of the same characters.

GNU Basic Regular Expressions (grep, ed, sed)

The Basic Regular Expressions or BRE flavor is pretty much the oldest regular expression flavor still in use today. The GNU utilities grep, ed and sed use it. One thing that sets this flavor apart is that most metacharacters require a backslash to give them their special meaning. Most other flavors, including GNU ERE, use a backslash to suppress the meaning of metacharacters. Using a backslash to escape a character that is never a metacharacter is an error.

A BRE supports POSIX bracket expressions, which are similar to character classes in other regex flavors, with a few special features. Other features using the usual metacharacters are the dot to match any character except a line break, the caret and dollar to match the start and end of the string, and the star to repeat the token zero or more times. To match any of these characters literally, escape them with a backslash.

The other BRE metacharacters require a backslash to give them their special meaning. The reason is that the oldest versions of UNIX grep did not support these. The developers of grep wanted to keep it compatible with existing regular expressions, which may use these characters as literal characters. The BRE a{1,2} matches a{1,2} literally, while a\{1,2\} matches a or aa. Tokens can be grouped with \( and \). Backreferences are the usual \1 through \9. Only up to 9 groups are permitted. E.g. \(ab\)\1 matches abab, while (ab)\1 is invalid since there's no capturing group corresponding to the backreference \1. Use \\1 to match \1 literally.
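
The BRE rules described above can be tried out from C through the POSIX regcomp() and regexec() functions. What follows is only a minimal sketch: it assumes the C library's regex implementation is the GNU one (glibc or Gnulib), and the helper name bre_search is made up for this illustration. Passing 0 as the flags argument selects the BRE flavor.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: compile `pattern` as a BRE (flags = 0, so no
   REG_EXTENDED) and search `text` for it.
   Returns 1 on a match, 0 on no match, -1 if the pattern is invalid. */
static int bre_search(const char *pattern, const char *text)
{
    regex_t re;

    if (regcomp(&re, pattern, 0) != 0)
        return -1;                       /* e.g. a backreference without a group */

    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

Keep in mind that every backslash in the regex has to be doubled inside a C string literal. With this helper, bre_search("^a\\{1,2\\}$", "aa") should return 1, bre_search("a{1,2}", "aa") should return 0 because the unescaped braces are literals, and bre_search("(ab)\\1", "abab") should return -1 because the pattern contains no capturing group.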

On top of what POSIX BRE provides as described above, the GNU extension provides \? and \+ as an alternative syntax to \{0,1\} and \{1,\}. It adds alternation via \|, something sorely missed in POSIX BREs. These extensions in fact mean that GNU BREs have exactly the same features as GNU EREs, except that +, ?, |, braces and parentheses need backslashes to give them a special meaning instead of taking it away.
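
To illustrate how the two flavors mirror each other, the sketch below compiles the same regex once as a GNU BRE and once as a GNU ERE. It assumes the GNU regex implementation (glibc or Gnulib); other C libraries may reject \?, \+ and \| in a BRE. The helper names search, gnu_bre and gnu_ere are invented for this example.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: compile `pattern` with the given regcomp() flags
   and search `text`. Returns 1 on a match, 0 on no match, -1 if the
   pattern does not compile. */
static int search(const char *pattern, int cflags, const char *text)
{
    regex_t re;

    if (regcomp(&re, pattern, cflags) != 0)
        return -1;

    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}

/* BRE: +, ?, |, braces and parentheses need a backslash to be special. */
static int gnu_bre(const char *pattern, const char *text)
{
    return search(pattern, 0, text);
}

/* ERE: the same characters are special without a backslash. */
static int gnu_ere(const char *pattern, const char *text)
{
    return search(pattern, REG_EXTENDED, text);
}
```

Under these assumptions, gnu_bre("^\\(cat\\|dog\\)$", "dog") and gnu_ere("^(cat|dog)$", "dog") should both return 1. Conversely, gnu_bre("^ab+$", "ab+") and gnu_ere("^ab\\+$", "ab+") should both match the literal plus sign.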

GNU Extended Regular Expressions (egrep, awk, emacs)

The Extended Regular Expressions or ERE flavor is used by the GNU utilities egrep and awk and the Emacs editor. In this context, "extended" is purely a historic reference. The GNU extensions make the BRE and ERE flavors identical in functionality.

All metacharacters have their meaning without backslashes, just like in modern regex flavors. You can use backslashes to suppress the meaning of all metacharacters. Escaping a character that is not a metacharacter is an error.

The quantifiers ?, +, {n}, {n,m} and {n,} repeat the preceding token zero or once, once or more, n times, between n and m times, and n or more times, respectively. Alternation is supported through the usual vertical bar |. Unadorned parentheses create a group, e.g. (abc){2} matches abcabc.

POSIX ERE does not support backreferences. The GNU extension adds them, using the same \1 through \9 syntax.
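
The ERE quantifiers, alternation, grouping and the GNU backreference extension can be observed with a small C helper that returns the text of the first match. This is a sketch under the assumption that regcomp() is the GNU implementation (glibc or Gnulib); ere_first_match is a hypothetical name, not part of any library.

```c
#include <regex.h>
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: compile `pattern` as an ERE, search `text`, and
   copy the first overall match into `out` (truncated to fit `outsize`).
   Returns 1 on a match, 0 on no match, -1 on a compile error. */
static int ere_first_match(const char *pattern, const char *text,
                           char *out, size_t outsize)
{
    regex_t re;
    regmatch_t m[1];

    if (outsize == 0)
        return -1;
    out[0] = '\0';

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    int rc = regexec(&re, text, 1, m, 0);
    regfree(&re);
    if (rc != 0)
        return 0;

    /* m[0] holds the byte offsets of the whole match within `text`. */
    snprintf(out, outsize, "%.*s",
             (int)(m[0].rm_eo - m[0].rm_so), text + m[0].rm_so);
    return 1;
}
```

For instance, searching "xabcabcabc" with (abc){2} should yield abcabc, and searching "xxababxx" with (ab)\1 (written "(ab)\\1" in C) should yield abab on an implementation that provides the GNU backreference extension.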

Additional GNU Extensions

The GNU extensions not only make both flavors identical. They also add some new syntax and several brand new features. The shorthand classes \w, \W, \s and \S can be used instead of [[:alnum:]_], [^[:alnum:]_], [[:space:]] and [^[:space:]]. You can use these directly in the regex, but not inside bracket expressions. A backslash inside a bracket expression is always a literal.
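
The equivalence between the shorthand classes and their bracket expressions can be checked by counting matches. The sketch below assumes the GNU regex implementation (glibc or Gnulib) and the default C locale; count_matches is a hypothetical helper written for this example.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: count the non-overlapping matches of the ERE
   `pattern` in `text`. Returns -1 if the pattern does not compile. */
static int count_matches(const char *pattern, const char *text)
{
    regex_t re;
    regmatch_t m;
    int count = 0;
    const char *p = text;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    /* After the first search, REG_NOTBOL tells regexec() that `p` no
       longer points at the beginning of the subject string. */
    while (*p != '\0' &&
           regexec(&re, p, 1, &m, p == text ? 0 : REG_NOTBOL) == 0) {
        count++;
        if (m.rm_eo == m.rm_so) {        /* empty match: step one char */
            if (p[m.rm_eo] == '\0')
                break;
            p += m.rm_eo + 1;
        } else {
            p += m.rm_eo;                /* continue after the match */
        }
    }

    regfree(&re);
    return count;
}
```

On the subject "foo_bar baz-42", both \w+ and [[:alnum:]_]+ should find three matches. The bracket expression [\w] is different: since the backslash is a literal there, it should match only a backslash or the letter w.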

The new features are word boundaries and anchors. Like modern flavors, GNU supports \b to match at a position that is at a word boundary, and \B at a position that is not. \< matches at a position at the start of a word, and \> matches at the end of a word. The anchor \` (backtick) matches at the very start of the subject string, while \' (single quote) matches at the very end. These are useful with tools that can match a regex against multiple lines of text at once, as then ^ will match at the start of a line, and $ at the end.
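
The difference between the line anchors and the string anchors shows up once the subject contains several lines. In the sketch below, the REG_NEWLINE flag makes ^ and $ match at line breaks, while \` and \' should still match only at the very start and end of the subject. The code assumes the GNU regex implementation (glibc or Gnulib); multiline_search is a hypothetical name.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: compile `pattern` as an ERE with REG_NEWLINE, so
   that ^ and $ also match just after and just before a line break, then
   search `text`. Returns 1 on a match, 0 on no match, -1 on error. */
static int multiline_search(const char *pattern, const char *text)
{
    regex_t re;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NEWLINE) != 0)
        return -1;

    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

With the subject "one\ntwo\nthree", the regex ^two$ should match, whereas \`two (written "\\`two" in C) should not, because two is not at the start of the subject. The word boundary tokens work the same way with or without REG_NEWLINE: \bcat\b should match in "a cat sat" but not in "concatenate".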

Gnulib

GNU wouldn't be GNU if you couldn't use their regular expression implementation in your own (open source) applications. To do so, you'll need to download Gnulib. Use the included gnulib-tool to copy the regex module to your application's source tree.

The regex module provides the standard POSIX functions regcomp() for compiling a regular expression, regerror() for handling compilation errors, regexec() to run a search using a compiled regex, and regfree() to clean up a regex you're done with.
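
The four functions are typically used together in the order listed. The following sketch shows one possible arrangement; the helper name find_span and its parameters are assumptions made for this illustration, and only regcomp(), regerror(), regexec() and regfree() themselves come from the POSIX interface.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: compile `pattern` as an ERE and search `text`.
   On a match, the byte offsets of the match are stored in *start and
   *end. On a compile error, the message produced by regerror() is
   written to `errbuf`.
   Returns 1 on a match, 0 on no match, -1 on a compile error. */
static int find_span(const char *pattern, const char *text,
                     long *start, long *end,
                     char *errbuf, size_t errbuf_size)
{
    regex_t re;
    regmatch_t m;

    int rc = regcomp(&re, pattern, REG_EXTENDED);
    if (rc != 0) {
        /* Turn the error code into a human-readable message. */
        regerror(rc, &re, errbuf, errbuf_size);
        return -1;
    }

    rc = regexec(&re, text, 1, &m, 0);
    if (rc == 0) {
        *start = (long)m.rm_so;
        *end = (long)m.rm_eo;
    }

    regfree(&re);                        /* release the compiled regex */
    return rc == 0 ? 1 : 0;
}
```

Searching "abc 123 def" with [0-9]+ should report the span from offset 4 to offset 7. An invalid pattern such as an unclosed parenthesis should produce -1 and leave an error message in errbuf; the exact wording of that message depends on the implementation.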
