Learn AWK in 20 Minutes


Reposted from: http://ferd.ca/awk-in-20-minutes.html

Awk in 20 Minutes

What's Awk

Awk is a tiny programming language and a command line tool. It's particularly appropriate for log parsing on servers, mostly because Awk will operate on files, usually structured in lines of human-readable text.

I say it's useful on servers because log files, dump files, or whatever text format servers end up dumping to disk will tend to grow large, and you'll have many of them per server. If you ever get into the situation where you have to analyze gigabytes of files from 50 different servers without tools like Splunk or its equivalents, it would feel fairly bad to have to download all these files locally to then run some forensics on them.

This personally happens to me when some Erlang nodes die and leave a crash dump of 700MB to 4GB behind, or on smaller individual servers (say a VPS) where I need to quickly go through logs, looking for a common pattern.

In any case, Awk does more than find data (otherwise, grep or ack would be enough): it also lets you process the data and transform it.

Code Structure

An Awk script is structured simply, as a sequence of patterns and actions:

    # comment
    Pattern1 { ACTIONS; }

    # comment
    Pattern2 { ACTIONS; }

    # comment
    Pattern3 { ACTIONS; }

    # comment
    Pattern4 { ACTIONS; }

Every line of the document to scan will have to go through each of the patterns, one at a time. So if I pass in a file that contains the following content:

    this is line 1
    this is line 2

Then the content this is line 1 will be matched against Pattern1. If it matches, ACTIONS will be executed. Then this is line 1 will be matched against Pattern2. If it doesn't match, it skips to Pattern3, and so on.

Once all patterns have been cleared, this is line 2 will go through the same process, and so on for other lines, until the input has been read entirely.

This, in short, is Awk's execution model.
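To make the execution model concrete, here is a minimal sketch run from a shell. The three regex patterns are made up for illustration (they stand in for Pattern1 to Pattern3), and the input is the two-line file from above.

```shell
# Feed the two sample lines to a three-pattern script.
# Every line is tested against every pattern, in order.
out=$(printf 'this is line 1\nthis is line 2\n' | awk '
/line/   { print "Pattern1 matched: " $0 }
/line 2/ { print "Pattern2 matched: " $0 }
/line 1/ { print "Pattern3 matched: " $0 }
')
printf '%s\n' "$out"
```

The first line triggers Pattern1 and Pattern3, the second line triggers Pattern1 and Pattern2, and the output appears in input order: all matches for line 1 come before any match for line 2.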

Data Types

Awk only has two main data types: strings and numbers. And even then, Awk likes to convert them into each other. Strings can be interpreted as numerals to convert their values to numbers. If the string doesn't look like a numeral, it's 0.

Both can be assigned to variables in the ACTIONS parts of your code with the = operator. Variables can be declared anywhere, at any time, and used even if they're not initialized: their default value is "", the empty string, which also counts as 0 in numeric contexts.
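A small sketch of these conversions, using only a BEGIN block so no input is needed. The variable names (a, b, c, d, unset_var) are arbitrary choices for the example.

```shell
out=$(awk 'BEGIN {
    a = "10" + 1              # the string "10" looks like a numeral: 11
    b = "abc" + 1             # "abc" does not, so it counts as 0: 1
    c = unset_var + 5         # an uninitialized variable acts as 0 here
    d = "[" unset_var "]"     # and as the empty string here
    print a, b, c, d
}')
printf '%s\n' "$out"
```

This prints 11 1 5 [], showing that the same uninitialized variable behaves as 0 in arithmetic and as "" in string concatenation.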

Finally, Awk has arrays. They're unidimensional associative arrays that can be created dynamically. Their syntax is just var[key] = value. Awk can simulate multidimensional arrays, but it's all a big hack anyway.
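As a rough illustration, the sketch below creates array entries on first use and shows the multidimensional "hack": a subscript like [1, 2] is joined into a single string key using the built-in SUBSEP separator. The array and variable names are invented for the example.

```shell
out=$(awk 'BEGIN {
    count["GET"] = 3
    count["POST"]++                  # created on first use, counting from 0
    grid[1, 2] = "x"                 # stored under the single key 1 SUBSEP 2
    if ((1, 2) in grid)
        found = "yes"
    else
        found = "no"
    n = 0
    for (key in count) n++           # iteration order is unspecified
    print count["GET"], count["POST"], found, n
}')
printf '%s\n' "$out"
```

This prints 3 1 yes 2. Since for (key in array) visits keys in no guaranteed order, the example only counts them rather than printing them.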

Patterns

The patterns that can be used will fall into three broad categories: regular expressions, Boolean expressions, and special patterns.

Regular and Boolean Expressions

The Awk regular expressions are your run-of-the-mill regexes. They're not PCRE under awk (but gawk will support the fancier stuff; it depends on the implementation, so check with awk --version), though for most usages they'll do plenty:

    /admin/ { ... }     # any line that contains 'admin'
    /^admin/ { ... }    # lines that begin with 'admin'
    /admin$/ { ... }    # lines that end with 'admin'
    /^[0-9.]+ / { ... } # lines beginning with series of numbers and periods
    /(POST|PUT|DELETE)/ # lines that contain specific HTTP verbs

And so on. Note that the patterns cannot capture specific groups to make them available in the ACTIONS part of the code. They are specifically to match content.
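Patterns can't capture, but the action can still pull out the matched text. One portable way, sketched here with made-up log lines, is the built-in match() function, which sets RSTART and RLENGTH to the position and length of the match.

```shell
out=$(printf '10.0.0.7 GET /index.html\nlocalhost GET /admin\n' | awk '
/^[0-9.]+ / {
    match($0, /^[0-9.]+/)            # sets RSTART and RLENGTH
    print "address: " substr($0, RSTART, RLENGTH)
}')
printf '%s\n' "$out"
```

Only the first line starts with digits and periods, so the output is the single line address: 10.0.0.7.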

Boolean expressions are similar to what you would find in PHP or JavaScript. Specifically, the operators && ("and"), || ("or"), and ! ("not") are available. This is also what you'll find in pretty much all C-like languages. They'll operate on any regular data type.

What's specifically more like PHP and JavaScript is the comparison operator, ==, which will do fuzzy matching, so that the string "23" compares equal to the number 23, such that "23" == 23 is true. The operator != is also available, without forgetting the other common ones: >, <, >=, and <=.

You can also mix up the patterns: Boolean expressions can be used along with regular expressions. The pattern /admin/ || debug == 1 is valid and will match when a line contains the word 'admin', or whenever the variable debug is set to 1. (Awk has no true keyword: a bare true is just another uninitialized variable, so debug == true would also hold whenever debug is unset.)

Note that if you have a specific string or variable you'd want to match against a regex, the operators ~ and !~ are what you want, to be used as string ~ /regex/ and string !~ /regex/.
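The following sketch combines these operators on three invented input lines. The debug variable is hypothetical and is passed in from the shell with -v.

```shell
out=$(printf '23 alice\n7 admin\n42 bob\n' | awk -v debug=0 '
$1 == 23                 { print "numeric match: " $0 }
$2 ~ /^a/ && $2 !~ /min/ { print "a-name, not admin: " $2 }
/admin/ || debug == 1    { print "admin or debug: " $0 }
')
printf '%s\n' "$out"
```

With debug set to 0, the last pattern fires only for the line containing admin. The second pattern accepts alice but rejects admin, because admin also matches /min/.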

Also note that all patterns are optional. An Awk script that contains the following:

{ ACTIONS }

Would simply run ACTIONS for every line of input.

Special Patterns

There are a few special patterns in Awk, but not that many.

The first one is BEGIN, which matches only before any line of input has been read. This is basically where you can initialize variables and all other kinds of state in your script.

There is also END, which as you may have guessed, will match after the whole input has been handled. This lets you clean up or do some final output before exiting.
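A minimal sketch of BEGIN and END working together as a line counter. The count variable is just an illustrative name.

```shell
out=$(printf 'a\nb\nc\n' | awk '
BEGIN { count = 0; print "starting" }    # runs once, before any input
      { count++ }                        # runs for every line
END   { print "lines seen: " count }     # runs once, after all input
')
printf '%s\n' "$out"
```

This prints starting, followed by lines seen: 3.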

Finally, the last kind of pattern is a bit hard to classify. It's halfway between variables and special values. These are called Fields, and they deserve a section of their own.

Fields

Fields are best explained with a visual example:

    # According to the following line
    #
    # $1         $2    $3
    # 00:34:23   GET   /foo/bar.html
    # \_____________  _____________/
    #               $0

    # Hack attempt?
    /admin.html$/ && $2 == "DELETE" {
      print "Hacker Alert!";
    }

The fields are (by default) separated by white space. The field $0 represents the entire line on its own, as a string. The field $1 is then the first bit (before any white space), $2 is the one after, and so on.

A fun fact (and a thing to avoid in most cases) is that you can modify the line by assigning to its field. For example, if you go $0 = "HAHA THE LINE IS GONE" in one block, the next patterns will now operate on that line instead of the original one, and similarly for any other field variable!
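Here is a short sketch of both points, reusing the sample line from the diagram: reading fields, then assigning to one and seeing later blocks work on the modified line.

```shell
out=$(printf '00:34:23 GET /foo/bar.html\n' | awk '
{ print "time=" $1 ", verb=" $2 ", path=" $3 }
{ $2 = "POST" }     # assigning to a field rebuilds $0
{ print $0 }        # this block sees the modified line
')
printf '%s\n' "$out"
```

The second printed line reads 00:34:23 POST /foo/bar.html, confirming that the third block no longer sees the original input.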

Actions

There's a bunch of possible actions, but the most common and useful ones (in my experience) are:

    { print $0; }  # prints $0. In this case, equivalent to 'print' alone
    { exit; }      # ends the program
    { next; }      # skips to the next line of input
    { a=$1; b=$0 } # variable assignment
    { c[$1] = $2 } # variable assignment (array)

    { if (BOOLEAN) { ACTION }
      else if (BOOLEAN) { ACTION }
      else { ACTION }
    }
    { for (i=1; i<x; i++) { ACTION } }
    { for (item in c) { ACTION } }

This alone will contain a major part of your Awk toolbox for casual usage when dealing with logs and whatnot.

The variables are all global (function parameters being the one exception). Whatever variables you declare in a given block will be visible to other blocks, for each line. This severely limits how large your Awk scripts can become before they're unmaintainable horrors. Keep it minimal.
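A small sketch that puts several of these actions together, on invented input. It relies on the hits array being global: one block fills it for each line, and the END block reads it. The output is piped through sort because for (item in array) has no guaranteed order.

```shell
out=$(printf 'GET /a\n# comment\nPOST /b\nGET /c\n' | awk '
/^#/ { next }                 # skip comment lines entirely
     { hits[$1]++ }           # hits is global, shared by every block
END  { for (verb in hits) print verb, hits[verb] }
' | sort)
printf '%s\n' "$out"
```

This prints GET 2 and POST 1. The comment line never reaches the counting block because next jumps straight to the following line of input.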

Functions

Functions can be called with the following syntax:

{ somecall($2) }

There is a somewhat restricted set of built-in functions available, so I like to point to regular documentation for these.

User-defined functions are also fairly simple:

    # function arguments are call-by-value (arrays are passed by reference)
    function name(parameter-list) {
         ACTIONS; # same actions as usual
    }

    # return is a valid keyword
    function add1(val) {
         return val+1;
    }
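The sketch below calls a user-defined function next to two built-ins, length() and toupper(). The function name bump and the variables n and m are invented for the example.

```shell
out=$(printf '41\n' | awk '
function bump(val) {
    val = val + 1       # changes only the local copy
    return val
}
{
    n = $1
    m = bump(n)
    print n, m, length($0), toupper("ok")
}')
printf '%s\n' "$out"
```

This prints 41 42 2 OK. The caller's n is still 41 after the call, which is the call-by-value behavior for scalar arguments.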

Special Variables

Outside of regular variables (global, instantiated anywhere), there is a set of special variables acting a bit like configuration entries:

    BEGIN { # Can be modified by the user
      FS = ",";   # Field Separator
      RS = "\n";  # Record Separator (lines)
      OFS = " ";  # Output Field Separator
      ORS = "\n"; # Output Record Separator (lines)
    }
    { # Maintained by Awk; normally read, not assigned
      NF          # Number of Fields in the current Record (line)
      NR          # Number of Records seen so far
      ARGV / ARGC # Script Arguments
    }

I put the modifiable variables in BEGIN because that's where I tend to override them, but that can be done anywhere in the script to then take effect on follow-up lines.
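As a quick sketch, here is a comma-separated input (made up for the example) read with FS and printed with a custom OFS, along with NR and NF for each record.

```shell
out=$(printf 'alice,30,admin\nbob,25\n' | awk '
BEGIN { FS = ","; OFS = " | " }
{ print NR, NF, $1 }    # record number, field count, first field
')
printf '%s\n' "$out"
```

The output is 1 | 3 | alice followed by 2 | 2 | bob. OFS is inserted wherever print receives comma-separated arguments.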

Examples

That's it for the core of the language. I don't have a whole lot of examples here because I tend to use Awk for quick one-off tasks.

I still have a few files I carry around for some usage and metrics, my favorite one being a script used to parse Erlang crash dumps shaped like this:

    =erl_crash_dump:0.3
    Tue Nov 18 02:52:44 2014
    Slogan: init terminating in do_boot ()
    System version: Erlang/OTP 17 [erts-6.2] [source] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false]
    Compiled: Fri Sep 19 03:23:19 2014
    Taints:
    Atoms: 12167
    =memory
    total: 19012936
    processes: 4327912
    processes_used: 4319928
    system: 14685024
    atom: 339441
    atom_used: 331087
    binary: 1367680
    code: 8384804
    ets: 382552
    =hash_table:atom_tab
    size: 9643
    used: 6949
    ...
    =allocator:instr
    option m: false
    option s: false
    option t: false
    =proc:<0.0.0>
    State: Running
    Name: init
    Spawned as: otp_ring0:start/2
    Run queue: 0
    Spawned by: []
    Started: Tue Nov 18 02:52:35 2014
    Message queue length: 0
    Number of heap fragments: 0
    Heap fragment data: 0
    Link list: [<0.3.0>, <0.7.0>, <0.6.0>]
    Reductions: 29265
    Stack+heap: 1598
    OldHeap: 610
    Heap unused: 656
    OldHeap unused: 468
    Memory: 18584
    Program counter: 0x00007f42f9566200 (init:boot_loop/2 + 64)
    CP: 0x0000000000000000 (invalid)
    =proc:<0.3.0>
    State: Waiting
    ...
    =port:#Port<0.0>
    Slot: 0
    Connected: <0.3.0>
    Links: <0.3.0>
    Port controls linked-in driver: efile
    =port:#Port<0.14>
    Slot: 112
    Connected: <0.3.0>
    ...

To yield the following result:

    $ awk -f queue_fun.awk $PATH_TO_DUMP
    MESSAGE QUEUE LENGTH: CURRENT FUNCTION
    ======================================
    10641: io:wait_io_mon_reply/2
    12646: io:wait_io_mon_reply/2
    32991: io:wait_io_mon_reply/2
    2183837: io:wait_io_mon_reply/2
    730790: io:wait_io_mon_reply/2
    80194: io:wait_io_mon_reply/2
    ...

Which is a list of functions running in Erlang processes that caused mailboxes to be too large. Here's the script:
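A minimal sketch of how such a script could look, written against the dump format shown above. This is an illustration, not the original queue_fun.awk: the threshold variable, its value of 1000, the in_proc, big and len names, and the miniature dump (including its second process entry) are all assumptions made for the example.

```shell
# Hypothetical miniature dump, shaped like the excerpt above.
dump='=memory
total: 19012936
=proc:<0.0.0>
Message queue length: 0
Program counter: 0x00007f42f9566200 (init:boot_loop/2 + 64)
=proc:<0.3.0>
Message queue length: 10641
Program counter: 0x00007f42f9566238 (io:wait_io_mon_reply/2 + 56)
=port:#Port<0.0>
Slot: 0'

out=$(printf '%s\n' "$dump" | awk -v threshold=1000 '
BEGIN {
    print "MESSAGE QUEUE LENGTH: CURRENT FUNCTION"
    print "======================================"
}
# Every section starts with "="; remember whether it is a =proc: entry.
/^=/ { in_proc = ($0 ~ /^=proc:/); big = 0 }
# Message queue length: 10641   (the count is field 4)
in_proc && /^Message queue length: / && $4 >= threshold { big = 1; len = $4 }
# Program counter: 0x... (module:function/arity + offset)
# Field 4 is "(module:function/arity"; substr drops the parenthesis.
in_proc && big && /^Program counter: / { print len ": " substr($4, 2) }
')
printf '%s\n' "$out"
```

On this sample, only the second process exceeds the threshold, so the header is followed by the single line 10641: io:wait_io_mon_reply/2. State is carried between lines through the global variables, which is the same pattern-per-line model described earlier.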

Can you follow along? If so, you can understand Awk. Congratulations.


A detailed description of awk can be found at: https://www.gnu.org/software/gawk/manual/html_node/index.html#SEC_Contents

