Top (The GNU Awk User’s Guide)

The GNU Awk User’s Guide

General Introduction

This file documents awk, a program that you can use to select particular records in a file and perform operations upon them.

Copyright © 1989, 1991, 1992, 1993, 1996–2005, 2007, 2009–2020
Free Software Foundation, Inc.

This is Edition 5.1 of GAWK: Effective AWK Programming: A User’s Guide for GNU Awk, for the 5.1.0 (or later) version of the GNU implementation of AWK.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with the Invariant Sections being “GNU General Public License”, with the Front-Cover Texts being “A GNU Manual”, and with the Back-Cover Texts as in (a) below. A copy of the license is included in the section entitled “GNU Free Documentation License”.

  a. The FSF’s Back-Cover Text is: “You have the freedom to copy and modify this GNU manual.”

Foreword3    Some nice words about this Web page.
Foreword4    More nice words.
Preface    What this Web page is about; brief history and acknowledgments.
Getting Started    A basic introduction to using awk. How to run an awk program. Command-line syntax.
Invoking Gawk    How to run gawk.
Regexp    All about matching things using regular expressions.
Reading Files    How to read files and manipulate fields.
Printing    How to print using awk. Describes the print and printf statements. Also describes redirection of output.
Expressions    Expressions are the basic building blocks of statements.
Patterns and Actions    Overviews of patterns and actions.
Arrays    The description and use of arrays. Also includes array-oriented control statements.
Functions    Built-in and user-defined functions.
Library Functions    A Library of awk Functions.
Sample Programs    Many awk programs with complete explanations.
Advanced Features    Stuff for advanced users, specific to gawk.
Internationalization    Getting gawk to speak your language.
Debugger    The gawk debugger.
Namespaces    How namespaces work in gawk.
Arbitrary Precision Arithmetic    Arbitrary precision arithmetic with gawk.
Dynamic Extensions    Adding new built-in functions to gawk.
Language History    The evolution of the awk language.
Installation    Installing gawk under various operating systems.
Notes    Notes about adding things to gawk and possible future work.
Basic Concepts    A very quick introduction to programming concepts.
Glossary    An explanation of some unfamiliar terms.
Copying    Your right to copy and distribute gawk.
GNU Free Documentation License    The license for this Web page.
Index    Concept and Variable Index.
History    The history of gawk and awk.
Names    What name to use to find awk.
This Manual    Using this Web page. Includes sample input files that you can use.
Conventions    Typographical Conventions.
Manual History    Brief history of the GNU project and this Web page.
How To Contribute    Helping to save the world.
Acknowledgments    Acknowledgments.
Running gawk    How to run gawk programs; includes command-line syntax.
One-shot    Running a short throwaway awk program.
Read Terminal    Using no input files (input from the keyboard instead).
Long    Putting permanent awk programs in files.
Executable Scripts    Making self-contained awk programs.
Comments    Adding documentation to gawk programs.
Quoting    More discussion of shell quoting issues.
DOS Quoting    Quoting in Windows Batch Files.
Sample Data Files    Sample data files for use in the awk programs illustrated in this Web page.
Very Simple    A very simple example.
Two Rules    A less simple one-line example using two rules.
More Complex    A more complex example.
Statements/Lines    Subdividing or combining statements into lines.
Other Features    Other Features of awk.
When    When to use gawk and when to use other things.
Intro Summary    Summary of the introduction.
Command Line    How to run awk.
Options    Command-line options and their meanings.
Other Arguments    Input file names and variable assignments.
Naming Standard Input    How to specify standard input with other files.
Environment Variables    The environment variables gawk uses.
AWKPATH Variable    Searching directories for awk programs.
AWKLIBPATH Variable    Searching directories for awk shared libraries.
Other Environment Variables    The environment variables.
Exit Status    gawk’s exit status.
Include Files    Including other files into your program.
Loading Shared Libraries    Loading shared libraries into your program.
Obsolete    Obsolete Options and/or features.
Undocumented    Undocumented Options and Features.
Invoking Summary    Invocation summary.
Regexp Usage    How to Use Regular Expressions.
Escape Sequences    How to write nonprinting characters.
Regexp Operators    Regular Expression Operators.
Regexp Operator Details    The actual details.
Interval Expressions    Notes on interval expressions.
Bracket Expressions    What can go between ‘[...]’.
Leftmost Longest    How much text matches.
Computed Regexps    Using Dynamic Regexps.
GNU Regexp Operators    Operators specific to GNU software.
Case-sensitivity    How to do case-insensitive matching.
Regexp Summary    Regular expressions summary.
Records    Controlling how data is split into records.
awk split records    How standard awk splits records.
gawk split records    How gawk splits records.
Fields    An introduction to fields.
Nonconstant Fields    Nonconstant Field Numbers.
Changing Fields    Changing the Contents of a Field.
Field Separators    The field separator and how to change it.
Default Field Splitting    How fields are normally separated.
Regexp Field Splitting    Using regexps as the field separator.
Single Character Fields    Making each character a separate field.
Command Line Field Separator    Setting FS from the command line.
Full Line Fields    Making the full line be a single field.
Field Splitting Summary    Some final points and a summary table.
Constant Size    Reading constant width data.
Fixed width data    Processing fixed-width data.
Skipping intervening    Skipping intervening fields.
Allowing trailing data    Capturing optional trailing data.
Fields with fixed data    Field values with fixed-width data.
Splitting By Content    Defining Fields By Content
More CSV    More on CSV files.
Testing field creation    Checking how gawk is splitting records.
Multiple Line    Reading multiline records.
Getline    Reading files under explicit program control using the getline function.
Plain Getline    Using getline with no arguments.
Getline/Variable    Using getline into a variable.
Getline/File    Using getline from a file.
Getline/Variable/File    Using getline into a variable from a file.
Getline/Pipe    Using getline from a pipe.
Getline/Variable/Pipe    Using getline into a variable from a pipe.
Getline/Coprocess    Using getline from a coprocess.
Getline/Variable/Coprocess    Using getline into a variable from a coprocess.
Getline Notes    Important things to know about getline.
Getline Summary    Summary of getline Variants.
Read Timeout    Reading input with a timeout.
Retrying Input    Retrying input after certain errors.
Command-line directories    What happens if you put a directory on the command line.
Input Summary    Input summary.
Input Exercises    Exercises.
Print    The print statement.
Print Examples    Simple examples of print statements.
Output Separators    The output separators and how to change them.
OFMT    Controlling Numeric Output With print.
Printf    The printf statement.
Basic Printf    Syntax of the printf statement.
Control Letters    Format-control letters.
Format Modifiers    Format-specification modifiers.
Printf Examples    Several examples.
Redirection    How to redirect output to multiple files and pipes.
Special FD    Special files for I/O.
Special Files    File name interpretation in gawk. gawk allows access to inherited file descriptors.
Other Inherited Files    Accessing other open files with gawk.
Special Network    Special files for network communications.
Special Caveats    Things to watch out for.
Close Files And Pipes    Closing Input and Output Files and Pipes.
Nonfatal    Enabling Nonfatal Output.
Output Summary    Output summary.
Output Exercises    Exercises.
Values    Constants, Variables, and Regular Expressions.
Constants    String, numeric and regexp constants.
Scalar Constants    Numeric and string constants.
Nondecimal-numbers    What are octal and hex numbers.
Regexp Constants    Regular Expression constants.
Using Constant Regexps    When and how to use a regexp constant.
Standard Regexp Constants    Regexp constants in standard awk.
Strong Regexp Constants    Strongly typed regexp constants.
Variables    Variables give names to values for later use.
Using Variables    Using variables in your programs.
Assignment Options    Setting variables on the command line and a summary of command-line syntax. This is an advanced method of input.
Conversion    The conversion of strings to numbers and vice versa.
Strings And Numbers    How awk Converts Between Strings And Numbers.
Locale influences conversions    How the locale may affect conversions.
All Operators    gawk’s operators.
Arithmetic Ops    Arithmetic operations (‘+’, ‘-’, etc.)
Concatenation    Concatenating strings.
Assignment Ops    Changing the value of a variable or a field.
Increment Ops    Incrementing the numeric value of a variable.
Truth Values and Conditions    Testing for true and false.
Truth Values    What is “true” and what is “false”.
Typing and Comparison    How variables acquire types and how this affects comparison of numbers and strings with ‘<’, etc.
Variable Typing    String type versus numeric type.
Comparison Operators    The comparison operators.
POSIX String Comparison    String comparison with POSIX rules.
Boolean Ops    Combining comparison expressions using boolean operators ‘||’ (“or”), ‘&&’ (“and”) and ‘!’ (“not”).
Conditional Exp    Conditional expressions select between two subexpressions under control of a third subexpression.
Function Calls    A function call is an expression.
Precedence    How various operators nest.
Locales    How the locale affects things.
Expressions Summary    Expressions summary.
Pattern Overview    What goes into a pattern.
Regexp Patterns    Using regexps as patterns.
Expression Patterns    Any expression can be used as a pattern.
Ranges    Pairs of patterns specify record ranges.
BEGIN/END    Specifying initialization and cleanup rules.
Using BEGIN/END    How and why to use BEGIN/END rules.
I/O And BEGIN/END    I/O issues in BEGIN/END rules.
BEGINFILE/ENDFILE    Two special patterns for advanced control.
Empty    The empty pattern, which matches every record.
Using Shell Variables    How to use shell variables with awk.
Action Overview    What goes into an action.
Statements    Describes the various control statements in detail.
If Statement    Conditionally execute some awk statements.
While Statement    Loop until some condition is satisfied.
Do Statement    Do specified action while looping until some condition is satisfied.
For Statement    Another looping statement, that provides initialization and increment clauses.
Switch Statement    Switch/case evaluation for conditional execution of statements based on a value.
Break Statement    Immediately exit the innermost enclosing loop.
Continue Statement    Skip to the end of the innermost enclosing loop.
Next Statement    Stop processing the current input record.
Nextfile Statement    Stop processing the current file.
Exit Statement    Stop execution of awk.
Built-in Variables    Summarizes the predefined variables.
User-modified    Built-in variables that you change to control awk.
Auto-set    Built-in variables where awk gives you information.
ARGC and ARGV    Ways to use ARGC and ARGV.
Pattern Action Summary    Patterns and Actions summary.
Array Basics    The basics of arrays.
Array Intro    Introduction to Arrays
Reference to Elements    How to examine one element of an array.
Assigning Elements    How to change an element of an array.
Array Example    Basic Example of an Array
Scanning an Array    A variation of the for statement. It loops through the indices of an array’s existing elements.
Controlling Scanning    Controlling the order in which arrays are scanned.
Numeric Array Subscripts    How to use numbers as subscripts in awk.
Uninitialized Subscripts    Using Uninitialized variables as subscripts.
Delete    The delete statement removes an element from an array.
Multidimensional    Emulating multidimensional arrays in awk.
Multiscanning    Scanning multidimensional arrays.
Arrays of Arrays    True multidimensional arrays.
Arrays Summary    Summary of arrays.
Built-in    Summarizes the built-in functions.
Calling Built-in    How to call built-in functions.
Numeric Functions    Functions that work with numbers, including int(), sin() and rand().
String Functions    Functions for string manipulation, such as split(), match() and sprintf().
Gory Details    More than you want to know about ‘\’ and ‘&’ with sub(), gsub(), and gensub().
I/O Functions    Functions for files and shell commands.
Time Functions    Functions for dealing with timestamps.
Bitwise Functions    Functions for bitwise operations.
Type Functions    Functions for type information.
I18N Functions    Functions for string translation.
User-defined    Describes User-defined functions in detail.
Definition Syntax    How to write definitions and what they mean.
Function Example    An example function definition and what it does.
Function Calling    Calling user-defined functions.
Calling A Function    Don’t use spaces.
Variable Scope    Controlling variable scope.
Pass By Value/Reference    Passing parameters.
Function Caveats    Other points to know about functions.
Return Statement    Specifying the value a function returns.
Dynamic Typing    How variable types can change at runtime.
Indirect Calls    Choosing the function to call at runtime.
Functions Summary    Summary of functions.
Library Names    How to best name private global variables in library functions.
General Functions    Functions that are of general use.
Strtonum Function    A replacement for the built-in strtonum() function.
Assert Function    A function for assertions in awk programs.
Round Function    A function for rounding if sprintf() does not do it correctly.
Cliff Random Function    The Cliff Random Number Generator.
Ordinal Functions    Functions for using characters as numbers and vice versa.
Join Function    A function to join an array into a string.
Getlocaltime Function    A function to get formatted times.
Readfile Function    A function to read an entire file at once.
Shell Quoting    A function to quote strings for the shell.
Data File Management    Functions for managing command-line data files.
Filetrans Function    A function for handling data file transitions.
Rewind Function    A function for rereading the current file.
File Checking    Checking that data files are readable.
Empty Files    Checking for zero-length files.
Ignoring Assigns    Treating assignments as file names.
Getopt Function    A function for processing command-line arguments.
Passwd Functions    Functions for getting user information.
Group Functions    Functions for getting group information.
Walking Arrays    A function to walk arrays of arrays.
Library Functions Summary    Summary of library functions.
Library Exercises    Exercises.
Running Examples    How to run these examples.
Clones    Clones of common utilities.
Cut Program    The cut utility.
Egrep Program    The egrep utility.
Id Program    The id utility.
Split Program    The split utility.
Tee Program    The tee utility.
Uniq Program    The uniq utility.
Wc Program    The wc utility.
Miscellaneous Programs    Some interesting awk programs.
Dupword Program    Finding duplicated words in a document.
Alarm Program    An alarm clock.
Translate Program    A program similar to the tr utility.
Labels Program    Printing mailing labels.
Word Sorting    A program to produce a word usage count.
History Sorting    Eliminating duplicate entries from a history file.
Extract Program    Pulling out programs from Texinfo source files.
Simple Sed    A Simple Stream Editor.
Igawk Program    A wrapper for awk that includes files.
Anagram Program    Finding anagrams from a dictionary.
Signature Program    People do amazing things with too much time on their hands.
Programs Summary    Summary of programs.
Programs Exercises    Exercises.
Nondecimal Data    Allowing nondecimal input data.
Array Sorting    Facilities for controlling array traversal and sorting arrays.
Controlling Array Traversal    How to use PROCINFO["sorted_in"].
Array Sorting Functions    How to use asort() and asorti().
Two-way I/O    Two-way communications with another process.
TCP/IP Networking    Using gawk for network programming.
Profiling    Profiling your awk programs.
Advanced Features Summary    Summary of advanced features.
I18N and L10N    Internationalization and Localization.
Explaining gettext    How GNU gettext works.
Programmer i18n    Features for the programmer.
Translator i18n    Features for the translator.
String Extraction    Extracting marked strings.
Printf Ordering    Rearranging printf arguments.
I18N Portability    awk-level portability issues.
I18N Example    A simple i18n example.
Gawk I18N    gawk is also internationalized.
I18N Summary    Summary of I18N stuff.
Debugging    Introduction to gawk debugger.
Debugging Concepts    Debugging in General.
Debugging Terms    Additional Debugging Concepts.
Awk Debugging    Awk Debugging.
Sample Debugging Session    Sample debugging session.
Debugger Invocation    How to Start the Debugger.
Finding The Bug    Finding the Bug.
List of Debugger Commands    Main debugger commands.
Breakpoint Control    Control of Breakpoints.
Debugger Execution Control    Control of Execution.
Viewing And Changing Data    Viewing and Changing Data.
Execution Stack    Dealing with the Stack.
Debugger Info    Obtaining Information about the Program and the Debugger State.
Miscellaneous Debugger Commands    Miscellaneous Commands.
Readline Support    Readline support.
Limitations    Limitations and future plans.
Debugging Summary    Debugging summary.
Global Namespace    The global namespace in standard awk.
Qualified Names    How to qualify names with a namespace.
Default Namespace    The default namespace.
Changing The Namespace    How to change the namespace.
Naming Rules    Namespace and Component Naming Rules.
Internal Name Management    How names are stored internally.
Namespace Example    An example of code using a namespace.
Namespace And Features    Namespaces and other gawk features.
Namespace Summary    Summarizing namespaces.
Computer Arithmetic    A quick intro to computer math.
Math Definitions    Defining terms used.
MPFR features    The MPFR features in gawk.
FP Math Caution    Things to know.
Inexactness of computations    Floating point math is not exact.
Inexact representation    Numbers are not exactly represented.
Comparing FP Values    How to compare floating point values.
Errors accumulate    Errors get bigger as they go.
Getting Accuracy    Getting more accuracy takes some work.
Try To Round    Add digits and round.
Setting precision    How to set the precision.
Setting the rounding mode    How to set the rounding mode.
Arbitrary Precision Integers    Arbitrary Precision Integer Arithmetic with gawk.
Checking for MPFR    How to check if MPFR is available.
POSIX Floating Point Problems    Standards Versus Existing Practice.
Floating point summary    Summary of floating point discussion.
Extension Intro    What is an extension.
Plugin License    A note about licensing.
Extension Mechanism Outline    An outline of how it works.
Extension API Description    A full description of the API.
Extension API Functions Introduction    Introduction to the API functions.
General Data Types    The data types.
Memory Allocation Functions    Functions for allocating memory.
Constructor Functions    Functions for creating values.
Registration Functions    Functions to register things with gawk.
Extension Functions    Registering extension functions.
Exit Callback Functions    Registering an exit callback.
Extension Version String    Registering a version string.
Input Parsers    Registering an input parser.
Output Wrappers    Registering an output wrapper.
Two-way processors    Registering a two-way processor.
Printing Messages    Functions for printing messages.
Updating ERRNO    Functions for updating ERRNO.
Requesting Values    How to get a value.
Accessing Parameters    Functions for accessing parameters.
Symbol Table Access    Functions for accessing global variables.
Symbol table by name    Accessing variables by name.
Symbol table by cookie    Accessing variables by “cookie”.
Cached values    Creating and using cached values.
Array Manipulation    Functions for working with arrays.
Array Data Types    Data types for working with arrays.
Array Functions    Functions for working with arrays.
Flattening Arrays    How to flatten arrays.
Creating Arrays    How to create and populate arrays.
Redirection API    How to access and manipulate redirections.
Extension API Variables    Variables provided by the API.
Extension Versioning    API Version information.
Extension GMP/MPFR Versioning    Version information about GMP and MPFR.
Extension API Informational Variables    Variables providing information about gawk’s invocation.
Extension API Boilerplate    Boilerplate code for using the API.
Changes from API V1    Changes from V1 of the API.
Finding Extensions    How gawk finds compiled extensions.
Extension Example    Example C code for an extension.
Internal File Description    What the new functions will do.
Internal File Ops    The code for internal file operations.
Using Internal File Ops    How to use an external extension.
Extension Samples    The sample extensions that ship with gawk.
Extension Sample File Functions    The file functions sample.
Extension Sample Fnmatch    An interface to fnmatch().
Extension Sample Fork    An interface to fork() and other process functions.
Extension Sample Inplace    Enabling in-place file editing.
Extension Sample Ord    Character to value to character conversions.
Extension Sample Readdir    An interface to readdir().
Extension Sample Revout    Reversing output sample output wrapper.
Extension Sample Rev2way    Reversing data sample two-way processor.
Extension Sample Read write array    Serializing an array to a file.
Extension Sample Readfile    Reading an entire file into a string.
Extension Sample Time    An interface to gettimeofday() and sleep().
Extension Sample API Tests    Tests for the API.
gawkextlib    The gawkextlib project.
Extension summary    Extension summary.
Extension Exercises    Exercises.
V7/SVR3.1    The major changes between V7 and System V Release 3.1.
SVR4    Minor changes between System V Releases 3.1 and 4.
POSIX    New features from the POSIX standard.
BTL    New features from Brian Kernighan’s version of awk.
POSIX/GNU    The extensions in gawk not in POSIX awk.
Feature History    The history of the features in gawk.
Common Extensions    Common Extensions Summary.
Ranges and Locales    How locales used to affect regexp ranges.
Contributors    The major contributors to gawk.
History summary    History summary.
Gawk Distribution    What is in the gawk distribution.
Getting    How to get the distribution.
Extracting    How to extract the distribution.
Distribution contents    What is in the distribution.
Unix Installation    Installing gawk under various versions of Unix.
Quick Installation    Compiling gawk under Unix.
Shell Startup Files    Shell convenience functions.
Additional Configuration Options    Other compile-time options.
Configuration Philosophy    How it’s all supposed to work.
Non-Unix Installation    Installation on Other Operating Systems.
PC Installation    Installing and Compiling gawk on Microsoft Windows.
PC Binary Installation    Installing a prepared distribution.
PC Compiling    Compiling gawk for Windows32.
PC Using    Running gawk on Windows32.
Cygwin    Building and running gawk for Cygwin.
MSYS    Using gawk In The MSYS Environment.
VMS Installation    Installing gawk on VMS.
VMS Compilation    How to compile gawk under VMS.
VMS Dynamic Extensions    Compiling gawk dynamic extensions on VMS.
VMS Installation Details    How to install gawk under VMS.
VMS Running    How to run gawk under VMS.
VMS GNV    The VMS GNV Project.
VMS Old Gawk    An old version comes with some VMS systems.
Bugs    Reporting Problems and Bugs.
Bug address    Where to send reports to.
Usenet    Where not to send reports to.
Maintainers    Maintainers of non-*nix ports.
Other Versions    Other freely available awk implementations.
Installation summary    Summary of installation.
Compatibility Mode    How to disable certain gawk extensions.
Additions    Making Additions To gawk.
Accessing The Source    Accessing the Git repository.
Adding Code    Adding code to the main body of gawk.
New Ports    Porting gawk to a new operating system.
Derived Files    Why derived files are kept in the Git repository.
Future Extensions    New features that may be implemented one day.
Implementation Limitations    Some limitations of the implementation.
Extension Design    Design notes about the extension API.
Old Extension Problems    Problems with the old mechanism.
Extension New Mechanism Goals    Goals for the new mechanism.
Extension Other Design Decisions    Some other design decisions.
Extension Future Growth    Some room for future growth.
Notes summary    Summary of implementation notes.
Basic High Level    The high level view.
Basic Data Typing    A very quick intro to data types.
