The GNU Awk User’s Guide

General Introduction

This file documents awk, a program that you can use to select particular records in a file and perform operations upon them.

Copyright © 1989, 1991, 1992, 1993, 1996–2005, 2007, 2009–2020 Free Software Foundation, Inc.

This is Edition 5.1 of GAWK: Effective AWK Programming: A User’s Guide for GNU Awk, for the 5.1.0 (or later) version of the GNU implementation of AWK.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with the Invariant Sections being “GNU General Public License”, with the Front-Cover Texts being “A GNU Manual”, and with the Back-Cover Texts as in (a) below. A copy of the license is included in the section entitled “GNU Free Documentation License”.

  a. The FSF’s Back-Cover Text is: “You have the freedom to copy and modify this GNU manual.”



Some nice words about this Web page.



More nice words.



What this Web page is about; brief history and acknowledgments.

Getting Started


A basic introduction to using awk. How to run an awk program. Command-line syntax.

Invoking Gawk


How to run gawk.



All about matching things using regular expressions.

Reading Files


How to read files and manipulate fields.



How to print using awk. Describes the print and printf statements. Also describes redirection of output.



Expressions are the basic building blocks of statements.

Patterns and Actions


Overviews of patterns and actions.



The description and use of arrays. Also includes array-oriented control statements.



Built-in and user-defined functions.

Library Functions


A Library of awk Functions.

Sample Programs


Many awk programs with complete explanations.

Advanced Features


Stuff for advanced users, specific to gawk.



Getting gawk to speak your language.



The gawk debugger.



How namespaces work in gawk.

Arbitrary Precision Arithmetic


Arbitrary precision arithmetic with gawk.

Dynamic Extensions


Adding new built-in functions to gawk.

Language History


The evolution of the awk language.



Installing gawk under various operating systems.



Notes about adding things to gawk and possible future work.

Basic Concepts


A very quick introduction to programming concepts.



An explanation of some unfamiliar terms.



Your right to copy and distribute gawk.

GNU Free Documentation License


The license for this Web page.



Concept and Variable Index.



The history of gawk and awk.



What name to use to find awk.

This Manual


Using this Web page. Includes sample input files that you can use.



Typographical Conventions.

Manual History


Brief history of the GNU project and this Web page.

How To Contribute


Helping to save the world.




Running gawk


How to run gawk programs; includes command-line syntax.



Running a short throwaway awk program.

Read Terminal


Using no input files (input from the keyboard instead).



Putting permanent awk programs in files.

Executable Scripts


Making self-contained awk programs.



Adding documentation to gawk programs.



More discussion of shell quoting issues.

DOS Quoting


Quoting in Windows Batch Files.

Sample Data Files


Sample data files for use in the awk programs illustrated in this Web page.

Very Simple


A very simple example.

Two Rules


A less simple one-line example using two rules.

More Complex


A more complex example.



Subdividing or combining statements into lines.

Other Features


Other Features of awk.



When to use gawk and when to use other things.

Intro Summary


Summary of the introduction.

Command Line


How to run awk.



Command-line options and their meanings.

Other Arguments


Input file names and variable assignments.

Naming Standard Input


How to specify standard input with other files.

Environment Variables


The environment variables gawk uses.

AWKPATH Variable


Searching directories for awk programs.



Searching directories for awk shared libraries.

Other Environment Variables


Other environment variables that affect gawk.

Exit Status


gawk’s exit status.

Include Files


Including other files into your program.

Loading Shared Libraries


Loading shared libraries into your program.



Obsolete Options and/or features.



Undocumented Options and Features.

Invoking Summary


Invocation summary.

Regexp Usage


How to Use Regular Expressions.

Escape Sequences


How to write nonprinting characters.

Regexp Operators


Regular Expression Operators.

Regexp Operator Details


The actual details.

Interval Expressions


Notes on interval expressions.

Bracket Expressions


What can go between ‘[...]’.

Leftmost Longest


How much text matches.

Computed Regexps


Using Dynamic Regexps.

GNU Regexp Operators


Operators specific to GNU software.



How to do case-insensitive matching.

Regexp Summary


Regular expressions summary.



Controlling how data is split into records.

awk split records


How standard awk splits records.

gawk split records


How gawk splits records.



An introduction to fields.

Nonconstant Fields


Nonconstant Field Numbers.

Changing Fields


Changing the Contents of a Field.

Field Separators


The field separator and how to change it.

Default Field Splitting


How fields are normally separated.

Regexp Field Splitting


Using regexps as the field separator.

Single Character Fields


Making each character a separate field.

Command Line Field Separator


Setting FS from the command line.

Full Line Fields


Making the full line be a single field.

Field Splitting Summary


Some final points and a summary table.

Constant Size


Reading constant width data.

Fixed width data


Processing fixed-width data.

Skipping intervening


Skipping intervening fields.

Allowing trailing data


Capturing optional trailing data.

Fields with fixed data


Field values with fixed-width data.

Splitting By Content


Defining Fields By Content.

More CSV


More on CSV files.

Testing field creation


Checking how gawk is splitting records.

Multiple Line


Reading multiline records.



Reading files under explicit program control using the getline function.

Plain Getline


Using getline with no arguments.



Using getline into a variable.



Using getline from a file.



Using getline into a variable from a file.



Using getline from a pipe.



Using getline into a variable from a pipe.



Using getline from a coprocess.



Using getline into a variable from a coprocess.

Getline Notes


Important things to know about getline.

Getline Summary


Summary of getline Variants.

Read Timeout


Reading input with a timeout.

Retrying Input


Retrying input after certain errors.

Command-line directories


What happens if you put a directory on the command line.

Input Summary


Input summary.

Input Exercises





The print statement.

Print Examples


Simple examples of print statements.

Output Separators


The output separators and how to change them.



Controlling Numeric Output With print.



The printf statement.

Basic Printf


Syntax of the printf statement.

Control Letters


Format-control letters.

Format Modifiers


Format-specification modifiers.

Printf Examples


Several examples.



How to redirect output to multiple files and pipes.

Special FD


Special files for I/O.

Special Files


File name interpretation in gawk. gawk allows access to inherited file descriptors.

Other Inherited Files


Accessing other open files with gawk.

Special Network


Special files for network communications.

Special Caveats


Things to watch out for.

Close Files And Pipes


Closing Input and Output Files and Pipes.



Enabling Nonfatal Output.

Output Summary


Output summary.

Output Exercises





Constants, Variables, and Regular Expressions.



String, numeric and regexp constants.

Scalar Constants


Numeric and string constants.



What are octal and hex numbers.

Regexp Constants


Regular Expression constants.

Using Constant Regexps


When and how to use a regexp constant.

Standard Regexp Constants


Regexp constants in standard awk.

Strong Regexp Constants


Strongly typed regexp constants.



Variables give names to values for later use.

Using Variables


Using variables in your programs.

Assignment Options


Setting variables on the command line and a summary of command-line syntax. This is an advanced method of input.



The conversion of strings to numbers and vice versa.

Strings And Numbers


How awk Converts Between Strings And Numbers.

Locale influences conversions


How the locale may affect conversions.

All Operators


gawk’s operators.

Arithmetic Ops


Arithmetic operations (‘+’, ‘-’, etc.).



Concatenating strings.

Assignment Ops


Changing the value of a variable or a field.

Increment Ops


Incrementing the numeric value of a variable.

Truth Values and Conditions


Testing for true and false.

Truth Values


What is “true” and what is “false”.

Typing and Comparison


How variables acquire types and how this affects comparison of numbers and strings with ‘<’, etc.

Variable Typing


String type versus numeric type.

Comparison Operators


The comparison operators.

POSIX String Comparison


String comparison with POSIX rules.

Boolean Ops


Combining comparison expressions using boolean operators ‘||’ (“or”), ‘&&’ (“and”) and ‘!’ (“not”).

Conditional Exp


Conditional expressions select between two subexpressions under control of a third subexpression.

Function Calls


A function call is an expression.



How various operators nest.



How the locale affects things.

Expressions Summary


Expressions summary.

Pattern Overview


What goes into a pattern.

Regexp Patterns


Using regexps as patterns.

Expression Patterns


Any expression can be used as a pattern.



Pairs of patterns specify record ranges.



Specifying initialization and cleanup rules.



How and why to use BEGIN/END rules.



I/O issues in BEGIN/END rules.



Two special patterns for advanced control.



The empty pattern, which matches every record.

Using Shell Variables


How to use shell variables with awk.

Action Overview


What goes into an action.



Describes the various control statements in detail.

If Statement


Conditionally execute some awk statements.

While Statement


Loop until some condition is satisfied.

Do Statement


Do specified action while looping until some condition is satisfied.

For Statement


Another looping statement, that provides initialization and increment clauses.

Switch Statement


Switch/case evaluation for conditional execution of statements based on a value.

Break Statement


Immediately exit the innermost enclosing loop.

Continue Statement


Skip to the end of the innermost enclosing loop.

Next Statement


Stop processing the current input record.

Nextfile Statement


Stop processing the current file.

Exit Statement


Stop execution of awk.

Built-in Variables


Summarizes the predefined variables.



Built-in variables that you change to control awk.



Built-in variables where awk gives you information.



Ways to use ARGC and ARGV.

Pattern Action Summary


Patterns and Actions summary.

Array Basics


The basics of arrays.

Array Intro


Introduction to Arrays.

Reference to Elements


How to examine one element of an array.

Assigning Elements


How to change an element of an array.

Array Example


Basic Example of an Array.

Scanning an Array


A variation of the for statement. It loops through the indices of an array’s existing elements.

Controlling Scanning


Controlling the order in which arrays are scanned.

Numeric Array Subscripts


How to use numbers as subscripts in awk.

Uninitialized Subscripts


Using Uninitialized variables as subscripts.



The delete statement removes an element from an array.



Emulating multidimensional arrays in awk.



Scanning multidimensional arrays.

Arrays of Arrays


True multidimensional arrays.

Arrays Summary


Summary of arrays.



Summarizes the built-in functions.

Calling Built-in


How to call built-in functions.

Numeric Functions


Functions that work with numbers, including int(), sin() and rand().

String Functions


Functions for string manipulation, such as split(), match() and sprintf().

Gory Details


More than you want to know about ‘\’ and ‘&’ with sub(), gsub(), and gensub().

I/O Functions


Functions for files and shell commands.

Time Functions


Functions for dealing with timestamps.

Bitwise Functions


Functions for bitwise operations.

Type Functions


Functions for type information.

I18N Functions


Functions for string translation.



Describes User-defined functions in detail.

Definition Syntax


How to write definitions and what they mean.

Function Example


An example function definition and what it does.

Function Calling


Calling user-defined functions.

Calling A Function


Don’t use spaces.

Variable Scope


Controlling variable scope.

Pass By Value/Reference


Passing parameters.

Function Caveats


Other points to know about functions.

Return Statement


Specifying the value a function returns.

Dynamic Typing


How variable types can change at runtime.

Indirect Calls


Choosing the function to call at runtime.

Functions Summary


Summary of functions.

Library Names


How to best name private global variables in library functions.

General Functions


Functions that are of general use.

Strtonum Function


A replacement for the built-in strtonum() function.

Assert Function


A function for assertions in awk programs.

Round Function


A function for rounding if sprintf() does not do it correctly.

Cliff Random Function


The Cliff Random Number Generator.

Ordinal Functions


Functions for using characters as numbers and vice versa.

Join Function


A function to join an array into a string.

Getlocaltime Function


A function to get formatted times.

Readfile Function


A function to read an entire file at once.

Shell Quoting


A function to quote strings for the shell.

Data File Management


Functions for managing command-line data files.

Filetrans Function


A function for handling data file transitions.

Rewind Function


A function for rereading the current file.

File Checking


Checking that data files are readable.

Empty Files


Checking for zero-length files.

Ignoring Assigns


Treating assignments as file names.

Getopt Function


A function for processing command-line arguments.

Passwd Functions


Functions for getting user information.

Group Functions


Functions for getting group information.

Walking Arrays


A function to walk arrays of arrays.

Library Functions Summary


Summary of library functions.

Library Exercises



Running Examples


How to run these examples.



Clones of common utilities.

Cut Program


The cut utility.

Egrep Program


The egrep utility.

Id Program


The id utility.

Split Program


The split utility.

Tee Program


The tee utility.

Uniq Program


The uniq utility.

Wc Program


The wc utility.

Miscellaneous Programs


Some interesting awk programs.

Dupword Program


Finding duplicated words in a document.

Alarm Program


An alarm clock.

Translate Program


A program similar to the tr utility.

Labels Program


Printing mailing labels.

Word Sorting


A program to produce a word usage count.

History Sorting


Eliminating duplicate entries from a history file.

Extract Program


Pulling out programs from Texinfo source files.

Simple Sed


A Simple Stream Editor.

Igawk Program


A wrapper for awk that includes files.

Anagram Program


Finding anagrams from a dictionary.

Signature Program


People do amazing things with too much time on their hands.

Programs Summary


Summary of programs.

Programs Exercises



Nondecimal Data


Allowing nondecimal input data.

Array Sorting


Facilities for controlling array traversal and sorting arrays.

Controlling Array Traversal


How to use PROCINFO["sorted_in"].

Array Sorting Functions


How to use asort() and asorti().

Two-way I/O


Two-way communications with another process.

TCP/IP Networking


Using gawk for network programming.



Profiling your awk programs.

Advanced Features Summary


Summary of advanced features.

I18N and L10N


Internationalization and Localization.

Explaining gettext


How GNU gettext works.

Programmer i18n


Features for the programmer.

Translator i18n


Features for the translator.

String Extraction


Extracting marked strings.

Printf Ordering


Rearranging printf arguments.

I18N Portability


awk-level portability issues.

I18N Example


A simple i18n example.

Gawk I18N


gawk is also internationalized.

I18N Summary


Summary of I18N stuff.



Introduction to gawk debugger.

Debugging Concepts


Debugging in General.

Debugging Terms


Additional Debugging Concepts.

Awk Debugging


Awk Debugging.

Sample Debugging Session


Sample debugging session.

Debugger Invocation


How to Start the Debugger.

Finding The Bug


Finding the Bug.

List of Debugger Commands


Main debugger commands.

Breakpoint Control


Control of Breakpoints.

Debugger Execution Control


Control of Execution.

Viewing And Changing Data


Viewing and Changing Data.

Execution Stack


Dealing with the Stack.

Debugger Info


Obtaining Information about the Program and the Debugger State.

Miscellaneous Debugger Commands


Miscellaneous Commands.

Readline Support


Readline support.



Limitations and future plans.

Debugging Summary


Debugging summary.

Global Namespace


The global namespace in standard awk.

Qualified Names


How to qualify names with a namespace.

Default Namespace


The default namespace.

Changing The Namespace


How to change the namespace.

Naming Rules


Namespace and Component Naming Rules.

Internal Name Management


How names are stored internally.

Namespace Example


An example of code using a namespace.

Namespace And Features


Namespaces and other gawk features.

Namespace Summary


Summarizing namespaces.

Computer Arithmetic


A quick intro to computer math.

Math Definitions


Defining terms used.

MPFR features


The MPFR features in gawk.

FP Math Caution


Things to know.

Inexactness of computations


Floating point math is not exact.

Inexact representation


Numbers are not exactly represented.

Comparing FP Values


How to compare floating point values.

Errors accumulate


Errors get bigger as they go.

Getting Accuracy


Getting more accuracy takes some work.

Try To Round


Add digits and round.

Setting precision


How to set the precision.

Setting the rounding mode


How to set the rounding mode.

Arbitrary Precision Integers


Arbitrary Precision Integer Arithmetic with gawk.

Checking for MPFR


How to check if MPFR is available.

POSIX Floating Point Problems


Standards Versus Existing Practice.

Floating point summary


Summary of floating point discussion.

Extension Intro


What is an extension.

Plugin License


A note about licensing.

Extension Mechanism Outline


An outline of how it works.

Extension API Description


A full description of the API.

Extension API Functions Introduction


Introduction to the API functions.

General Data Types


The data types.

Memory Allocation Functions


Functions for allocating memory.

Constructor Functions


Functions for creating values.

Registration Functions


Functions to register things with gawk.

Extension Functions


Registering extension functions.

Exit Callback Functions


Registering an exit callback.

Extension Version String


Registering a version string.

Input Parsers


Registering an input parser.

Output Wrappers


Registering an output wrapper.

Two-way processors


Registering a two-way processor.

Printing Messages


Functions for printing messages.

Updating ERRNO


Functions for updating ERRNO.

Requesting Values


How to get a value.

Accessing Parameters


Functions for accessing parameters.

Symbol Table Access


Functions for accessing global variables.

Symbol table by name


Accessing variables by name.

Symbol table by cookie


Accessing variables by “cookie”.

Cached values


Creating and using cached values.

Array Manipulation


Functions for working with arrays.

Array Data Types


Data types for working with arrays.

Array Functions


Functions for working with arrays.

Flattening Arrays


How to flatten arrays.

Creating Arrays


How to create and populate arrays.

Redirection API


How to access and manipulate redirections.

Extension API Variables


Variables provided by the API.

Extension Versioning


API Version information.

Extension GMP/MPFR Versioning


Version information about GMP and MPFR.

Extension API Informational Variables


Variables providing information about gawk’s invocation.

Extension API Boilerplate


Boilerplate code for using the API.

Changes from API V1


Changes from V1 of the API.

Finding Extensions


How gawk finds compiled extensions.

Extension Example


Example C code for an extension.

Internal File Description


What the new functions will do.

Internal File Ops


The code for internal file operations.

Using Internal File Ops


How to use an external extension.

Extension Samples


The sample extensions that ship with gawk.

Extension Sample File Functions


The file functions sample.

Extension Sample Fnmatch


An interface to fnmatch().

Extension Sample Fork


An interface to fork() and other process functions.

Extension Sample Inplace


Enabling in-place file editing.

Extension Sample Ord


Character to value to character conversions.

Extension Sample Readdir


An interface to readdir().

Extension Sample Revout


A sample output wrapper that reverses output.

Extension Sample Rev2way


A sample two-way processor that reverses data.

Extension Sample Read write array


Serializing an array to a file.

Extension Sample Readfile


Reading an entire file into a string.

Extension Sample Time


An interface to gettimeofday() and sleep().

Extension Sample API Tests


Tests for the API.



The gawkextlib project.

Extension summary


Extension summary.

Extension Exercises





The major changes between V7 and System V Release 3.1.



Minor changes between System V Releases 3.1 and 4.



New features from the POSIX standard.



New features from Brian Kernighan’s version of awk.



The extensions in gawk not in POSIX awk.

Feature History


The history of the features in gawk.

Common Extensions


Common Extensions Summary.

Ranges and Locales


How locales used to affect regexp ranges.



The major contributors to gawk.

History summary


History summary.

Gawk Distribution


What is in the gawk distribution.



How to get the distribution.



How to extract the distribution.

Distribution contents


What is in the distribution.

Unix Installation


Installing gawk under various versions of Unix.

Quick Installation


Compiling gawk under Unix.

Shell Startup Files


Shell convenience functions.

Additional Configuration Options


Other compile-time options.

Configuration Philosophy


How it’s all supposed to work.

Non-Unix Installation


Installation on Other Operating Systems.

PC Installation


Installing and Compiling gawk on Microsoft Windows.

PC Binary Installation


Installing a prepared distribution.

PC Compiling


Compiling gawk for Windows32.

PC Using


Running gawk on Windows32.



Building and running gawk for Cygwin.



Using gawk In The MSYS Environment.

VMS Installation


Installing gawk on VMS.

VMS Compilation


How to compile gawk under VMS.

VMS Dynamic Extensions


Compiling gawk dynamic extensions on VMS.

VMS Installation Details


How to install gawk under VMS.

VMS Running


How to run gawk under VMS.



The VMS GNV Project.

VMS Old Gawk


An old version comes with some VMS systems.



Reporting Problems and Bugs.

Bug address


Where to send reports to.



Where not to send reports to.



Maintainers of non-*nix ports.

Other Versions


Other freely available awk implementations.

Installation summary


Summary of installation.

Compatibility Mode


How to disable certain gawk extensions.



Making Additions To gawk.

Accessing The Source


Accessing the Git repository.

Adding Code


Adding code to the main body of gawk.

New Ports


Porting gawk to a new operating system.

Derived Files


Why derived files are kept in the Git repository.

Future Extensions


New features that may be implemented one day.

Implementation Limitations


Some limitations of the implementation.

Extension Design


Design notes about the extension API.

Old Extension Problems


Problems with the old mechanism.

Extension New Mechanism Goals


Goals for the new mechanism.

Extension Other Design Decisions


Some other design decisions.

Extension Future Growth


Some room for future growth.

Notes summary


Summary of implementation notes.

Basic High Level


The high level view.

Basic Data Typing


A very quick intro to data types.

