A.6 History of gawk Features

This section describes the features in gawk over and above those in POSIX awk, in the order they were added to gawk.

Version 2.10 of gawk introduced the following features:

Version 2.13 of gawk introduced the following features:

  • The FIELDWIDTHS variable and its effects (see section Reading Fixed-Width Data).
  • The systime() and strftime() built-in functions for obtaining and printing timestamps (see section Time Functions).
  • Additional command-line options (see section Command-Line Options):
    • - The -W lint option to provide error and portability checking, both of the source code and at runtime.
    • - The -W compat option to turn off the GNU extensions.
    • - The -W posix option for full POSIX compliance.

Version 2.14 of gawk introduced the following feature:

Version 2.15 of gawk introduced the following features:

  • New variables (see section Predefined Variables):
    • - ARGIND, which tracks the movement of FILENAME through ARGV.
    • - ERRNO, which contains the system error message when getline returns -1 or close() fails.
  • The /dev/pid, /dev/ppid, /dev/pgrpid, and /dev/user special file names. These have since been removed.
  • The ability to delete all of an array at once with ‘delete array’ (see section The delete Statement).
  • Command-line option changes (see section Command-Line Options):
    • - The ability to use GNU-style long-named options that start with --.
    • - The --source option for mixing command-line and library-file source code.

Version 3.0 of gawk introduced the following features:

Version 3.1 of gawk introduced the following features:

Version 4.0 of gawk introduced the following features:

  • Variable additions:
    • - FPAT, which allows you to specify a regexp that matches the fields, instead of matching the field separator (see section Defining Fields by Content).
    • - If PROCINFO["sorted_in"] exists, ‘for(iggy in foo)’ loops sort the indices before looping over them. The value of this element provides control over how the indices are sorted before the loop traversal starts (see section Using Predefined Array Scanning Orders with gawk).
    • - PROCINFO["strftime"], which holds the default format for strftime() (see section Time Functions).
  • The special files /dev/pid, /dev/ppid, /dev/pgrpid and /dev/user were removed.
  • Support for IPv6 was added via the /inet6 special file. /inet4 forces IPv4 and /inet chooses the system default, which is probably IPv4 (see section Using gawk for Network Programming).
  • The use of ‘\s’ and ‘\S’ escape sequences in regular expressions (see section gawk-Specific Regexp Operators).
  • Interval expressions became part of default regular expressions (see section Regular Expression Operators).
  • POSIX character classes work even with --traditional (see section Regular Expression Operators).
  • break and continue became invalid outside a loop, even with --traditional (see section The break Statement, and also see The continue Statement).
  • fflush(), nextfile, and ‘delete array’ are allowed with --posix or --traditional, since they are all now part of POSIX.
  • An optional third argument to asort() and asorti(), specifying how to sort (see section String-Manipulation Functions).
  • The behavior of fflush() changed to match BWK awk and POSIX; now both ‘fflush()’ and ‘fflush("")’ flush all open output redirections (see section Input/Output Functions).
  • The isarray() function, which reports whether an item is an array, making it possible to traverse arrays of arrays (see section Getting Type Information).
  • The patsplit() function, which gives the same capability as FPAT for splitting strings (see section String-Manipulation Functions).
  • An optional fourth argument to the split() function, which is an array to hold the values of the separators (see section String-Manipulation Functions).
  • Arrays of arrays (see section Arrays of Arrays).
  • The BEGINFILE and ENDFILE special patterns (see section The BEGINFILE and ENDFILE Special Patterns).
  • Indirect function calls (see section Indirect Function Calls).
  • switch / case are enabled by default (see section The switch Statement).
  • Command-line option changes (see section Command-Line Options):
    • - The -b and --characters-as-bytes options which prevent gawk from treating input as a multibyte string.
    • - The redundant --compat, --copyleft, and --usage long options were removed.
    • - The --gen-po option was finally renamed to the correct --gen-pot.
    • - The --sandbox option which disables certain features.
    • - All long options acquired corresponding short options, for use in ‘#!’ scripts.
  • Directories named on the command line now produce a warning, not a fatal error, unless --posix or --traditional is used (see section Directories on the Command Line).
  • The gawk internals were rewritten, bringing the dgawk debugger and possibly improved performance (see section Debugging awk Programs).
  • Per the GNU Coding Standards, dynamic extensions must now define a global symbol indicating that they are GPL-compatible (see section Extension Licensing).
  • In POSIX mode, string comparisons use strcoll() / wcscoll() (see section String Comparison Based on Locale Collating Order).
  • The option for raw sockets was removed, since it was never implemented (see section Using gawk for Network Programming).
  • Ranges of the form ‘[d-h]’ are treated as if they were in the C locale, no matter what kind of regexp is being used, and even if --posix (see section Regexp Ranges and Locales: A Long Sad Story).
  • Support was removed for the following systems:
    • - Atari
    • - Amiga
    • - BeOS
    • - Cray
    • - MIPS RiscOS
    • - MS-DOS with the Microsoft Compiler
    • - MS-Windows with the Microsoft Compiler
    • - NeXT
    • - SunOS 3.x, Sun 386 (Road Runner)
    • - Tandem (non-POSIX)
    • - Prestandard VAX C compiler for VAX/VMS

Version 4.1 of gawk introduced the following features:

  • Three new arrays: SYMTAB, FUNCTAB, and PROCINFO["identifiers"] (see section Built-in Variables That Convey Information).
  • The three executables gawk, pgawk, and dgawk were merged into one, named just gawk. As a result, the command-line options changed.
  • Command-line option changes (see section Command-Line Options):
    • - The -D option invokes the debugger.
    • - The -i and --include options load awk library files.
    • - The -l and --load options load compiled dynamic extensions.
    • - The -M and --bignum options enable MPFR.
    • - The -o option only does pretty-printing.
    • - The -p option is used for profiling.
    • - The -R option was removed.
  • Support for high precision arithmetic with MPFR (see section Arithmetic and Arbitrary-Precision Arithmetic with gawk).
  • The and(), or() and xor() functions changed to allow any number of arguments, with a minimum of two (see section Bit-Manipulation Functions).
  • The dynamic extension interface was completely redone (see section Writing Extensions for gawk).
  • Redirected getline became allowed inside BEGINFILE and ENDFILE (see section The BEGINFILE and ENDFILE Special Patterns).
  • The where command was added to the debugger (see section Working with the Stack).
  • Support for Ultrix was removed.

Version 4.2 of gawk introduced the following changes:

  • Changes to ENVIRON are reflected into gawk’s environment and that of programs that it runs. See section Built-in Variables That Convey Information.
  • FIELDWIDTHS was enhanced to allow skipping characters before assigning a value to a field (see section Reading Fixed-Width Data).
  • The PROCINFO["argv"] array. See section Built-in Variables That Convey Information.
  • The maximum number of hexadecimal digits in ‘\x’ escapes is now two. See section Escape Sequences.
  • Strongly typed regexp constants of the form ‘@/…/’ (see section Strongly Typed Regexp Constants).
  • The bitwise functions changed, making negative arguments into a fatal error (see section Bit-Manipulation Functions).
  • The mktime() function now accepts an optional second argument (see section Time Functions).
  • The typeof() function (see section Getting Type Information).
  • Optimizations are enabled by default. Use -s / --no-optimize to disable optimizations.
  • For many years, POSIX specified that default field splitting only allowed spaces and tabs to separate fields, and this was how gawk behaved with --posix. As of 2013, the standard restored historical behavior, and now default field splitting with --posix also allows newlines to separate fields.
  • Nonfatal output with print and printf. See section Enabling Nonfatal Output.
  • Retryable I/O via PROCINFO[input-file, "RETRY"] (see section Retrying Reads After Certain Input Errors).
  • Changes to the pretty-printer (see section Profiling Your awk Programs):
    • - The --pretty-print option no longer runs the awk program too.
    • - Comments in the source program are preserved and placed into the output file.
    • - Explicit parentheses for expressions in the input are preserved in the generated output.
  • Improvements to the extension API (see section Writing Extensions for gawk):
    • - The get_file() function to access open redirections.
    • - The nonfatal() function for generating nonfatal error messages.
    • - Support for GMP and MPFR values.
    • - Input parsers can now override the default field parsing mechanism by specifying explicit locations.
  • Shell startup files are supplied with the distribution and installed by ‘make install’ (see section Shell Startup Files).
  • The igawk program and its manual page are no longer installed when gawk is built. See section An Easy Way to Use Library Functions.
  • Support for MirBSD was removed.
  • Support for GNU/Linux on Alpha was removed.

Version 5.0 added the following features:

  • The PROCINFO["platform"] array element, which allows you to write code that takes the operating system / platform into account.

Version 5.1 was created to release gawk with a correct major version number for the API. This was overlooked for version 5.0, unfortunately. It added the following features:

  • The index for this manual was completely reworked.
  • Support was added for MSYS2.