17.9 Summary

  • You can write extensions (sometimes called plug-ins) for gawk in C or C++ using the application programming interface (API) defined by the gawk developers.
  • Extensions must have a license compatible with the GNU General Public License (GPL), and they must assert that fact by declaring a variable named plugin_is_GPL_compatible.
  • Communication between gawk and an extension is two-way. gawk passes a struct to the extension that contains various data fields and function pointers. The extension can then call into gawk via the supplied function pointers to accomplish certain tasks.
  • One of these tasks is to “register” the name and implementation of new awk-level functions with gawk. The implementation takes the form of a C function pointer with a defined signature. By convention, implementation functions are named do_XXXX() for some awk-level function XXXX().
  • The API is defined in a header file named gawkapi.h. You must include a number of standard header files before including it in your source file.
  • API function pointers are provided for the following kinds of operations:
    • Allocating, reallocating, and releasing memory
    • Registering extension functions, exit callbacks, a version string, input parsers, output wrappers, and two-way processors
    • Printing fatal, nonfatal, warning, and “lint” warning messages
    • Updating ERRNO, or unsetting it
    • Accessing parameters, including converting an undefined parameter into an array
    • Accessing the symbol table (retrieving a global variable, creating one, or changing one)
    • Creating and releasing cached values; this provides an efficient way to reuse one value for multiple variables and can improve performance noticeably
    • Manipulating arrays (retrieving, adding, deleting, and modifying elements; getting the count of elements in an array; creating a new array; clearing an array; and flattening an array for easy C-style looping over all its indices and elements)
  • The API defines a number of standard data types for representing awk values, array elements, and arrays.
  • The API provides convenience functions for constructing values. It also provides memory management functions to ensure compatibility between memory allocated by gawk and memory allocated by an extension.
  • All memory passed from gawk to an extension must be treated as read-only by the extension.
  • All memory passed from an extension to gawk must come from the API’s memory allocation functions. gawk takes responsibility for the memory and releases it when appropriate.
  • The API provides information about the running version of gawk so that an extension can make sure it is compatible with the gawk that loaded it.
  • It is easiest to start a new extension by copying the boilerplate code described in this chapter. Macros in the gawkapi.h header file make this easier to do.
  • The gawk distribution includes a number of small but useful sample extensions. The gawkextlib project includes several more (larger) extensions. If you wish to write an extension and contribute it to the community of gawk users, the gawkextlib project is the place to do so.
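
The two-way communication, function registration, and version check summarized above can be illustrated with a small self-contained sketch. This is not the real API: every name in it (toy_api_t, toy_add_function(), toy_lookup(), toy_ext_load(), and the awk-level function name "twice") is hypothetical and invented for illustration, and the real types and function pointers are those defined in gawkapi.h. The sketch only assumes the general shape described in this chapter: the host hands the extension a struct of data fields and function pointers, and the extension calls back through it to register a do_XXXX()-style implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* HYPOTHETICAL stand-ins; these are NOT the types from gawkapi.h. */

/* Signature that every toy implementation function must have. */
typedef double (*toy_func_t)(double arg);

/* The struct the host passes to the extension: data fields plus
   function pointers the extension can use to call back into the host. */
typedef struct toy_api {
    int major_version;
    int minor_version;
    /* Register an awk-level name with its C implementation.
       Returns 1 on success, 0 on failure. */
    int (*add_function)(const char *name, toy_func_t impl);
} toy_api_t;

#define TOY_API_MAJOR 1   /* version this toy extension was built for */
#define TOY_MAX_FUNCS 8   /* capacity of the toy host's function table */

/* ---- host side ---- */

static struct {
    const char *name;
    toy_func_t impl;
} toy_table[TOY_MAX_FUNCS];
static size_t toy_count = 0;

/* Host service exposed to extensions through toy_api_t.add_function. */
int toy_add_function(const char *name, toy_func_t impl)
{
    if (name == NULL || impl == NULL || toy_count == TOY_MAX_FUNCS)
        return 0;
    toy_table[toy_count].name = name;
    toy_table[toy_count].impl = impl;
    toy_count++;
    return 1;
}

/* Host lookup: find the implementation registered under a name,
   or NULL if no extension registered that name. */
toy_func_t toy_lookup(const char *name)
{
    size_t i;

    for (i = 0; i < toy_count; i++)
        if (strcmp(toy_table[i].name, name) == 0)
            return toy_table[i].impl;
    return NULL;
}

/* ---- extension side ---- */

/* Implementation of the awk-level function twice(), named do_twice()
   by the do_XXXX() convention. */
double do_twice(double arg)
{
    return 2.0 * arg;
}

/* Entry point the host calls when loading the extension. It checks
   the host's version, then registers its function via the callback. */
int toy_ext_load(const toy_api_t *api)
{
    assert(api != NULL && api->add_function != NULL);
    if (api->major_version != TOY_API_MAJOR)
        return 0;   /* refuse to load into an incompatible host */
    return api->add_function("twice", do_twice);
}
```

In this sketch, a host would fill in a toy_api_t with its version numbers and a pointer to toy_add_function(), pass it to toy_ext_load(), and afterwards find the implementation with toy_lookup("twice"). A real extension does not write this plumbing by hand; the boilerplate code and the macros in gawkapi.h take care of the equivalent steps.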