Gawk/Extension-New-Mechanism-Goals
C.5.2 Goals For A New Mechanism
Some goals for the new API were:
- The API should be independent of gawk internals. Changes in gawk internals should not be visible to the writer of an extension function.
- The API should provide binary compatibility across gawk releases as long as the API itself does not change.
- The API should enable extensions written in C or C++ to have roughly the same “appearance” to awk-level code as awk functions do. This means that extensions should have:
- - The ability to access function parameters.
- - The ability to turn an undefined parameter into an array (call by reference).
- - The ability to create, access and update global variables.
- - Easy access to all the elements of an array at once (“array flattening”) in order to loop over all the elements in an easy fashion for C code.
- - The ability to create arrays (including gawk’s true arrays of arrays).
Some additional important goals were:
- The API should use only features in ISO C 90, so that extensions can be written using the widest range of C and C++ compilers. The header should include the appropriate ‘#ifdef __cplusplus’ and ‘extern "C"’ magic so that a C++ compiler could be used. (If using C++, the runtime system has to be smart enough to call any constructors and destructors, as gawk is a C program. As of this writing, this has not been tested.)
- The API mechanism should not require access to gawk’s symbols (122) by the compile-time or dynamic linker, in order to enable creation of extensions that also work on MS-Windows.
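The ‘#ifdef __cplusplus’ and ‘extern "C"’ arrangement mentioned above usually takes the following shape. This is a minimal sketch written in ISO C 90; the header name `my_ext_api.h`, the guard macro, and the function `my_ext_add` are hypothetical and are not part of the gawk API.

```c
/* my_ext_api.h: hypothetical header name, for illustration only. */
#ifndef MY_EXT_API_H
#define MY_EXT_API_H

#ifdef __cplusplus
extern "C" {            /* give the declarations below C linkage in C++ */
#endif

/* Hypothetical declaration; not part of the real gawk API. */
int my_ext_add(int a, int b);

#ifdef __cplusplus
}                       /* end of extern "C" block */
#endif

#endif /* MY_EXT_API_H */

/* Definition, which would normally live in a separate .c file. */
int my_ext_add(int a, int b)
{
    return a + b;
}
```

A C compiler never sees the `extern "C"` lines, while a C++ compiler sees them and does not mangle the declared names, so the same header can be shared by both.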
During development, it became clear that there were other features that should be available to extensions, which were also subsequently provided:
- Extensions should have the ability to hook into gawk’s I/O redirection mechanism. In particular, the xgawk developers provided a so-called “open hook” to take over reading records. During development, this was generalized to allow extensions to hook into input processing, output processing, and two-way I/O.
- An extension should be able to provide a “call back” function to perform cleanup actions when gawk exits.
- An extension should be able to provide a version string so that gawk’s --version option can provide information about extensions as well.
The requirement to avoid access to gawk’s symbols is, at first glance, a difficult one to meet.
One design, apparently used by Perl and Ruby and maybe others, would be to make the mainline gawk code into a library, with the gawk utility a small C main() function linked against the library.
This seemed like the tail wagging the dog, complicating build and installation and making a simple copy of the gawk executable from one system to another (or one place to another on the same system!) into a chancy operation.
Pat Rankin suggested the solution that was adopted. See section How It Works at a High Level, for the details.
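As a generic illustration of how a program can offer services to dynamically loaded code without exposing its symbols to the linker, the host can hand the extension a table of function pointers when it loads it. The sketch below shows only that general technique; the names `host_api_t`, `host_get_api`, `ext_load`, and `EXT_ANSWER` are hypothetical, and both sides are placed in one file purely so that the example is self-contained.

```c
#include <string.h>

#define HOST_API_VERSION 1

/* Hypothetical, simplified table of services offered by the host. */
typedef struct host_api {
    int version;
    int (*set_number)(const char *name, double value);
    int (*get_number)(const char *name, double *value);
} host_api_t;

/* ---- Host side: a one-slot stand-in for a symbol table. ---- */
static char stored_name[32];
static double stored_value;

static int host_set_number(const char *name, double value)
{
    if (strlen(name) >= sizeof(stored_name))
        return 0;               /* name too long for the slot */
    strcpy(stored_name, name);
    stored_value = value;
    return 1;
}

static int host_get_number(const char *name, double *value)
{
    if (strcmp(name, stored_name) != 0)
        return 0;               /* no such variable */
    *value = stored_value;
    return 1;
}

/* The host builds the table once and passes its address to extensions. */
const host_api_t *host_get_api(void)
{
    static const host_api_t api = {
        HOST_API_VERSION, host_set_number, host_get_number
    };
    return &api;
}

/* ---- Extension side: uses only the pointers it was given. ---- */
int ext_load(const host_api_t *api)
{
    if (api == NULL || api->version != HOST_API_VERSION)
        return 0;               /* refuse an incompatible host */
    return api->set_number("EXT_ANSWER", 42.0);
}
```

Because `ext_load` reaches the host only through the pointers in the table, an extension built this way contains no unresolved references to host functions, and the version field gives it a way to reject a host whose table layout it does not understand.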
Footnotes
(122) The symbols are the variables and functions defined inside gawk. Access to these symbols by code external to gawk loaded dynamically at runtime is problematic on MS-Windows.