Gawk/Extension-Functions

17.4.5.1 Registering An Extension Function

Extension functions are described by the following record:

typedef struct awk_ext_func {
    const char *name;
    awk_value_t *(*const function)(int num_actual_args,
                                   awk_value_t *result,
                                   struct awk_ext_func *finfo);
    const size_t max_expected_args;
    const size_t min_required_args;
    awk_bool_t suppress_lint;
    void *data;        /* opaque pointer to any extra state */
} awk_ext_func_t;

The fields are:

const char *name;

The name of the new function. awk-level code calls the function by this name. This is a regular C string.

Function names must obey the rules for awk identifiers. That is, they must begin with either an English letter or an underscore, which may be followed by any number of letters, digits, and underscores. Letter case in function names is significant.
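The naming rule above can be sketched as a small check. This is only an illustration: is_valid_awk_name() and is_english_letter() are hypothetical helpers, not part of the gawk API, and the sketch assumes that "English letter" means the ASCII ranges A-Z and a-z.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true for the ASCII letters A-Z and a-z only. */
static bool is_english_letter(char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

/* Hypothetical helper (not a gawk API call): true if name starts with a
 * letter or underscore and continues with letters, digits, or underscores. */
static bool is_valid_awk_name(const char *name)
{
    if (name == NULL)
        return false;

    /* First character: letter or underscore. This also rejects "". */
    if (!is_english_letter(name[0]) && name[0] != '_')
        return false;

    /* Remaining characters: letters, digits, or underscores. */
    for (size_t i = 1; name[i] != '\0'; i++) {
        char c = name[i];
        if (!is_english_letter(c) && !(c >= '0' && c <= '9') && c != '_')
            return false;
    }
    return true;
}
```

Under these assumptions, "my_func" and "_x1" pass the check, while "1abc", "a-b", and the empty string do not.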

awk_value_t *(*const function)(int num_actual_args,
                              awk_value_t *result,
                              struct awk_ext_func *finfo);

This is a pointer to the C function that provides the extension’s functionality. The function must fill in *result with either a number, a string, or a regexp. gawk takes ownership of any string memory. As mentioned earlier, string memory must come from one of gawk_malloc(), gawk_calloc(), or gawk_realloc().

The num_actual_args argument tells the C function how many actual parameters were passed from the calling awk code.

The finfo parameter is a pointer to the awk_ext_func_t for this function. The called function may read or modify the data in it as needed, or ignore it entirely.

The function must return the value of result. This is for the convenience of the calling code inside gawk.
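A minimal sketch of the calling convention described above might look like the following. To keep it compilable without gawk's headers, it declares simplified stand-ins for awk_bool_t and awk_value_t; the single "number" member is a placeholder, not the real layout, and a real extension would include gawkapi.h and fill in the result with that header's helpers (for example make_number()). The function name do_argcount is hypothetical.

```c
#include <stddef.h>

/* Simplified stand-ins (assumption): real code includes gawkapi.h instead. */
typedef enum awk_bool { awk_false = 0, awk_true } awk_bool_t;
typedef struct awk_value { double number; } awk_value_t;  /* placeholder */

typedef struct awk_ext_func {
    const char *name;
    awk_value_t *(*const function)(int num_actual_args,
                                   awk_value_t *result,
                                   struct awk_ext_func *finfo);
    const size_t max_expected_args;
    const size_t min_required_args;
    awk_bool_t suppress_lint;
    void *data;
} awk_ext_func_t;

/* Hypothetical extension function: reports how many arguments it received. */
static awk_value_t *
do_argcount(int num_actual_args, awk_value_t *result,
            struct awk_ext_func *finfo)
{
    (void) finfo;                               /* unused in this sketch */
    result->number = (double) num_actual_args;  /* fill in *result */
    return result;                              /* return the same pointer */
}
```

The point to notice is that the function both writes through result and returns that same pointer.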

const size_t max_expected_args;

This is the maximum number of arguments the function expects to receive. If called with more arguments than this, and if lint checking has been enabled, then gawk prints a warning message. For more information, see the entry for suppress_lint, later in this list.

const size_t min_required_args;

This is the minimum number of arguments the function expects to receive. If called with fewer arguments, gawk prints a fatal error message and exits.

awk_bool_t suppress_lint;

This flag tells gawk not to print a lint message if lint checking has been enabled and if more arguments were supplied in the call than expected. An extension function can tell if gawk already printed at least one such message by checking if ‘num_actual_args > finfo->max_expected_args’. If so, and the function does not want more lint messages to be printed, it should set finfo->suppress_lint to awk_true.

void *data;

This is an opaque pointer to any data that an extension function may wish to have available when called. Because the awk_ext_func_t structure is passed to the extension function and this pointer is available in it, a single C or C++ function can implement multiple awk-level extension functions.
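As an illustration of the data pointer, the sketch below backs two awk-level functions with one C function that returns whatever number data points to. The type stand-ins are simplified placeholders so the fragment compiles without gawkapi.h; the names do_constant, pi_value, e_value, and func_table are hypothetical.

```c
#include <stddef.h>

/* Simplified stand-ins (assumption): real code includes gawkapi.h instead. */
typedef enum awk_bool { awk_false = 0, awk_true } awk_bool_t;
typedef struct awk_value { double number; } awk_value_t;  /* placeholder */

typedef struct awk_ext_func {
    const char *name;
    awk_value_t *(*const function)(int num_actual_args,
                                   awk_value_t *result,
                                   struct awk_ext_func *finfo);
    const size_t max_expected_args;
    const size_t min_required_args;
    awk_bool_t suppress_lint;
    void *data;
} awk_ext_func_t;

/* One C function serving several awk-level names: it returns the double
 * that the record's data member points to. */
static awk_value_t *
do_constant(int num_actual_args, awk_value_t *result,
            struct awk_ext_func *finfo)
{
    (void) num_actual_args;
    result->number = *(const double *) finfo->data;
    return result;
}

static double pi_value = 3.141592653589793;
static double e_value  = 2.718281828459045;

/* Two records share do_constant and differ only in name and data.
 * Field order: name, function, max, min, suppress_lint, data. */
static awk_ext_func_t func_table[] = {
    { "pi", do_constant, 0, 0, awk_false, &pi_value },
    { "e",  do_constant, 0, 0, awk_false, &e_value  },
};
```

Which behavior a call gets is decided entirely by the record gawk passes in as finfo, so adding a third constant needs only a new table entry.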

Once you have a record representing your extension function, you register it with gawk using this API function:

awk_bool_t add_ext_func(const char *name_space, awk_ext_func_t *func);

This function returns true upon success, false otherwise. The name_space parameter is the namespace in which to place the function (see section Namespaces in gawk). Use an empty string ("") or "awk" to place the function in the default awk namespace. The func pointer is the address of a struct representing your function, as just described.

gawk does not modify what func points to, but the extension function itself receives this pointer and can modify what it points to. For that reason, the parameter is purposely not declared to be const.
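Registration of a table of functions can be sketched as a loop over add_ext_func(). In the fragment below, add_ext_func() is a stub with the signature shown above so that the example compiles without gawk; the real one comes from gawkapi.h. The types are simplified stand-ins, and do_nothing, func_table, register_all, and registered_count are hypothetical names.

```c
#include <stddef.h>

/* Simplified stand-ins (assumption): real code includes gawkapi.h instead. */
typedef enum awk_bool { awk_false = 0, awk_true } awk_bool_t;
typedef struct awk_value { double number; } awk_value_t;  /* placeholder */

typedef struct awk_ext_func {
    const char *name;
    awk_value_t *(*const function)(int num_actual_args,
                                   awk_value_t *result,
                                   struct awk_ext_func *finfo);
    const size_t max_expected_args;
    const size_t min_required_args;
    awk_bool_t suppress_lint;
    void *data;
} awk_ext_func_t;

/* Stub standing in for the real API function; it only counts calls. */
static size_t registered_count = 0;

static awk_bool_t
add_ext_func(const char *name_space, awk_ext_func_t *func)
{
    if (name_space == NULL || func == NULL || func->name == NULL)
        return awk_false;
    registered_count++;
    return awk_true;
}

static awk_value_t *
do_nothing(int num_actual_args, awk_value_t *result,
           struct awk_ext_func *finfo)
{
    (void) num_actual_args;
    (void) finfo;
    result->number = 0.0;
    return result;
}

static awk_ext_func_t func_table[] = {
    { "noop", do_nothing, 0, 0, awk_false, NULL },
};

/* Register every record in the table; returns the number of failures. */
static size_t
register_all(const char *name_space, awk_ext_func_t *table, size_t n)
{
    size_t failures = 0;

    for (size_t i = 0; i < n; i++)
        if (add_ext_func(name_space, &table[i]) != awk_true)
            failures++;
    return failures;
}
```

Passing "" as name_space places the functions in the default awk namespace, as described above. Note that the address of each record is passed, not a copy, since the extension function later receives that same pointer as finfo.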

The combination of min_required_args, max_expected_args, and suppress_lint may be confusing. Here is how you should set things up.

Any number of arguments is valid

Set min_required_args and max_expected_args to zero and set suppress_lint to awk_true.

A minimum number of arguments is required, no limit on maximum number of arguments

Set min_required_args to the minimum required. Set max_expected_args to zero and set suppress_lint to awk_true.

A minimum number of arguments is required, a maximum number is expected

Set min_required_args to the minimum required. Set max_expected_args to the maximum expected. Set suppress_lint to awk_false.

A minimum number of arguments is required, and no more than a maximum is allowed

Set min_required_args to the minimum required. Set max_expected_args to the maximum expected. Set suppress_lint to awk_false. In your extension function, check that num_actual_args does not exceed finfo->max_expected_args. If it does, issue a fatal error message.
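The four setups can be written as initializers, as in the sketch below. It uses simplified stand-in types so that it compiles without gawkapi.h, and the limits 2 and 4 are arbitrary example values. The names do_stub, do_strict, too_many_args, and examples are hypothetical. Where real code would issue a fatal error, the sketch only stores -1 in the result so that the check can be exercised in isolation.

```c
#include <stddef.h>

/* Simplified stand-ins (assumption): real code includes gawkapi.h instead. */
typedef enum awk_bool { awk_false = 0, awk_true } awk_bool_t;
typedef struct awk_value { double number; } awk_value_t;  /* placeholder */

typedef struct awk_ext_func {
    const char *name;
    awk_value_t *(*const function)(int num_actual_args,
                                   awk_value_t *result,
                                   struct awk_ext_func *finfo);
    const size_t max_expected_args;
    const size_t min_required_args;
    awk_bool_t suppress_lint;
    void *data;
} awk_ext_func_t;

/* True when the call supplied more arguments than max_expected_args. */
static awk_bool_t
too_many_args(int num_actual_args, const struct awk_ext_func *finfo)
{
    return (num_actual_args > 0
            && (size_t) num_actual_args > finfo->max_expected_args)
           ? awk_true : awk_false;
}

/* Placeholder body: returns the argument count. */
static awk_value_t *
do_stub(int num_actual_args, awk_value_t *result, struct awk_ext_func *finfo)
{
    (void) finfo;
    result->number = (double) num_actual_args;
    return result;
}

/* Enforces the maximum itself, as the last case requires. */
static awk_value_t *
do_strict(int num_actual_args, awk_value_t *result, struct awk_ext_func *finfo)
{
    if (too_many_args(num_actual_args, finfo)) {
        result->number = -1.0;  /* real code: issue a fatal error here */
        return result;
    }
    result->number = (double) num_actual_args;
    return result;
}

/* Field order: name, function, max_expected_args, min_required_args,
 * suppress_lint, data. Note that the maximum comes before the minimum. */
static awk_ext_func_t examples[] = {
    { "any_args",     do_stub,   0, 0, awk_true,  NULL }, /* any number valid   */
    { "at_least_two", do_stub,   0, 2, awk_true,  NULL }, /* minimum, no limit  */
    { "two_to_four",  do_stub,   4, 2, awk_false, NULL }, /* maximum expected   */
    { "strict_2_4",   do_strict, 4, 2, awk_false, NULL }, /* maximum enforced   */
};
```

The last two records are identical apart from the function they point to; only the check inside the function turns an expected maximum into an enforced one.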