Creating Arrays (The GNU Awk User’s Guide)


17.4.11.4 How To Create and Populate Arrays

Besides working with arrays created by awk code, you can create arrays from extension code and populate them as you see fit; awk code can then access and manipulate them.

There are two important points about creating arrays from extension code:

  • You must install a new array into gawk’s symbol table immediately upon creating it. Once you have done so, you can then populate the array. Similarly, if installing a new array as a subarray of an existing array, you must add the new array to its parent before adding any elements to it. Thus, the correct way to build an array is to work “top down.” Create the array, and immediately install it in gawk’s symbol table using sym_update(), or install it as an element in a previously existing array using set_array_element(). We show example code shortly.
  • Due to gawk internals, after using sym_update() to install an array into gawk, you have to retrieve the array cookie from the value passed in to sym_update() before doing anything else with it, like so:

    awk_value_t val;
    awk_array_t new_array;

    new_array = create_array();
    val.val_type = AWK_ARRAY;
    val.array_cookie = new_array;

    /* install array in the symbol table */
    sym_update("array", & val);

    new_array = val.array_cookie;    /* YOU MUST DO THIS */

    If installing an array as a subarray, you must also retrieve the value of the array cookie after the call to set_array_element().
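The reason the cookie must be re-read is internal to gawk and is not spelled out here. The following is only a toy model of one way such behavior can arise, assuming the symbol table copies the array into storage that it owns and passes the new handle back through the value. Every name in it (toy_array, toy_value, toy_symtab, toy_create_array(), toy_sym_update(), toy_install_new_array()) is hypothetical and is not part of the gawk API:

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for the API types; all names here are hypothetical. */
typedef struct toy_array {
    const char *name;    /* NULL until installed in the symbol table */
    int nelems;          /* number of elements stored so far */
} toy_array;

typedef struct toy_value {
    toy_array *array_cookie;
} toy_value;

#define TOY_MAX_SYMS 8
static toy_array toy_symtab[TOY_MAX_SYMS];    /* the table owns these slots */
static int toy_nsyms = 0;

/* Allocate a fresh, anonymous array. */
static toy_array *toy_create_array(void)
{
    return calloc(1, sizeof(toy_array));
}

/*
 * Install the array under `name'.  The table copies the array into a
 * slot that it owns, frees the temporary, and hands the new handle
 * back through the value.  Returns 1 on success, 0 on failure.
 */
static int toy_sym_update(const char *name, toy_value *val)
{
    toy_array *slot;

    assert(name != NULL);
    if (val == NULL || val->array_cookie == NULL
        || toy_nsyms >= TOY_MAX_SYMS)
        return 0;

    slot = & toy_symtab[toy_nsyms++];
    *slot = *val->array_cookie;
    slot->name = name;
    free(val->array_cookie);     /* the caller's old handle now dangles */
    val->array_cookie = slot;    /* the caller must pick this one up */
    return 1;
}

/* Create, install, and return the handle that is safe to keep using. */
static toy_array *toy_install_new_array(const char *name)
{
    toy_value val;

    val.array_cookie = toy_create_array();
    if (val.array_cookie == NULL)
        return NULL;
    if (! toy_sym_update(name, & val)) {
        free(val.array_cookie);  /* not installed, so still ours to free */
        return NULL;
    }
    return val.array_cookie;     /* re-read the cookie after installing */
}
```

In this model, a caller that kept using the pointer returned by toy_create_array() after installation would be touching freed memory. Whatever gawk actually does internally, the rule for extension code is the one stated above: after sym_update() or set_array_element() returns, continue with the array_cookie field of the value you passed in.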

The following C code is a simple test extension to create an array with two regular elements and with a subarray. The leading #include directives and boilerplate variable declarations (see section Boilerplate Code) are omitted for brevity. The first step is to create a new array and then install it in the symbol table:

/* create_new_array --- create a named array */

static void
create_new_array(void)
{
    awk_array_t a_cookie;
    awk_array_t subarray;
    awk_value_t index, value;

    a_cookie = create_array();
    value.val_type = AWK_ARRAY;
    value.array_cookie = a_cookie;

    if (! sym_update("new_array", & value)) {
        printf("create_new_array: sym_update(\"new_array\") failed!\n");
        return;    /* do not populate an array that was never installed */
    }
    a_cookie = value.array_cookie;

Note how a_cookie is reset from the array_cookie field in the value structure.

The second step is to install two regular values into new_array:

    (void) make_const_string("hello", 5, & index);
    (void) make_const_string("world", 5, & value);
    if (! set_array_element(a_cookie, & index, & value)) {
        printf("create_new_array: set_array_element failed\n");
        return;
    }

    (void) make_const_string("answer", 6, & index);
    (void) make_number(42.0, & value);
    if (! set_array_element(a_cookie, & index, & value)) {
        printf("create_new_array: set_array_element failed\n");
        return;
    }

The third step is to create the subarray and install it:

    (void) make_const_string("subarray", 8, & index);
    subarray = create_array();
    value.val_type = AWK_ARRAY;
    value.array_cookie = subarray;
    if (! set_array_element(a_cookie, & index, & value)) {
        printf("create_new_array: set_array_element failed\n");
        return;
    }
    subarray = value.array_cookie;

The final step is to populate the subarray with its own element:

    (void) make_const_string("foo", 3, & index);
    (void) make_const_string("bar", 3, & value);
    if (! set_array_element(subarray, & index, & value)) {
        printf("create_new_array: set_array_element failed\n");
        return;
    }
}

Here is a sample script that loads the extension and then dumps the array:

@load "subarray"

function dumparray(name, array,     i)
{
    for (i in array)
        if (isarray(array[i]))
            dumparray(name "[\"" i "\"]", array[i])
        else
            printf("%s[\"%s\"] = %s\n", name, i, array[i])
}

BEGIN {
    dumparray("new_array", new_array);
}

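Here is the result of running the script. The order in which a for-in loop visits array elements is not specified, so the lines may appear in a different order: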

$ AWKLIBPATH=$PWD gawk -f subarray.awk
-| new_array["subarray"]["foo"] = bar
-| new_array["hello"] = world
-| new_array["answer"] = 42

(See section How gawk Finds Extensions for more information on the AWKLIBPATH environment variable.)