8.1.2 Referring to an Array Element

The principal way to use an array is to refer to one of its elements. An array reference is an expression as follows:

array[index-expression]

Here, array is the name of an array. The expression index-expression is the index of the desired element of the array.

The value of the array reference is the current value of that array element. For example, foo[4.3] is an expression referencing the element of array foo at index ‘4.3’.
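As a minimal sketch of the reference syntax, the following assigns to two elements and then reads them back. The shell function name array_ref_demo and the array squares are made up for illustration; only foo[4.3] comes from the text above.

```shell
# Hypothetical wrapper so the awk program can be run from a shell.
array_ref_demo() {
    awk 'BEGIN {
        foo[4.3] = "four point three"   # store a value at index 4.3
        i = 2
        squares[i + 1] = 9              # the index may be any expression
        print foo[4.3]                  # the reference yields the stored value
        print squares[3]                # same element as squares[i + 1]
    }'
}
array_ref_demo
```

The first line printed is the string stored in foo[4.3]; the second is the 9 stored through the computed index.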

A reference to an array element that has no recorded value yields a value of "", the null string. This includes elements that have not been assigned any value as well as elements that have been deleted (see section The delete Statement).

NOTE: A reference to an element that does not exist automatically creates that array element, with the null string as its value. (In some cases, this is unfortunate, because it might waste memory inside awk.)
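The side effect described in the note can be observed by counting the elements before and after a bare reference. This is a small sketch, not part of the original manual text; the function name ref_side_effect_demo and the variables n, k, and v are assumptions chosen for illustration.

```shell
# Hypothetical wrapper so the awk program can be run from a shell.
ref_side_effect_demo() {
    awk 'BEGIN {
        n = 0; for (k in a) n++        # count elements: the array starts empty
        print "before: " n
        v = a["foo"]                   # merely reading the element creates it
        print "value is empty: " (v == "")
        n = 0; for (k in a) n++        # count again after the reference
        print "after: " n
    }'
}
ref_side_effect_demo
```

The count goes from 0 to 1 even though nothing was ever assigned to a["foo"].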

Novice awk programmers often make the mistake of checking if an element exists by checking if the value is empty:

# Check if "foo" exists in a:         Incorrect!
if (a["foo"] != "") …

This is incorrect for two reasons. First, it creates a["foo"] if it didn’t exist before! Second, it is valid (if a bit unusual) to set an array element equal to the empty string.
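The second reason can be illustrated with a short sketch: an element that was deliberately set to the empty string is reported as absent by the value test, although a loop over the array shows that its index is stored. The function name empty_value_demo is hypothetical.

```shell
# Hypothetical wrapper so the awk program can be run from a shell.
empty_value_demo() {
    awk 'BEGIN {
        a["foo"] = ""                       # the element exists; its value is empty
        if (a["foo"] != "")
            print "value test: present"
        else
            print "value test: absent"      # misleading result
        for (k in a)
            print "stored index: " k        # the index is in the array all the same
    }'
}
empty_value_demo
```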

To determine whether an element exists in an array at a certain index, use the following expression:

indx in array

This expression tests whether the particular index indx exists, without the side effect of creating that element if it is not present. The expression has the value one (true) if array[indx] exists and zero (false) if it does not exist. (We use indx here, because ‘index’ is the name of a built-in function.) For example, this statement tests whether the array frequencies contains the index ‘2’:

if (2 in frequencies)
    print "Subscript 2 is present."

Note that this is not a test of whether the array frequencies contains an element whose value is two. There is no way to do that except to scan all the elements. Also, this does not create frequencies[2], while the following (incorrect) alternative does:

if (frequencies[2] != "")
    print "Subscript 2 is present."
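To make the distinction between testing for an index and testing for a value concrete, here is a sketch that does both. The helper has_value and the wrapper scan_demo are hypothetical names, not gawk built-ins; has_value simply scans every element, which is the only approach available according to the text above.

```shell
# Hypothetical wrapper so the awk program can be run from a shell.
scan_demo() {
    awk '
    # Hypothetical helper: return 1 if any element of arr has the value val.
    function has_value(arr, val,    k) {
        for (k in arr)
            if (arr[k] == val)
                return 1
        return 0
    }
    BEGIN {
        frequencies[7] = 2
        print "index 2 present: " (2 in frequencies)          # tests the index only
        print "value 2 present: " has_value(frequencies, 2)   # scans the values
        n = 0; for (k in frequencies) n++
        print "elements: " n        # still 1: the in test created nothing
    }'
}
scan_demo
```

Here the index test is false and the value scan is true, and the element count confirms that neither check added frequencies[2] to the array.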