Doing linear scans over an associative array is like trying to club someone
to death with a loaded Uzi.
— Larry Wall
The awk language provides one-dimensional arrays
for storing groups of related strings or numbers.
Every awk array must have a name. Array names have the same
syntax as variable names; any valid variable name would also be a valid
array name. But one name cannot be used in both ways (as an array and
as a variable) in the same awk program.
Arrays in awk superficially resemble arrays in other programming
languages, but there are fundamental differences. In awk, it
isn’t necessary to specify the size of an array before starting to use it.
Additionally, any number or string, not just consecutive integers,
may be used as an array index.
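
As a minimal sketch of these two points, the following program stores elements without declaring a size and mixes numeric and string indices. The array name stock and its contents are hypothetical, chosen only for illustration:

```shell
result=$(awk 'BEGIN {
    # No declaration and no size: elements come into existence on assignment.
    stock[3] = "pears"        # an integer index that need not start at zero
    stock[-2.5] = "plums"     # any number may be an index
    stock["apples"] = 12      # and so may a string
    n = 0
    for (k in stock)          # visits every index; the order is unspecified
        n++
    print n, stock["apples"], stock[-2.5]
}')
echo "$result"    # prints: 3 12 plums
```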
In most other languages, arrays must be declared before use,
including a specification of
how many elements or components they contain. In such languages, the
declaration causes a contiguous block of memory to be allocated for that
many elements. Usually, an index in the array must be a nonnegative integer.
For example, the index zero specifies the first element in the array, which is
actually stored at the beginning of the block of memory. Index one
specifies the second element, which is stored in memory right after the
first element, and so on. It is impossible to add more elements to the
array, because it has room only for as many elements as given in
the declaration. (Some languages allow arbitrary starting and ending
indices, e.g., ‘15 .. 27’, but the size of the array is still fixed when
the array is declared.)
A contiguous array of four elements can be pictured, conceptually, as a row
of four values stored in order: eight, "foo", "", and 30.
Only the values are stored; the indices are implicit from the order of the values. Here, eight is the value at index zero, because eight appears in the position with zero elements before it.
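Arrays in awk are different: they are associative. This means
that each array is a collection of pairs, each consisting of an index and its
corresponding array element value. For the array above, index 0 is paired
with the value eight and index 3 with the value 30.
The pairs have no inherent order; the order in which they are listed is irrelevant.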
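
The pairing of index and value can be sketched as follows. The array name arr is hypothetical, and the value "foo" at index 1 is a placeholder assumed for this sketch:

```shell
result=$(awk 'BEGIN {
    # Store the pairs in a jumbled order; the order of assignment is irrelevant.
    arr[3] = 30
    arr[1] = "foo"
    arr[0] = 8
    arr[2] = ""
    # Each lookup goes by index, not by position.
    print arr[0]
    print arr[3]
    # The "in" operator tests whether an index exists, even if its value is empty.
    print (2 in arr)
    print (4 in arr)
}')
echo "$result"
```

The four lines printed should be 8, 30, 1, and 0: index 2 exists even though its value is the empty string, while index 4 was never stored.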
One advantage of associative arrays is that new pairs can be added
at any time. For example, suppose a tenth element is added to the array
whose value is "number ten". The array then holds five pairs: the original
four plus a new pair with index 10.
Now the array is sparse, which just means some indices are missing. It has elements 0–3 and 10, but doesn’t have elements 4, 5, 6, 7, 8, or 9.
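
A short sketch of a sparse array, again using the hypothetical name arr and a placeholder value at index 1. It counts the stored elements and lists the indices between 0 and 10 that are absent:

```shell
result=$(awk 'BEGIN {
    arr[0] = 8; arr[1] = "foo"; arr[2] = ""; arr[3] = 30
    arr[10] = "number ten"          # a new pair can be added at any time
    count = 0
    for (i in arr)                  # count the elements actually stored
        count++
    missing = ""
    for (i = 0; i <= 10; i++)       # collect the indices that are not present
        if (!(i in arr))
            missing = missing " " i
    print count
    print "missing:" missing
}')
echo "$result"
```

This should print 5 on the first line and "missing: 4 5 6 7 8 9" on the second. Testing with the in operator does not create the missing elements.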
Another consequence of associative arrays is that the indices don’t have to be nonnegative integers. Any number, or even a string, can be an index. For example, an array can translate words from English to French, with the index "one" and the index 1 both paired with the value "un".
Here we decided to translate the number one in both spelled-out and
numeric form—thus illustrating that a single array can have both
numbers and strings as indices.
(In fact, array subscripts are always strings.
There are some subtleties to how numbers work when used as
array subscripts; this is discussed in more detail in
Using Numbers to Subscript Arrays.)
Here, the number 1 isn’t double-quoted, because awk
automatically converts it to a string.
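
The translation array can be sketched like this. The array name french and the "dog" and "cat" entries are illustrative assumptions; the "one" and 1 entries follow the description above:

```shell
result=$(awk 'BEGIN {
    french["dog"] = "chien"
    french["cat"] = "chat"
    french["one"] = "un"
    french[1] = "un"            # the number 1 is converted to the string "1"
    print french["one"], french[1]
    print french["1"]           # the same element as french[1]
    n = 0
    for (w in french)
        n++
    print n                     # four elements, not five
}')
echo "$result"
```

The output should be "un un", then "un", then 4, showing that the subscripts 1 and "1" name the same element.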
The value of
IGNORECASE has no effect upon array subscripting.
The identical string value used to store an array element must be used
to retrieve it.
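
A brief sketch of this rule, using a hypothetical array named french. IGNORECASE is specific to gawk; in other awk implementations the assignment just sets an ordinary variable, and the result should be the same:

```shell
result=$(awk 'BEGIN {
    IGNORECASE = 1              # affects regexp and string matching in gawk only
    french["dog"] = "chien"
    print ("dog" in french)     # the identical string: found
    print ("Dog" in french)     # a different string: not found
    print ("DOG" in french)
}')
echo "$result"
```

The three lines printed should be 1, 0, and 0.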
When awk creates an array (e.g., with the split() built-in function),
that array’s indices are consecutive integers starting at one.
(See section String-Manipulation Functions.)
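
For instance, the following sketch splits a string on whitespace and inspects the resulting indices. The array name colors and the input string are hypothetical:

```shell
result=$(awk 'BEGIN {
    n = split("red green blue", colors, " ")   # returns the number of pieces
    print n
    print colors[1]             # the first piece is at index one
    print colors[n]             # the last piece is at index n
    print (0 in colors)         # there is no element at index zero
}')
echo "$result"
```

This should print 3, red, blue, and 0, one per line.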
awk’s arrays are efficient—the time to access an element
is independent of the number of elements in the array.
The ordering of array elements varies among awk
implementations, which typically use hash tables to store array elements.