Getline Notes (The GNU Awk User’s Guide)


4.10.9 Points to Remember About getline

Here are some miscellaneous points about getline that you should bear in mind:

  • When getline changes the value of $0 and NF, awk does not automatically jump to the start of the program and start testing the new record against every pattern. However, the new record is tested against any subsequent rules.
  • Some very old awk implementations limit the number of pipelines that an awk program may have open to just one. In gawk, there is no such limit. You can open as many pipelines (and coprocesses) as the underlying operating system permits.
  • An interesting side effect occurs if you use getline without a redirection inside a BEGIN rule. Because an unredirected getline reads from the command-line data files, the first getline command causes awk to set the value of FILENAME. Normally, FILENAME does not have a value inside BEGIN rules, because you have not yet started to process the command-line data files. (d.c.) (See The BEGIN and END Special Patterns; also see section Built-in Variables That Convey Information.)
  • Using FILENAME with getline (‘getline < FILENAME’) is likely to be a source of confusion. awk opens a second input stream on the current input file, separate from the one the main loop is reading. However, because no variable is given, $0 and NF are still updated. If you’re doing this, it’s probably by accident, and you should reconsider what it is you’re trying to accomplish.
  • The section Summary of getline Variants presents a table summarizing the getline variants and which variables they can affect. It is worth noting that those variants that do not use redirection can cause FILENAME to be updated if they cause awk to start reading a new input file.
  • If the variable being assigned is an expression with side effects, different versions of awk behave differently upon encountering end-of-file. Some versions don’t evaluate the expression; many versions (including gawk) do. Here is an example, courtesy of Duncan Moore:

    BEGIN {
        system("echo 1 > f")
        while ((getline a[++c] < "f") > 0) { }
        print c
    }

    Here, the side effect is the ‘++c’. Is c incremented if end-of-file is encountered before the element in a is assigned?

    gawk treats getline like a function call, and evaluates the expression ‘a[++c]’ before attempting to read from f. However, some versions of awk only evaluate the expression once they know that there is a string value to be assigned.
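
The first point in the list (a record read by getline is tested only against subsequent rules) can be illustrated with a small sketch. The input lines and rule messages are made up for illustration, and the sketch assumes a POSIX-compatible awk is available on the PATH:

```shell
# Two input records: "skip" and "keep".
out=$(printf 'skip\nkeep\n' | awk '
/keep/ { print "rule 1 matched: " $0 }   # earlier rule: never sees the record read by getline
/skip/ { getline }                       # replaces $0 (and NF) with the next record, "keep"
/keep/ { print "rule 3 matched: " $0 }   # subsequent rule: tested against the new record
')
printf '%s\n' "$out"
# prints: rule 3 matched: keep
```

Rule 1 never fires: it was already tested against ‘skip’ before getline replaced the record, and awk does not jump back to the start of the program. Rule 3 comes after the getline, so it is tested against ‘keep’.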
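
The point about FILENAME inside a BEGIN rule can be observed with the following sketch. The scratch file created with mktemp and the variable name before are assumptions made for the example. The value of FILENAME before the first getline may differ between awk implementations; with gawk it is expected to be empty:

```shell
tmp=$(mktemp)                     # hypothetical scratch data file
printf 'first line\n' > "$tmp"

out=$(awk 'BEGIN {
    before = FILENAME             # no command-line data file has been opened yet
    getline                       # unredirected: reads the first record of the first data file
    printf "before=[%s] after=[%s] record=[%s]\n", before, FILENAME, $0
}' "$tmp")
printf '%s\n' "$out"

rm -f "$tmp"
```

After the getline, FILENAME holds the name of the data file and $0 holds its first record, even though the program is still inside the BEGIN rule.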
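
One way to sidestep the end-of-file difference described in the last point is to read into a plain variable and apply the side effect only after a successful read. This is a suggested sketch, not something the list above prescribes; the scratch file and the variable names line and f are assumptions:

```shell
tmp=$(mktemp)                     # hypothetical scratch file with a single line
printf '1\n' > "$tmp"

out=$(awk -v f="$tmp" 'BEGIN {
    # getline assigns to a simple variable, so there is no side effect to evaluate at end-of-file
    while ((getline line < f) > 0)
        a[++c] = line             # ++c runs only when a record was actually read
    close(f)
    print c
}')
printf '%s\n' "$out"
# prints: 1

rm -f "$tmp"
```

Because the increment happens in an ordinary assignment rather than in the getline target, the count should be the same in every awk version, regardless of when that version evaluates the getline target.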