Gawk/Getline 002fPipe

From Get docs

4.10.5 Using getline from a Pipe

Omniscience has much to recommend it.

Failing that, attention to details would be useful.

Brian Kernighan

The output of a command can also be piped into getline, using ‘command | getline’. In this case, the string command is run as a shell command and its output is piped into awk to be used as input. This form of getline reads one record at a time from the pipe. For example, the following program copies its input to its output, except for lines that begin with ‘@execute’, which are replaced by the output produced by running the rest of the line as a shell command:

{
     if ($1 == "@execute") {
          tmp = substr($0, 10)        # Remove "@execute"
          while ((tmp | getline) > 0)
               print
          close(tmp)
     } else
          print
}

The close() function is called to ensure that if two identical ‘@execute’ lines appear in the input, the command is run for each one. See section Closing Input and Output Redirections. Given the input:

foo
bar
baz
@execute who
bletch

the program might produce:

foo
bar
baz
arnold     ttyv0   Jul 13 14:22
miriam     ttyp0   Jul 13 14:23     (murphy:0)
bill       ttyp1   Jul 13 14:23     (murphy:0)
bletch

Notice that this program ran the command who and printed the result. (If you try this program yourself, you will of course get different results, depending upon who is logged in on your system.)

This variation of getline splits the record into fields, sets the value of NF, and recomputes the value of $0. The values of NR and FNR are not changed. RT is set.

According to POSIX, ‘expression | getline’ is ambiguous if expression contains unparenthesized operators other than ‘$’—for example, ‘"echo " "date" | getline’ is ambiguous because the concatenation operator is not parenthesized. You should write it as ‘("echo " "date") | getline’ if you want your program to be portable to all awk implementations.

NOTE: Unfortunately, gawk has not been consistent in its treatment

of a construct like ‘"echo " "date" | getline’. Most versions, including the current version, treat it as ‘("echo " "date") | getline’. (This is also how BWK awk behaves.) Some versions instead treat it as ‘"echo " ("date" | getline)’. (This is how mawk behaves.) In short, always use explicit parentheses, and then you won’t have to worry.