Input Summary (The GNU Awk User’s Guide)
Next: Input Exercises, Previous: Command-line directories, Up: Reading Files [Contents][Index]
4.14 Summary
- Input is split into records based on the value of RS. The possibilities are as follows:

    Value of RS              Records are split on …          awk / gawk
    Any single character     That character                  awk
    The empty string ("")    Runs of two or more newlines    awk
    A regexp                 Text that matches the regexp    gawk

- FNR indicates how many records have been read from the current input file; NR indicates how many records have been read in total.
- gawk sets RT to the text matched by RS.
- After splitting the input into records,
  awk further splits the records into individual fields, named $1, $2, and so on. $0 is the whole record, and NF indicates how many fields there are. The default way to split fields is between whitespace characters.
- Fields may be referenced using a variable, as in $NF. Fields may also be assigned values, which causes the value of $0 to be recomputed when it is later referenced. Assigning to a field with a number greater than NF creates the field and rebuilds the record, using OFS to separate the fields. Incrementing NF does the same thing. Decrementing NF throws away fields and rebuilds the record.
- Field splitting is more complicated than record splitting:

    Field separator value             Fields are split …                                          awk / gawk
    FS == " "                         On runs of whitespace                                       awk
    FS == any single character        On that character                                           awk
    FS == regexp                      On text matching the regexp                                 awk
    FS == ""                          Such that each individual character is a separate field     gawk
    FIELDWIDTHS == list of columns    Based on character position                                 gawk
    FPAT == regexp                    On the text surrounding text matching the regexp            gawk

- Using ‘FS = "\n"’ causes the entire record to be a single field (assuming that newlines separate records).
- FS may be set from the command line using the -F option. This can also be done using command-line variable assignment.
- Use PROCINFO["FS"] to see how fields are being split.
- Use getline in its various forms to read additional records from the default input stream, from a file, or from a pipe or coprocess.
- Use PROCINFO[file, "READ_TIMEOUT"] to cause reads to time out for file.
- Directories on the command line are fatal for standard awk; gawk ignores them if not in POSIX mode.
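
As a small illustration of the record-splitting items above, the following shell sketch runs awk three ways: with a single-character RS, with RS set to the empty string, and over two files to contrast FNR with NR. The sample input, the shell variable names, and the temporary file names are invented for this example and do not come from the guide.

```shell
# RS as a single character: records are split on that character.
by_semicolon=$(printf 'x;y;z' | awk 'BEGIN { RS = ";" } { print NR, $0 }')
echo "$by_semicolon"

# RS as the empty string: records are separated by blank lines,
# and a newline inside a record also separates fields.
by_paragraph=$(printf 'a b\nc\n\n\nd e f\n' |
    awk 'BEGIN { RS = "" } { print NR ": " NF " fields" }')
echo "$by_paragraph"

# FNR restarts for each input file; NR keeps counting across files.
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/one.txt"
printf 'c\n'    > "$tmpdir/two.txt"
counts=$(awk '{ print FNR, NR, $0 }' "$tmpdir/one.txt" "$tmpdir/two.txt")
rm -r "$tmpdir"
echo "$counts"
```

The first input deliberately has no trailing newline; with RS set to ";" a final newline would otherwise become part of the last record.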
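
The items on referencing and assigning fields can be tried out with a sketch along these lines. The input strings and shell variable names are made up for illustration. The last command assumes an awk that rebuilds the record when NF is decreased, as the summary describes; gawk does, but some older awk versions may not.

```shell
# A field may be referenced through a variable: $NF is the last field.
last=$(echo 'a b c' | awk '{ print $NF }')
echo "$last"

# Assigning to an existing field rebuilds $0 with OFS between fields.
changed=$(echo 'a b c' | awk 'BEGIN { OFS = "-" } { $2 = "X"; print }')
echo "$changed"

# Assigning past NF creates the intervening (empty) fields as well.
grown=$(echo 'a b c' | awk 'BEGIN { OFS = "-" } { $5 = "e"; print; print NF }')
echo "$grown"

# Decrementing NF throws away the trailing fields and rebuilds the record.
shrunk=$(echo 'a b c d' | awk 'BEGIN { OFS = ":" } { NF = 2; print }')
echo "$shrunk"
```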
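
The ways of setting FS listed above can be compared side by side. This is a minimal sketch with invented sample input; it uses the -v option as one form of command-line variable assignment.

```shell
# FS set with -F, and the same effect with a -v variable assignment.
with_F=$(echo 'root:x:0:0' | awk -F: '{ print $1, $3 }')
with_v=$(echo 'root:x:0:0' | awk -v 'FS=:' '{ print $1, $3 }')
echo "$with_F / $with_v"

# FS as a regexp: fields are split on text matching the regexp.
by_regexp=$(echo 'a1b22c' | awk -F'[0-9]+' '{ print NF, $1 $2 $3 }')
echo "$by_regexp"

# FS = "\n" with newline-separated records: each record is one field.
whole=$(printf 'one two\nthree\n' |
    awk 'BEGIN { FS = "\n" } { print NF ": " $1 }')
echo "$whole"
```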
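
Three of the getline forms mentioned above (from the default input, from a pipe, and from a file) might be exercised as follows. The commands, data, and variable names are hypothetical, and the coprocess form is left out because it is specific to gawk.

```shell
# getline var: read the next record of the main input into a variable.
pairs=$(printf '1\n2\n3\n4\n' |
    awk '{ if ((getline nxt) > 0) print $0 "+" nxt }')
echo "$pairs"

# cmd | getline var: read records from a pipe until it is exhausted.
piped=$(awk 'BEGIN {
    cmd = "echo hi; echo there"
    while ((cmd | getline line) > 0)
        n++
    close(cmd)
    print n, line
}')
echo "$piped"

# getline var < file: read records from a named file.
tmpfile=$(mktemp)
printf 'first\nsecond\n' > "$tmpfile"
from_file=$(awk -v f="$tmpfile" 'BEGIN {
    while ((getline line < f) > 0)
        last = line
    close(f)
    print last
}')
rm "$tmpfile"
echo "$from_file"
```

Comparing the return value with "> 0" keeps each loop from spinning forever if getline reports an error (a negative value) instead of end of input.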