Gawk/Input-Summary
4.14 Summary
- Input is split into records based on the value of RS. The possibilities are as follows:

  | Value of RS | Records are split on … | awk / gawk |
  |---|---|---|
  | Any single character | That character | awk |
  | The empty string ("") | Runs of two or more newlines | awk |
  | A regexp | Text that matches the regexp | gawk |

- FNR indicates how many records have been read from the current input file; NR indicates how many records have been read in total.
- gawk sets RT to the text matched by RS.
- After splitting the input into records, awk further splits the records into individual fields, named $1, $2, and so on. $0 is the whole record, and NF indicates how many fields there are. The default way to split fields is between whitespace characters.
- Fields may be referenced using a variable, as in $NF. Fields may also be assigned values, which causes the value of $0 to be recomputed when it is later referenced. Assigning to a field with a number greater than NF creates the field and rebuilds the record, using OFS to separate the fields. Incrementing NF does the same thing. Decrementing NF throws away fields and rebuilds the record.
- Field splitting is more complicated than record splitting:

  | Field separator value | Fields are split … | awk / gawk |
  |---|---|---|
  | FS == " " | On runs of whitespace | awk |
  | FS == any single character | On that character | awk |
  | FS == regexp | On text matching the regexp | awk |
  | FS == "" | Such that each individual character is a separate field | gawk |
  | FIELDWIDTHS == list of columns | Based on character position | gawk |
  | FPAT == regexp | On the text surrounding text matching the regexp | gawk |

- Using ‘FS = "\n"’ causes the entire record to be a single field (assuming that newlines separate records).
- FS may be set from the command line using the -F option. This can also be done using command-line variable assignment.
- Use PROCINFO["FS"] to see how fields are being split.
- Use getline in its various forms to read additional records from the default input stream, from a file, or from a pipe or coprocess.
- Use PROCINFO[file, "READ_TIMEOUT"] to cause reads to time out for file.
- Directories on the command line are fatal for standard awk; gawk ignores them if not in POSIX mode.
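
Several of the points above can be tried from the shell. The following is a minimal sketch, not taken from the manual: the sample input strings and the shell variable names (records, rebuilt, last, first) are made up for illustration, and only features marked "awk" in the tables are used, so it should behave the same in gawk and other POSIX-style awks.

```shell
#!/bin/sh
# Illustrative inputs only; none of these strings come from the manual.

# RS = "" (paragraph mode): records are separated by blank lines,
# and newlines inside a record also separate fields.
records=$(printf 'a b\nc\n\nd e\n' |
    awk 'BEGIN { RS = "" } { print NR ": " NF " fields" }')
echo "$records"
# 1: 3 fields
# 2: 2 fields

# Assigning to a field beyond NF creates it (and the empty field before it)
# and rebuilds $0 with OFS between the fields.
rebuilt=$(echo 'a b c' |
    awk 'BEGIN { OFS = "-" } { $5 = "e"; print $0 " NF=" NF }')
echo "$rebuilt"
# a-b-c--e NF=5

# FS set with -F; $NF refers to the last field of the record.
last=$(echo 'alice:x:1000:/home/alice' | awk -F: '{ print $NF }')
echo "$last"
# /home/alice

# The same separator set through a command-line variable assignment.
first=$(echo 'alice:x:1000:/home/alice' | awk -v FS=: '{ print $1 }')
echo "$first"
# alice
```

The gawk-only features (a regexp RS, RT, FS == "", FIELDWIDTHS, FPAT) are deliberately left out so that the sketch does not depend on which awk is installed.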