17.7.6 Reading Directories

The readdir extension adds an input parser for directories. The usage is as follows:

@load "readdir"

When this extension is in use, directories named on the command line (or with getline) are read instead of being skipped, with each entry returned as a record.

The record consists of three fields. The first two are the inode number and the file name, separated by a forward slash character. On systems where the directory entry contains the file type, the record has a third field (also separated by a slash), which is a single letter indicating the type of the file. The letters and their corresponding file types are shown in Table 17.4.

Letter   File type
b        Block device
c        Character device
d        Directory
f        Regular file
l        Symbolic link
p        Named pipe (FIFO)
s        Socket
u        Anything else (unknown)

Table 17.4: File types returned by the readdir extension


On systems without the file type information, the third field is always ‘u’.
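
As an illustration of the record layout, the following sketch splits each record on the slash and translates the type letter into the descriptions from Table 17.4. The function name type_name() and the array Desc are made up for this example and are not part of the extension:

```awk
@load "readdir"

BEGIN {
    FS = "/"        # fields are: inode / file name / type letter

    # Letter-to-description map, copied from Table 17.4.
    Desc["b"] = "block device"
    Desc["c"] = "character device"
    Desc["d"] = "directory"
    Desc["f"] = "regular file"
    Desc["l"] = "symbolic link"
    Desc["p"] = "named pipe (FIFO)"
    Desc["s"] = "socket"
}

# type_name --- hypothetical helper: describe a readdir type letter.
# Any letter not in the table, including "u", is reported as unknown.
function type_name(letter)
{
    return (letter in Desc) ? Desc[letter] : "unknown"
}

# One record per directory entry.
{ printf "%-30s %s\n", $2, type_name($3) }
```

If the program were saved in a file (say, lsdir.awk, a name chosen only for this example), it could be run as `gawk -f lsdir.awk /some/directory`.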

NOTE: On GNU/Linux systems, there are filesystems that don’t support the d_type entry (see the readdir(3) manual page), and so the file type is always ‘u’. You can use the filefuncs extension to call stat() in order to get correct type information.
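
One possible way to combine the two extensions is sketched below. It assumes that FILENAME holds the name of the directory being read, so that the directory name and $2 together form a usable path. The function name entry_type() is invented for this example. Note that the function returns the readdir letter when one is available, and otherwise the longer type string that stat() stores in the "type" element (such as "file" or "directory"):

```awk
@load "readdir"
@load "filefuncs"

BEGIN { FS = "/" }

# entry_type --- hypothetical helper: return the readdir type letter,
# or fall back to stat() when readdir could only report "u".
function entry_type(dir, name, letter,    path, st)
{
    if (letter != "u")
        return letter           # readdir already supplied the type

    delete st                   # make sure st is an empty array
    path = dir "/" name
    if (stat(path, st) < 0)     # stat() failed; ERRNO says why
        return "u"

    return st["type"]           # e.g., "file", "directory", "symlink"
}

{ print $2, entry_type(FILENAME, $2, $3) }
```

Calling stat() only for entries marked ‘u’ avoids an extra system call per entry on filesystems that do supply the type.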

By default, if a directory cannot be opened (due to permission problems, for example), gawk will exit. As with regular files, this situation can be handled using a BEGINFILE rule that checks ERRNO and prints an error or otherwise handles the problem.
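
A minimal sketch of such a rule follows. It reports the problem on standard error and uses nextfile to move on to the next command-line argument. The wording of the message is only an example:

```awk
@load "readdir"

BEGIN { FS = "/" }

BEGINFILE {
    # A non-empty ERRNO here means the file or directory
    # could not be opened.
    if (ERRNO != "") {
        printf "skipping %s: %s\n", FILENAME, ERRNO > "/dev/stderr"
        nextfile    # skip it instead of letting gawk exit
    }
}

{ print "file name is", $2 }
```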

Here is an example:

@load "readdir"
…
BEGIN { FS = "/" }
{ print "file name is", $2 }