Regexp Addresses (sed, a stream editor)
4.3 selecting lines by text matching
GNU sed supports the following regular expression addresses. The default regular expression syntax is Basic Regular Expression (BRE). If the -E or -r option is used, the regular expression should be in Extended Regular Expression (ERE) syntax. See BRE vs ERE.
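As a brief illustration of the syntax difference, the following sketch selects the same line once with BRE and once with ERE. The sample input and the variable names (input, bre, ere) are assumptions made for this example only:

```shell
# Sample input (assumed for illustration): three short lines.
input=$(printf '%s\n' ab aab b)

# BRE (the default): the interval braces must be escaped.
bre=$(printf '%s\n' "$input" | sed -n '/a\{2\}b/p')

# ERE (enabled with -E): the interval braces are written unescaped.
ere=$(printf '%s\n' "$input" | sed -E -n '/a{2}b/p')

printf 'BRE: %s\nERE: %s\n' "$bre" "$ere"
```

Both addresses select only the line containing two consecutive ‘a’ characters followed by ‘b’.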
/regexp/
This will select any line which matches the regular expression regexp. If regexp itself includes any / characters, each must be escaped by a backslash (\).

The following command prints lines in /etc/passwd which end with ‘bash’ (5):

    sed -n '/bash$/p' /etc/passwd

The empty regular expression ‘//’ repeats the last regular expression match (the same holds if the empty regular expression is passed to the s command). Note that modifiers to regular expressions are evaluated when the regular expression is compiled, thus it is invalid to specify them together with the empty regular expression.
\%regexp%
(The % may be replaced by any other single character.)

This also matches the regular expression regexp, but allows one to use a different delimiter than /. This is particularly useful if the regexp itself contains a lot of slashes, since it avoids the tedious escaping of every /. If regexp itself includes any delimiter characters, each must be escaped by a backslash (\).

The following commands are equivalent. They print lines which start with ‘/home/alice/documents/’:

    sed -n '/^\/home\/alice\/documents\//p'
    sed -n '\%^/home/alice/documents/%p'
    sed -n '\;^/home/alice/documents/;p'
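The equivalence of the two delimiter styles can be checked on inline data. This is a small sketch; the sample paths and the variable names (paths, slashes, custom) are hypothetical and chosen only for illustration:

```shell
# Hypothetical sample paths, used only for illustration.
paths=$(printf '%s\n' /home/alice/documents/a.txt /home/bob/notes.txt)

# Default delimiter: every slash inside the regexp is escaped.
slashes=$(printf '%s\n' "$paths" | sed -n '/^\/home\/alice\/documents\//p')

# Custom delimiter %: the same regexp with no escaping needed.
custom=$(printf '%s\n' "$paths" | sed -n '\%^/home/alice/documents/%p')

printf '%s\n' "$slashes" "$custom"
```

Both commands select the same single line, so the choice of delimiter is purely a matter of readability.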
/regexp/I
\%regexp%I
The I modifier to regular-expression matching is a GNU extension which causes the regexp to be matched in a case-insensitive manner.

In many other programming languages, a lower case i is used for case-insensitive regular expression matching. However, in sed the i is used for the insert command (see insert command).

Observe the difference between the following examples.

In this example, /b/I is the address: a regular expression with the I modifier. d is the delete command:

    $ printf "%s\n" a b c | sed '/b/Id'
    a
    c

Here, /b/ is the address: a regular expression. i is the insert command. d is the value to insert. A line with ‘d’ is then inserted above the matched line:

    $ printf "%s\n" a b c | sed '/b/id'
    a
    d
    b
    c
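The examples above use only lower-case input, so the following sketch shows the case-insensitive effect itself. It assumes GNU sed, since I is a GNU extension; the sample words and the variable names (plain, nocase) are illustrative:

```shell
# Requires GNU sed: the I modifier is a GNU extension.

# Without I, the lower-case regexp matches none of these lines.
plain=$(printf '%s\n' Bash BASH zsh | sed -n '/bash/p')

# With I, the regexp matches regardless of letter case.
nocase=$(printf '%s\n' Bash BASH zsh | sed -n '/bash/Ip')

printf 'without I: [%s]\nwith I:\n%s\n' "$plain" "$nocase"
```

The first command selects nothing, while the second selects both ‘Bash’ and ‘BASH’.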
/regexp/M
\%regexp%M
The M modifier to regular-expression matching is a GNU sed extension which directs GNU sed to match the regular expression in multi-line mode. The modifier causes ^ and $ to match respectively (in addition to the normal behavior) the empty string after a newline, and the empty string before a newline. There are special character sequences (\` and \') which always match the beginning or the end of the buffer. In addition, the period character does not match a new-line character in multi-line mode.
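Multi-line mode only matters when the pattern space holds more than one line, for example after the N command. The following is a minimal sketch, assuming GNU sed; the two-line input and the variable names (single, multi) are chosen for illustration:

```shell
# Requires GNU sed: the M modifier is a GNU extension.
# N appends the second input line, so the pattern space holds
# the two lines "a" and "b" joined by a newline.

# Without M, ^ and $ match only at the ends of the whole buffer,
# so /^b$/ does not match.
single=$(printf '%s\n' a b | sed -n 'N; /^b$/p')

# With M, ^ also matches just after the embedded newline, so the
# address matches and p prints the whole pattern space.
multi=$(printf '%s\n' a b | sed -n 'N; /^b$/Mp')

printf 'without M: [%s]\nwith M:\n%s\n' "$single" "$multi"
```

Only the second command produces output, and it prints both lines because p prints the entire pattern space.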
Regex addresses operate on the content of the current pattern space. If the pattern space is changed (for example with the s/// command), the regular expression matching will operate on the changed text.
In the following example, automatic printing is disabled with -n. The s/2/X/ command replaces the first ‘2’ on a line with ‘X’. The command /[0-9]/p matches lines with digits and prints them. Because the second line is changed before the /[0-9]/ regex is evaluated, it no longer matches and is not printed:

    $ seq 3 | sed -n 's/2/X/ ; /[0-9]/p'
    1
    3
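The effect of command order can be seen by running the same two commands both ways. This is a small sketch; printf stands in for seq, and the variable names (subst_first, print_first) are assumptions for this example:

```shell
# Substitution first: line 2 becomes "X" before the address is tested,
# so it no longer contains a digit and is not printed.
subst_first=$(printf '%s\n' 1 2 3 | sed -n 's/2/X/ ; /[0-9]/p')

# Address first: line 2 still contains a digit when it is tested,
# so it is printed before the substitution changes it.
print_first=$(printf '%s\n' 1 2 3 | sed -n '/[0-9]/p ; s/2/X/')

printf 'substitute first:\n%s\nprint first:\n%s\n' "$subst_first" "$print_first"
```

With the substitution first, only lines 1 and 3 are printed; with the address first, all three original lines are printed.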
Footnotes
(5)
There are of course many other ways to do the same, e.g.

    grep 'bash$' /etc/passwd
    awk -F: '$7 == "/bin/bash"' /etc/passwd