5.7 Back-references and Subexpressions
Back-references are regular expression commands which refer to a
previous part of the matched regular expression. Back-references are
specified with a backslash and a single digit (e.g. ‘\1’). The
part of the regular expression they refer to is called a
subexpression, and is designated with parentheses.
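As a minimal sketch of the syntax, the snippet below looks for a doubled character in both regular expression flavors. It assumes the usual convention that basic regular expressions (the default) write groups as ‘\(...\)’, while extended regular expressions (-E) write them as ‘(...)’. The input word and the variable names are made up for illustration.

```shell
# Basic regular expressions (the default): groups are written \( ... \).
bre_out=$(echo 'hello' | sed -n '/\(.\)\1/p')

# Extended regular expressions (-E): groups are written ( ... ).
ere_out=$(echo 'hello' | sed -E -n '/(.)\1/p')

# Both commands print the line, because "ll" is a character
# followed by the same character.
echo "$bre_out"
echo "$ere_out"
```

In both forms the back-reference itself is written the same way, as ‘\1’.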
Back-references and subexpressions are used in two cases: in the
regular expression search pattern, and in the replacement part
of the s command (see Regular Expression Addresses and
The "s" Command).
In a regular expression pattern, back-references are used to match
the same content as a previously matched subexpression. In the
following example, the subexpression is ‘.’, which matches any single
character (being surrounded by parentheses makes it a
subexpression). The back-reference ‘\1’ asks to match the same
content (the same character) as the subexpression.
The command below matches three-letter words that start with any
character, followed by the letter ‘o’, followed by the same character
as the first.
    $ sed -E -n '/^(.)o\1$/p' /usr/share/dict/words
    bob
    mom
    non
    pop
    sos
    tot
    wow
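The same pattern can be tried without a dictionary file. The sketch below feeds a short, made-up word list on standard input; the words and the variable name are only for illustration.

```shell
# Only words whose first and last characters are identical,
# with "o" in the middle, are printed.
three_letter=$(printf '%s\n' bob cot mom dog wow | sed -E -n '/^(.)o\1$/p')
echo "$three_letter"
```

Here ‘cot’ and ‘dog’ are skipped because their last character differs from the first.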
Multiple subexpressions are automatically numbered from left-to-right. This command searches for 6-letter palindromes (the first three letters are 3 subexpressions, followed by 3 back-references in reverse order):
    $ sed -E -n '/^(.)(.)(.)\3\2\1$/p' /usr/share/dict/words
    redder
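When groups are nested, the numbering is commonly described as following the position of each opening parenthesis. The sketch below illustrates this under that assumption; the input string and the bracketed output format are invented for the example.

```shell
# Group 1 is the outer (...), group 2 is (a), group 3 is (b).
# The trailing \1 requires the text of group 1 to appear a second time.
nested=$(echo 'abcabc' | sed -E 's/((a)(b)c)\1/[\1][\2][\3]/')
echo "$nested"
```

The outer group is number 1 because its parenthesis opens first, even though it closes last.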
In the s command, back-references can be used in the replacement
part to refer back to subexpressions in the regexp part.
The following example uses two subexpressions in the regular
expression to match two space-separated words. The back-references in
the replacement part print the words in a different order:
    $ echo "James Bond" | sed -E 's/(.*) (.*)/The name is \2, \1 \2./'
    The name is Bond, James Bond.
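With more than two words, what each subexpression captures depends on how greedy it is. The sketch below contrasts ‘(.*)’ with a narrower first group, using a made-up three-word input; it is an illustration of the matching behavior, not a recommended way to parse names.

```shell
# The first (.*) is greedy: it takes everything up to the LAST space.
greedy=$(echo "Ian Lancaster Fleming" | sed -E 's/(.*) (.*)/\2, \1/')
echo "$greedy"

# [^ ]* cannot cross a space, so the first group stops at the FIRST space.
first_word=$(echo "Ian Lancaster Fleming" | sed -E 's/([^ ]*) (.*)/\2, \1/')
echo "$first_word"
```

Choosing a character class such as ‘[^ ]*’ for the first group makes the split point explicit.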
When used with alternation, if the group does not participate in the
match then the back-reference makes the whole match fail. For
example, ‘a(.)|b\1’ will not match ‘ba’. When multiple
regular expressions are given with -e or from a file
(‘-f file’), back-references are local to each expression.
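Both points can be checked with a short sketch. It assumes GNU sed, which rejects a replacement that refers to a group its own regexp does not define; the inputs and variable names are made up for illustration.

```shell
# "ab" matches through the first alternative, a(.).
# "ba" does not: the group never participates, so b\1 cannot match.
alt_out=$(printf '%s\n' ab ba | sed -E -n '/a(.)|b\1/p')
echo "$alt_out"

# The group in the first -e expression is not visible to the second one,
# so \1 in the second replacement refers to a group that does not exist.
if echo 'ab' | sed -E -e 's/(a)/x/' -e 's/b/\1/' 2>/dev/null; then
    local_ref=accepted
else
    local_ref=rejected
fi
echo "$local_ref"
```

The second command fails while the script is being parsed, before any input is read.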