Back-references are regular expression commands which refer to a
previous part of the matched regular expression. Back-references are
specified with a backslash and a single digit (e.g. ‘\1’). The
part of the regular expression they refer to is called a
subexpression, and is designated with parentheses.
Back-references and subexpressions are used in two cases: in the
regular expression search pattern, and in the replacement part of the
s command (see Regular Expression Addresses and The "s" Command).
In a regular expression pattern, back-references are used to match
the same content as a previously matched subexpression. In the
following example, the subexpression is ‘.’, which matches any single
character (being surrounded by parentheses makes it a
subexpression). The back-reference ‘\1’ asks to match the same
content (same character) as the subexpression.
The command below matches words starting with any character,
followed by the letter ‘o’, followed by the same character as the
first:

$ sed -E -n '/^(.)o\1$/p' /usr/share/dict/words
bob
mom
non
pop
sos
tot
wow

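Where /usr/share/dict/words is not installed, the same pattern can be tried on inline input. The following is a minimal sketch, assuming a sed that accepts -E (such as GNU sed); the word list is made up for illustration:

```shell
# Keep only three-letter words whose middle letter is 'o' and whose
# first and last characters are identical. The input words are illustrative.
matches=$(printf '%s\n' bob bot mom cow wow | sed -E -n '/^(.)o\1$/p')
printf '%s\n' "$matches"
```

Here ‘bot’ and ‘cow’ are rejected because their last character differs from the one captured by the subexpression.
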
Multiple subexpressions are automatically numbered from left to right. This command searches for 6-letter palindromes (the first three letters are 3 subexpressions, followed by 3 back-references in reverse order):

$ sed -E -n '/^(.)(.)(.)\3\2\1$/p' /usr/share/dict/words
redder

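Left-to-right numbering matters as soon as a pattern contains a group that exists only for grouping. The sketch below (an illustration with made-up input, assuming a sed that accepts -E) looks for a word repeated twice in a row; because the leading boundary is itself a parenthesized group, the word is subexpression 2:

```shell
# Group 1 is the leading boundary (start of line or a space);
# group 2 is the word itself, so the back-reference is \2, not \1.
# The trailing ( |$) stops \2 from matching only a prefix of the next word.
doubled=$(printf '%s\n' 'this is is a test' 'this is a test' |
  sed -E -n '/(^| )([a-z]+) \2( |$)/p')
printf '%s\n' "$doubled"
```

Only the first input line is printed, since it is the only one containing a doubled word (‘is is’).
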
In the s command, back-references can be
used in the replacement part to refer back to subexpressions in
the regexp part.
The following example uses two subexpressions in the regular
expression to match two space-separated words. The back-references in
the replacement part print the words in a different order:

$ echo "James Bond" | sed -E 's/(.*) (.*)/The name is \2, \1 \2./'
The name is Bond, James Bond.

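The same reordering technique applies to any input with fixed fields. As a small sketch (the date value is illustrative, and a sed that accepts -E is assumed), three subexpressions capture the parts of an ISO-style date and the replacement emits them in reverse:

```shell
# Three subexpressions capture year, month and day; the replacement
# emits them in day/month/year order. '#' is used as the s delimiter
# so the slashes in the replacement need no escaping.
reordered=$(echo "2024-01-15" | sed -E 's#([0-9]{4})-([0-9]{2})-([0-9]{2})#\3/\2/\1#')
echo "$reordered"
```
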
When used with alternation, if the group does not participate in the
match then the back-reference makes the whole match fail. For
example, ‘a(.)|b\1’ will not match ‘ba’. When multiple
regular expressions are given with -e or from a file
(‘-f file’), back-references are local to each expression.