Back-references and Subexpressions (sed, a stream editor)
5.7 Back-references and Subexpressions
Back-references are regular expression commands which refer to a previous part of the matched regular expression. Back-references are specified with a backslash and a single digit (e.g. ‘\1’). The part of the regular expression they refer to is called a subexpression, and is designated with parentheses.
Back-references and subexpressions are used in two cases: in the regular expression search pattern, and in the replacement part of the s command (see Regular Expression Addresses and The "s" Command).
In a regular expression pattern, back-references are used to match the same content as a previously matched subexpression. In the following example, the subexpression is ‘.’, which matches any single character (being surrounded by parentheses makes it a subexpression). The back-reference ‘\1’ asks to match the same content (the same character) as the subexpression.
The command below matches three-letter words consisting of any character, followed by the letter ‘o’, followed by the same character as the first.
$ sed -E -n '/^(.)o\1$/p' /usr/share/dict/words
bob
mom
non
pop
sos
tot
wow
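The same pattern can also be written without -E. The sketch below assumes a small made-up word list in place of /usr/share/dict/words and compares the extended syntax, where the subexpression is written ‘(.)’, with the basic syntax, where the parentheses are escaped as ‘\(.\)’. The variable names are illustrative only.

```shell
# Hypothetical sample input: one word per line.
words='bob
cat
mom
dog'

# Extended syntax (-E): plain parentheses mark the subexpression.
ere=$(printf '%s\n' "$words" | sed -E -n '/^(.)o\1$/p')

# Basic syntax (no -E): the parentheses are escaped with backslashes.
bre=$(printf '%s\n' "$words" | sed -n '/^\(.\)o\1$/p')

printf '%s\n' "$ere"
printf '%s\n' "$bre"
```

Both commands select the same lines; only the way the subexpression is written differs.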
Multiple subexpressions are automatically numbered from left to right. This command searches for 6-letter palindromes (the first three letters are 3 subexpressions, followed by 3 back-references in reverse order):
$ sed -E -n '/^(.)(.)(.)\3\2\1$/p' /usr/share/dict/words
redder
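The numbering can be tried out without a dictionary file. The following sketch uses a short, made-up word list and a two-group variant of the palindrome pattern; ‘\2’ refers to the second parenthesized group and ‘\1’ to the first.

```shell
# Hypothetical sample input; any word list would do.
# (.)(.) captures the first two letters as groups 1 and 2;
# \2\1 then requires them again in reverse order.
result=$(printf '%s\n' noon abba abcd deed sed | sed -E -n '/^(.)(.)\2\1$/p')

printf '%s\n' "$result"
```

Only the 4-letter palindromes in the input should be printed; ‘abcd’ and the 3-letter ‘sed’ are skipped.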
In the s command, back-references can be used in the replacement part to refer back to subexpressions in the regexp part.
The following example uses two subexpressions in the regular expression to match two space-separated words. The back-references in the replacement part print the words in a different order:
$ echo "James Bond" | sed -E 's/(.*) (.*)/The name is \2, \1 \2./'
The name is Bond, James Bond.
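Reordering fields is a common use of back-references in the replacement. As a minimal sketch, assuming a made-up year-month-day date string as input, the command below captures the three fields and emits them as day/month/year. The ‘#’ delimiter is chosen here only so that the ‘/’ characters in the replacement need no escaping.

```shell
# Hypothetical input: a date in year-month-day form.
date_in='2024-01-15'

# \1 = year, \2 = month, \3 = day.
date_out=$(printf '%s\n' "$date_in" |
  sed -E 's#([0-9]{4})-([0-9]{2})-([0-9]{2})#\3/\2/\1#')

printf '%s\n' "$date_out"
```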
When used with alternation, if the group does not participate in the match then the back-reference makes the whole match fail. For example, ‘a(.)|b\1’ will not match ‘ba’. When multiple regular expressions are given with -e or from a file (‘-f file’), back-references are local to each expression.