Multiline techniques (sed, a stream editor)
6.3 Multiline techniques - using D,G,H,N,P to process multiple lines
Multiple lines can be processed as one buffer using the D, G, H, N and P commands. They are similar to their lowercase counterparts (d, g, h, n, p), except that they append or remove data while respecting embedded newlines, which allows adding lines to and removing lines from the pattern and hold spaces.
They operate as follows:
- D: deletes text from the pattern space up to and including the first newline, then restarts the cycle with the remaining text without reading new input. If the pattern space contains no newline, it behaves like d.
- G: appends a newline followed by the contents of the hold space to the pattern space.
- H: appends a newline followed by the contents of the pattern space to the hold space.
- N: appends a newline followed by the next line of input to the pattern space.
- P: prints the pattern space up to the first newline.
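As a rough illustration of G, H and P in isolation, the sketch below wraps three small sed scripts in shell functions. The function names (double_space, join_lines, odd_lines) are hypothetical and chosen only for this example; the scripts assume GNU sed, where \n in a regular expression matches an embedded newline.

```shell
# Hypothetical helper names; only the sed scripts matter.

# G: the hold space starts empty, so appending it after each line
# adds a newline plus nothing, which double-spaces the input.
double_space() { sed G; }

# H: collect every line in the hold space; on the last line, swap it
# back in, turn the embedded newlines into commas and drop the leading one.
join_lines() { sed -n 'H;${x;s/\n/,/g;s/^,//;p;}'; }

# N and P: N joins each line with the one after it, P prints only the
# part before the first newline, so every second line is dropped.
odd_lines() { sed -n 'N;P'; }

seq 3 | double_space
seq 3 | join_lines    # prints 1,2,3
seq 6 | odd_lines     # prints 1, 3 and 5 on separate lines
```

Note that join_lines has to strip a leading comma because H always puts a newline before the text it appends, even when the hold space is empty.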
The following example illustrates the operation of N and D commands:
$ seq 6 | sed -n 'N;l;D'
1\n2$
2\n3$
3\n4$
4\n5$
5\n6$
- sed starts by reading the first line into the pattern space (i.e. ‘1’).
- At the beginning of every cycle, the N command appends a newline and the next line to the pattern space (i.e. ‘1’, ‘\n’, ‘2’ in the first cycle).
- The l command prints the content of the pattern space unambiguously.
- The D command then removes the content of pattern space up to the first newline (leaving ‘2’ at the end of the first cycle).
- At the next cycle the N command appends a newline and the next input line to the pattern space (e.g. ‘2’, ‘\n’, ‘3’).
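The same N and D pairing can act as a two-line sliding window over the input. The sketch below is one possible use of that idea, assuming GNU sed: it collapses runs of identical adjacent lines, roughly like uniq. The function name squeeze_dups is hypothetical.

```shell
# Hypothetical helper name; emulates `uniq` for adjacent duplicate lines.
#   $!N                 append the next line, unless already on the last line
#   /^\(.*\)\n\1$/!P    print the first line only if it differs from the second
#   D                   drop the first line and restart with the remainder
squeeze_dups() { sed '$!N;/^\(.*\)\n\1$/!P;D'; }

printf 'a\na\nb\nb\nb\nc\n' | squeeze_dups
# prints a, b and c on separate lines
```

The $! address keeps N from running on the last line, so the final line is still handled by P before D ends the script.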
A common technique to process blocks of text such as paragraphs (instead of line-by-line) is using the following construct:
sed '/./{H;$!d} ; x ; s/REGEXP/REPLACEMENT/'
- The first expression, /./{H;$!d}, operates on all non-empty lines and appends the current line (in the pattern space) to the hold space. On all lines except the last, the pattern space is then deleted and the cycle is restarted.
- The other expressions, x and s, are executed only on empty lines (i.e. paragraph separators) and on the last line of input. The x command exchanges the pattern and hold spaces, bringing the accumulated lines back into the pattern space. Because H puts a newline before each line it appends, the accumulated text starts with a newline. The s/// command then operates on all the text in the paragraph (including the embedded newlines).
The following example demonstrates this technique:
$ cat input.txt
a a a aa aaa
aaaa aaaa aa
aaaa aaa aaa

bbbb bbb bbb
bb bb bbb bb
bbbbbbbb bbb

ccc ccc cccc
cccc ccccc c
cc cc cc cc
$ sed '/./{H;$!d} ; x ; s/^/\nSTART-->/ ; s/$/\n<--END/' input.txt

START-->
a a a aa aaa
aaaa aaaa aa
aaaa aaa aaa
<--END

START-->
bbbb bbb bbb
bb bb bbb bb
bbbbbbbb bbb
<--END

START-->
ccc ccc cccc
cccc ccccc c
cc cc cc cc
<--END
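The paragraph construct works with any substitution in place of REGEXP and REPLACEMENT. As a minimal sketch, assuming GNU sed and paragraphs separated by single empty lines, the following joins each paragraph into one line. The function name join_paragraphs is hypothetical.

```shell
# Hypothetical helper name; joins each blank-line-separated paragraph
# into a single line.
#   /./{H;$!d;}   collect non-empty lines in the hold space
#   x             on a separator (or the last line), fetch the paragraph
#   s/^\n//       drop the leading newline that H added
#   s/\n/ /g      replace the remaining embedded newlines with spaces
join_paragraphs() { sed '/./{H;$!d;};x;s/^\n//;s/\n/ /g'; }

printf 'a a\naa\n\nb b\nbb\n' | join_paragraphs
# prints:
# a a aa
# b b bb
```

The x command also leaves the empty separator line in the hold space, so each new paragraph starts accumulating from an empty buffer.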
For more annotated examples, see Text search across multiple lines and Line length adjustment.