Why can I not group sed commands after an address in a block?
To output all lines of a file up to, but not including, the first line that matches a particular pattern, you may use
sed -n '/PATTERN/q; p;' file
Here, the default output of the pattern space at the end of each cycle is disabled with -n. Instead we explicitly output each line with p. If the given pattern matches, we halt processing with q.
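As a quick check of that behaviour, here is a minimal sketch on made-up input. The sample lines and the word STOP (standing in for PATTERN) are assumptions for illustration only.

```shell
# Hypothetical input: four lines, with STOP playing the role of PATTERN.
out=$(printf '%s\n' one two STOP three | sed -n '/STOP/q; p')

# Only the lines before the first match are printed; the matching
# line itself is not, because -n suppresses the automatic output on q.
printf '%s\n' "$out"
```

This should print one and two, and nothing else.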
Your actual, longer, command, which changes the name of chromosome 21 from just 21 to chr21 on the first line of a fasta file, and then proceeds to extract the DNA for that chromosome until it hits the next fasta header line, may be written as
sed -n -e '1 { s/^>21/>chr21/p; d; }' \
-e '/^>/q' \
-e p <in.fasta >out.fasta
or
sed -n '1 { s/^>21/>chr21/p; d; }; /^>/q; p' <in.fasta >out.fasta
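To see what this does, the one-line variant can be run on a small made-up input. The two records and their sequence lines here are invented for the example and are not taken from your data.

```shell
# Hypothetical two-record FASTA input fed through a pipe instead of in.fasta.
fasta_out=$(
    printf '%s\n' '>21 demo' ACGT TTGA '>22 demo' GGCC |
    sed -n '1 { s/^>21/>chr21/p; d; }; /^>/q; p'
)

# Line 1 is renamed and printed, the sequence lines are printed by p,
# and processing stops at the next header line without printing it.
printf '%s\n' "$fasta_out"
```

The output should be the renamed header >chr21 demo followed by ACGT and TTGA.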
The issue with your original expression is that the d starts a new cycle (i.e., it forces the next line to be read into the pattern space and there's a jump to the start of the script). This means q would never be executed.
Note that to be syntactically correct on non-GNU systems, your original script should look like /PATTERN/ { d; q; }. Note the added ; after q (the spaces are not significant).
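The effect of d inside the block can be observed directly. This is a small sketch on invented input, again with STOP standing in for PATTERN.

```shell
# Hypothetical input; the block is written in the portable form with the final ;.
out=$(printf '%s\n' one two STOP three | sed '/STOP/ { d; q; }')

# d deletes the matching line and starts the next cycle at once,
# so q is never reached and sed carries on to the end of the input.
printf '%s\n' "$out"
```

If q were ever executed, the output would end at the matching line. Instead, one, two and three are all printed, and only the STOP line is missing.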
d does not just delete the pattern space: from the POSIX specification
[2addr]d
Delete the pattern space and start the next cycle.
(my emphasis)
The q command is unreachable.