How to sed -e 's///' everything except a specific pattern?
You might be better off using grep -o in this case:
grep -oP '\B%[0-9]{1,3}\b' inputfile
This assumes that your version of grep supports Perl-compatible regular expressions (-P). Otherwise:
grep -o '\B%[0-9]\{1,3\}\b' inputfile
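As a quick check, here is a minimal sketch that runs the non-PCRE variant on the two sample lines quoted further down in this thread. The function name extract_tokens is only an illustrative helper, and the sketch assumes GNU grep, since -o, \B and \b are not POSIX.

```shell
# Hypothetical helper: print every %N token (1 to 3 digits) found on stdin.
# Assumes GNU grep: -o, \B and \b are extensions, not POSIX.
extract_tokens() {
    grep -o '\B%[0-9]\{1,3\}\b'
}

printf '%s\n' \
    '1: [18x14] [history 1/2000, 268 bytes] %3' \
    '2: [18x14] [history 1/2000, 268 bytes] %4 (active)' |
    extract_tokens
```

With GNU grep this should print %3 and %4, one per line.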
Using GNU sed, one could transliterate spaces to newlines and get the desired lines:
sed 'y/ /\n/' inputfile | sed '/^%[0-9]\{1,\}/!d'
$ sed 's/^.*\(%[0-9]\+\).*$/\1/' input
This assumes that a line contains at most one of those %123 tokens and that every line contains such a token.
The \( \) metacharacters mark a match group, which is then referenced in the substitution via the \1 back-reference. ^ and $ match the beginning and end of a line. Note that \+ is a GNU extension in basic regular expressions.
Otherwise you can pre-filter the input, e.g.:
$ grep '%[0-9]\+' input | sed 's/^.*\(%[0-9]\+\).*$/\1/'
(when not all lines contain such a token)
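The pre-filter can also be folded into sed itself. The following is a minimal sketch, assuming GNU sed (for \+); the function name last_token and the input lines are made up for illustration.

```shell
# -n suppresses automatic printing; the p flag prints a line only when
# the substitution succeeded, so lines without a token are dropped.
# Hypothetical helper name; assumes GNU sed for \+.
last_token() {
    sed -n 's/^.*\(%[0-9]\+\).*$/\1/p'
}

printf '%s\n' 'x %3' 'no token' 'y %42 (active)' | last_token
```

This should print %3 and %42. Because the leading .* is greedy, a line with several tokens yields only the last one.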
Another variant:
$ sed 's/\(%[0-9]\+\)/\n\1\n/g' input | grep '%[0-9]'
(when a line may contain multiple of those tokens)
Here, line breaks are inserted directly before and after each token in the first part of the pipe. The grep part then removes all lines that are not %123 tokens.
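A minimal sketch of the multi-token case, assuming GNU sed (for \+ and for \n in the replacement). The input line is made up, and the grep pattern is anchored here, which is slightly stricter than the unanchored one above.

```shell
# Made-up input with two tokens on one line (not from the original question).
line='a %12 b %7 c'

# Put each token on its own line, then keep only lines that are a token.
tokens=$(printf '%s\n' "$line" |
    sed 's/\(%[0-9]\+\)/\n\1\n/g' |
    grep '^%[0-9]\+$')

printf '%s\n' "$tokens"
```

This should print %12 and %7, one per line.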
When working with sed it's almost always advisable to:
/address then/s/earch/replace/
There are two reasons for this. The first is that with multiple lines, /addressing/ is faster: it is optimized only to find a match and doesn't bother selecting portions of a line for editing, so it can narrow the results sooner.
The second reason is that you can play multiple edit operations off of the same address - it makes things much easier.
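To illustrate the second point, here is a small sketch in which two substitutions share one address by grouping them in braces. The function name strip_around_token and the second input line are made up for illustration.

```shell
# Only lines matching the address /%[0-9]/ are edited; both s commands
# inside the braces run off that single address.
# Hypothetical helper name, for illustration only.
strip_around_token() {
    sed '/%[0-9]/{
        s/^[^%]*//
        s/[^0-9]*$//
    }'
}

printf '%s\n' \
    '2: [18x14] [history 1/2000, 268 bytes] %4 (active)' \
    'a line without a token' |
    strip_around_token
```

The first line should come out as %4, while the second passes through unchanged because it never matches the address.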
Of course, in this case, given only the data you show, it makes no practical difference. Still, this is how I would do the thing you ask about:
sed '/^[^%]*\|[^0-9]*$/s///g' <<\DATA
1: [18x14] [history 1/2000, 268 bytes] %3
2: [18x14] [history 1/2000, 268 bytes] %4 (active)
DATA
#OUTPUT
%3
%4
It just selects, in the address, all non-% characters from the beginning of the line and all non-numeric characters from the end of the line, and then removes them with s///, where the empty pattern reuses the last regular expression applied (here, the address). And that's that.
In its current form it might mangle data in unexpected ways if you feed it lines not containing a %digit combo, and that's why addressing is important. If we alter it a little:
/%[0-9]/s/^[^%]*\|[^0-9]*$//g
It gets safer and faster.
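The difference shows up on a line that has no token. This is a minimal sketch, assuming GNU sed (for \| in a basic regex) and a made-up input line; the first alternative is anchored with ^ in both commands.

```shell
# Made-up line with no %digit token.
sample='plain text, no token here'

# Unaddressed form: the line is emptied, because ^[^%]* matches all of it.
unsafe=$(printf '%s\n' "$sample" | sed 's/^[^%]*\|[^0-9]*$//g')

# Addressed form: the line has no %digit, so it passes through unchanged.
safe=$(printf '%s\n' "$sample" | sed '/%[0-9]/s/^[^%]*\|[^0-9]*$//g')

printf 'unsafe: [%s]\nsafe:   [%s]\n' "$unsafe" "$safe"
```

The unaddressed command should leave nothing between the brackets, while the addressed one keeps the line intact.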