grep string between backticks
With a GNU grep that supports the PCRE extension (-P):
grep -Po '(?<=`)[^`]*(?=`)' infile
Or, to return only a and c from, for example, `a`b`c`, you would do:
grep -Po '(?<=`)[^`]+(?=`(?:[^`]*`[^`]*`)*[^`]*$)' infile
This returns only the text between each properly paired opening and closing backtick.
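To see how the two expressions differ, here is a small sketch run against the sample string from above. It assumes a GNU grep built with PCRE support; the variable names are only for illustration.

```shell
# Sample input: four backticks, i.e. two pairs, around a and c.
input='`a`b`c`'

# First expression: anything that sits between any two backticks.
all=$(printf '%s\n' "$input" | grep -Po '(?<=`)[^`]*(?=`)')

# Second expression: only what sits inside properly paired backticks.
paired=$(printf '%s\n' "$input" | grep -Po '(?<=`)[^`]+(?=`(?:[^`]*`[^`]*`)*[^`]*$)')

printf 'all:\n%s\npaired:\n%s\n' "$all" "$paired"
```

The first command reports a, b and c, because b also has a backtick on either side; the second drops b, since the backtick that follows it is not followed by an even number of further backticks.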
Tips:
(?<=...): positive look-behind.
(?=...): positive look-ahead.
(?:...): non-capturing group.
[^`]*: zero or more characters other than a backtick.
With awk:
awk -F'`' '{ for(i=2; i<=NF; i+=2) print $i; }' infile
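As a quick illustration of why stepping through the even-numbered fields works: splitting on the backtick puts the text outside the backticks in the odd fields and the text inside them in the even ones. The sample line below is made up.

```shell
# Made-up sample line with two backticked strings.
line='run `ls -l` and then `pwd` here'

# Fields when splitting on the backtick:
#   $1='run '  $2='ls -l'  $3=' and then '  $4='pwd'  $5=' here'
result=$(printf '%s\n' "$line" | awk -F'`' '{ for (i = 2; i <= NF; i += 2) print $i }')

printf '%s\n' "$result"
```

Note that a line with an odd number of backticks would also print the text after the last, unpaired backtick. If that matters, changing the loop condition to i < NF should skip it, since a properly closed string always has another field after it.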
grep, by default, returns the lines matching the expression given. Your expression would therefore return the whole line, as the line matches the expression.
Instead of using grep, we could use sed to edit the lines that contain the string we're after.
Assuming that there is only ever a single backticked string in the line, we can chop off everything before the first backtick, and after the second:
$ sed 's/[^`]*`//; s/`.*//' <file.txt
someString
Since the backtick is not special in regular expressions, it does not need to be escaped. The backtick is special to the shell (it introduces a command substitution), but since we use a single-quoted string, the shell will not treat it as special.
One could also match the string between the backticks in a capture group (\(...\) in sed), and then replace the whole line with it (but personally, I think it makes it a bit harder to read):
$ sed 's/[^`]*`\([^`]*\)`.*/\1/' <file.txt
someString
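Both sed commands print lines without any backticks unchanged. If such lines may occur in the file, one possible variation (a sketch, not part of the commands above) is to suppress the default output with -n and print only the lines where the substitution succeeded, using the p flag. The sample input is made up.

```shell
# Made-up input: only the second line has a backticked string.
input='no backticks here
`someString` and some trailing text'

# -n turns off automatic printing; the p flag prints a line only
# when the substitution actually matched a pair of backticks.
result=$(printf '%s\n' "$input" | sed -n 's/[^`]*`\([^`]*\)`.*/\1/p')

printf '%s\n' "$result"
```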
All uses of [^`] in the above expressions mean "some character, which is not a backtick".
A totally different approach would be to first remove all backticks, and then remove everything from the first blank character onward in the line:
$ tr -d '`' <file.txt | sed 's/[[:blank:]].*//'
someString
The [[:blank:]] character class matches a single space or tab character.
This obviously assumes that the backticked string is a single word with no embedded blank characters, and that it occurs first on the line, as in the example in the question.
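A short sketch of where that assumption matters, using made-up input lines: the pipeline works for a single backticked word at the start of the line, but cuts the result short as soon as the backticked string itself contains a space.

```shell
# Works: the backticked string is a single word at the start of the line.
ok=$(printf '%s\n' '`someString` trailing text' | tr -d '`' | sed 's/[[:blank:]].*//')

# Cut short: the backticked string contains a space (made-up input).
cut_short=$(printf '%s\n' '`some string` trailing text' | tr -d '`' | sed 's/[[:blank:]].*//')

printf '%s\n%s\n' "$ok" "$cut_short"
```

The second result is only "some", because everything from the first blank onward is removed.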
Turning back to grep... If your grep has the non-standard (but commonly implemented) -o option to return only the bits that matched the expression from each line, then we can use that like so:
$ grep -o '`[^`]*`' <file.txt | tr -d '`'
someString
This first gets us the backticked string, and then the backticks are removed. This would behave differently from the approaches with sed if there were more than one backticked string per line. In that case, this would give you back each such string on a line of its own, while the sed variations would give you back only the first backticked string.
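The difference can be seen side by side in the following sketch, which assumes a grep with the -o option and uses a made-up line holding two backticked strings:

```shell
# Made-up line with two backticked strings.
line='first `foo` then `bar` last'

# grep -o prints each match on its own line; tr then strips the backticks.
with_grep=$(printf '%s\n' "$line" | grep -o '`[^`]*`' | tr -d '`')

# sed keeps only the first backticked string of the line.
with_sed=$(printf '%s\n' "$line" | sed 's/[^`]*`//; s/`.*//')

printf 'grep -o:\n%s\nsed:\n%s\n' "$with_grep" "$with_sed"
```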
Use cut:
cut -d'`' -f2 < file.txt
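Like the sed variants, this returns only the first backticked string on each line. cut also passes lines that contain no delimiter through unchanged; the standard -s option suppresses those lines. A small sketch with made-up input:

```shell
# Made-up input: the first line has no backtick at all.
input='plain line
`someString` trailing text'

# Without -s, the first line is printed as-is, since it has no delimiter.
without_s=$(printf '%s\n' "$input" | cut -d'`' -f2)

# With -s, lines that lack the delimiter are skipped.
with_s=$(printf '%s\n' "$input" | cut -s -d'`' -f2)

printf 'without -s:\n%s\nwith -s:\n%s\n' "$without_s" "$with_s"
```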