Grep the string between backticks

With a GNU grep that supports PCRE (the -P option):

grep -Po '(?<=`)[^`]*(?=`)' infile

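Or, to return only a and c from `a`b`c`, for example, you would do:

grep -Po '(?<=`)[^`]+(?=`(?:[^`]*`[^`]*`)*[^`]*$)' infile

This returns only the text enclosed by a matched pair of backticks: the look-ahead requires that an even number of backticks follow the closing one, up to the end of the line.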

Tips:
(?<=...): positive look-behind.
(?=...): positive look-ahead.
(?:...): non-capturing group.
[^`]*: zero or more characters, none of which is a backtick.
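
To make the difference between the two patterns concrete, here is a small sketch that runs both on the same sample line. The sample text and the variable names (line, all, paired) are made up for illustration, and it assumes a GNU grep with -P available.

```shell
# Sample line with two backtick pairs; "b" sits between the pairs.
line='`a`b`c`'

# First pattern: anything flanked by backticks, so "b" is reported too.
all=$(printf '%s\n' "$line" | grep -Po '(?<=`)[^`]*(?=`)')

# Second pattern: the closing backtick must be followed by an even
# number of backticks, so only properly paired text is reported.
paired=$(printf '%s\n' "$line" | grep -Po '(?<=`)[^`]+(?=`(?:[^`]*`[^`]*`)*[^`]*$)')

printf 'all:\n%s\npaired:\n%s\n' "$all" "$paired"
```

The first command should list a, b and c on separate lines, while the second should list only a and c.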

With awk:

awk -F'`' '{ for(i=2; i<=NF; i+=2) print $i; }' infile
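
As a rough illustration of why the even-numbered fields are the interesting ones, the sketch below splits a made-up sample line on backticks. The second command is a variation that is not part of the answer above: using i < NF instead of i <= NF is one possible way to ignore a last field that has an opening backtick but no closing one.

```shell
# Even-numbered fields are the ones that follow an opening backtick.
out=$(printf '%s\n' 'x `foo` y `bar` z' |
  awk -F'`' '{ for (i = 2; i <= NF; i += 2) print $i }')

# Variation (an assumption, not from the answer above): i < NF skips
# a trailing field that has no closing backtick.
strict=$(printf '%s\n' 'x `foo` y `bar' |
  awk -F'`' '{ for (i = 2; i < NF; i += 2) print $i }')

printf '%s\n---\n%s\n' "$out" "$strict"
```

With the original loop condition, the unterminated `bar in the second sample would be printed as well.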

grep, by default, returns the lines matching the expression given. Your expression therefore returns the whole line, since the line contains a match.

Instead of using grep, we could use sed to edit the lines that contain the string we're after.

Assuming that there is only ever a single backticked string in the line, we can chop off everything before the first backtick, and after the second:

$ sed 's/[^`]*`//; s/`.*//' <file.txt
someString

Since the backtick is not special in regular expressions, it does not need to be escaped. The backtick is special to the shell (it introduces a command substitution), but since we use a single-quoted string, the shell will not treat it as special.

One could also match the string between the backticks in a capture group (\(...\) in sed), and then replace the whole line with it (but personally, I think it makes it a bit harder to read):

$ sed 's/[^`]*`\([^`]*\)`.*/\1/' <file.txt
someString

All uses of [^`] in the above expressions mean "some character, which is not a backtick".
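
One thing worth knowing about both sed commands is that a line with no backticks at all passes through unchanged. The sketch below shows this on a made-up two-line input and adds a variation (not part of the answer above) that uses sed's -n option together with the p flag to print only the lines where the substitution happened.

```shell
# Two-line sample: only the first line has a backticked string.
input=$(printf '%s\n' '`someString` and more' 'no backticks here')

# As written above, sed passes the second line through unchanged.
plain=$(printf '%s\n' "$input" | sed 's/[^`]*`\([^`]*\)`.*/\1/')

# Variation: -n suppresses the default output, and the p flag prints
# only the lines on which the substitution succeeded.
only=$(printf '%s\n' "$input" | sed -n 's/[^`]*`\([^`]*\)`.*/\1/p')

printf '%s\n---\n%s\n' "$plain" "$only"
```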

A totally different approach would be to first remove all backticks, and then remove everything from the blank space onward in the line:

$ tr -d '`' <file.txt | sed 's/[[:blank:]].*//'
someString

The [[:blank:]] character class matches a single space or tab character.

This obviously assumes that the backticked string is a single word with no embedded blank characters, and that it occurs first on the line, as in the example in the question.
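
A quick sketch of that assumption in action, using two made-up sample lines: the pipeline gives the expected word when the backticked string comes first, and the wrong word when other text precedes it.

```shell
# Works when the backticked word starts the line...
ok=$(printf '%s\n' '`someString` is set' |
  tr -d '`' | sed 's/[[:blank:]].*//')

# ...but returns the wrong word when other text comes first.
wrong=$(printf '%s\n' 'value `someString` is set' |
  tr -d '`' | sed 's/[[:blank:]].*//')

printf '%s\n%s\n' "$ok" "$wrong"
```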

Turning back to grep... If your grep has the non-standard (but commonly implemented) -o option, which returns only the parts of each line that matched the expression, then we can use it like so:

$ grep -o '`[^`]*`' <file.txt | tr -d '`'
someString

This first gets us the backticked string, and then the backticks are removed. This would behave differently from the approaches with sed if there were more than one backticked string per line. In that case, this would give you back each such string on a line of its own, while the sed variations would give you back only the first backticked string.
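
The difference can be seen on a line with two backticked strings. This is only a sketch with a made-up sample line, and the variable names are arbitrary.

```shell
line='run `foo` then `bar` now'

# grep -o emits every match on its own line; tr strips the backticks.
each=$(printf '%s\n' "$line" | grep -o '`[^`]*`' | tr -d '`')

# The sed version keeps only the first backticked string.
first=$(printf '%s\n' "$line" | sed 's/[^`]*`//; s/`.*//')

printf '%s\n---\n%s\n' "$each" "$first"
```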

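Use cut with the backtick as the delimiter and print the second field, which is the first backticked string on each line:

cut -d'`' -f2 <file.txt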
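
A short sketch of how cut behaves on a made-up two-line input. Note that cut prints a line unchanged when it contains no delimiter; the -s option, shown here as an optional addition, suppresses such lines.

```shell
input=$(printf '%s\n' 'x `foo` y `bar` z' 'no backticks here')

# Field 2 is the first backticked string; a line without the
# delimiter is printed unchanged.
plain=$(printf '%s\n' "$input" | cut -d'`' -f2)

# -s suppresses lines that contain no delimiter at all.
strict=$(printf '%s\n' "$input" | cut -s -d'`' -f2)

printf '%s\n---\n%s\n' "$plain" "$strict"
```

Like the sed commands, this only ever returns the first backticked string on a line.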
