How to remove a word prefix using grep?

As others have noted, grep is not well suited to this task. sed is a good option, or, if the text is well ordered, a simple cut might be easier to type:

echo www.abc.com | cut -d. -f2-

-d. tells cut to use . as the delimiter.
-f2- tells cut to return everything from field 2 through the last field.
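
One caveat worth seeing in action: cut drops the first field whatever it contains, so a line that has no www prefix loses its first label too. A minimal sketch, where the function name strip_first_field is made up for illustration:

```shell
# Hypothetical helper: drop everything up to and including the first dot
# on each input line.
strip_first_field() {
    cut -d. -f2-
}

printf '%s\n' 'www.abc.com' 'abc.com' | strip_first_field
# prints:
# abc.com
# com
```

So cut is only a good fit when every line is known to start with the prefix.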

with grep's `--only-matching` and `\K`

You can do this with grep's --only-matching flag, combined with --perl-regexp so that \K is available (the \. matches a literal dot):

echo "www.abc.com" | grep --perl-regexp --only-matching 'www\.\K.*'

which can be shortened to

echo "www.abc.com" | grep -Po 'www\.\K.*'

Both commands produce

abc.com

with grep (GNU grep) 3.3.

Instead of echo, I'll use a here string to shorten the command further:

grep -Po 'www\.\K.*' <<< "www.abc.com"

\K resets the starting point of the match, essentially forgetting the already matched "www.", so only the text after it is printed.
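
To see how this behaves on more than one line, here is a small sketch. It assumes GNU grep built with PCRE support, and the function name after_www is made up for illustration:

```shell
# Hypothetical helper: print what follows a literal "www." on each line.
# Requires GNU grep with -P (PCRE) support.
after_www() {
    grep -Po 'www\.\K.*'
}

printf '%s\n' 'www.abc.com' 'ftp.abc.com' 'wwwXabc.com' | after_www
# prints only:
# abc.com
```

Lines that do not contain the prefix are dropped entirely, because grep prints nothing for lines without a match.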

with grep's positive lookbehind

You can also do this with a positive lookbehind:

grep -Po '(?<=www\.).*' <<< "www.abc.com"

with awk's field separator `-F`

awk -F 'www[.]' '$2{print $2}' <<< "www.abc.com"

This prints

abc.com

The $2{print $2} part prints the second field only if it is non-empty. This is necessary with multi-line input to avoid printing blank lines for input lines that don't contain the field separator.
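
The effect of that guard is easier to see with several lines of input. A minimal sketch, where the function name awk_after_www and the sample lines are made up for illustration:

```shell
# Hypothetical helper: print the text after a literal "www." on each line,
# skipping lines where the second field is empty.
awk_after_www() {
    awk -F 'www[.]' '$2 { print $2 }'
}

printf '%s\n' 'www.abc.com' 'no prefix here' 'www.example.org' | awk_after_www
# prints:
# abc.com
# example.org
```

Without the $2 pattern in front of the action, the middle line would produce an empty output line.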

grep is not the tool for editing strings in a Unix shell; it is normally used to select or filter out lines of text. Use sed instead:

$ echo www.example.com | sed 's/^[^\.]\+\.//'
example.com

You'll need to learn regular expressions to use it effectively.

sed can also edit a file in place (modifying the file itself) if you pass the -i option. Be careful: a wrong sed command combined with -i can easily destroy data.
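
A safer way to try -i is to give it a backup suffix, so the original survives under a second name. A minimal sketch on a scratch file; the temporary file and the slightly simplified pattern are my own choices for illustration:

```shell
# Work on a scratch copy so nothing important is touched.
tmpfile=$(mktemp)
printf '%s\n' 'www.example.com' > "$tmpfile"

# -i.bak edits the file in place and keeps the original as "$tmpfile.bak".
sed -i.bak 's/^[^.]*\.//' "$tmpfile"

cat "$tmpfile"        # example.com
cat "$tmpfile.bak"    # www.example.com
```

If the result looks wrong, the .bak file can simply be copied back.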

An example

From your comments, I guess you have a TeX document and you want to remove the first part of all .com domain names. Suppose this is your document, test.tex:

\documentclass{article}
\begin{document}
www.example.com
example.com www.another.domain.com
\end{document}

then you can transform it with this sed command (redirect output to file or edit in-place with -i):

$ sed 's/\([a-z0-9-]\+\.\)\(\([a-z0-9-]\+\.\)\+com\)/\2/gi' test.tex 
\documentclass{article}
\begin{document}
example.com
example.com another.domain.com
\end{document}

Please note that:

A run of allowed characters followed by a dot is matched by [a-z0-9-]\+\.
I used groups in the regular expression (the parts within \( and \)) to mark the first and the second part of the domain name, and I replace the entire match with its second group (\2 in the replacement).
The domain must be at least a third-level .com domain (every \+ repetition means at least one match).
The search is case-insensitive (the i flag at the end).
It can make more than one replacement per line (the g flag at the end).
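
If the backslashes get hard to read, the same substitution can be written with extended regular expressions. This is only a sketch: it assumes GNU sed (the I flag for case-insensitive matching is a GNU extension), and the function name strip_com_prefix is made up for illustration:

```shell
# Same substitution with -E, so groups and + need no backslashes.
# Assumes GNU sed; the I flag makes the match case-insensitive.
strip_com_prefix() {
    sed -E 's/([a-z0-9-]+\.)(([a-z0-9-]+\.)+com)/\2/gI'
}

printf '%s\n' 'www.example.com' 'example.com www.another.domain.com' | strip_com_prefix
# prints:
# example.com
# example.com another.domain.com
```

The bare example.com on the second line is left alone because it has no third-level part to strip.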

You can do this using grep easily:

$ echo www.google.com | grep -o '[^.]*\.com'
google.com

To process a file instead of the output of echo, pass the file to grep:

$ grep -o '[^.]*\.com$' < file

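The regular expression here is '[^.]*\.com'. It means: find a run of characters with no . in it ([^.]*), followed by a literal .com (\.com in the regular expression). The -o option tells grep to print only the part of the line that matched. In the file example, the trailing $ additionally anchors the match to the end of the line.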
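
Note that this keeps only the last label before .com, which is not the same as removing just the first label. A small sketch comparing the two patterns on a line with two domains (the sample line and the function names are made up for illustration):

```shell
# Hypothetical helpers wrapping the two grep patterns.
com_anywhere() { grep -o '[^.]*\.com'; }
com_at_end()   { grep -o '[^.]*\.com$'; }

line='example.com www.another.domain.com'

printf '%s\n' "$line" | com_anywhere
# prints:
# example.com
# domain.com

printf '%s\n' "$line" | com_at_end
# prints:
# domain.com
```

For www.another.domain.com the result is domain.com, so a multi-level name loses more than its first label.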
