Using grep vs awk
grep will most likely be faster:
# time awk '/USAGE/' imapd.log.1 | wc -l
73832
real 0m2.756s
user 0m2.740s
sys 0m0.020s
# time grep 'USAGE' imapd.log.1 | wc -l
73832
real 0m0.110s
user 0m0.100s
sys 0m0.030s
awk is an interpreted programming language, whereas grep is a compiled C program (which is additionally optimized for finding patterns in files).
(Note: I ran both commands twice so that file caching would not skew the results.)
More details about interpreted languages are available on Wikipedia.
As Stephane has rightly pointed out in the comments, your mileage may vary depending on the grep and awk implementations you use, the operating system they run on, and the character set you are processing.
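Since results depend on your system, it may be worth repeating the comparison yourself. The following is only a rough sketch, assuming bash (where `time` is a shell keyword that reports on stderr). The generated log file and its contents are made up for illustration and stand in for a real file such as imapd.log.1.

```shell
#!/usr/bin/env bash
# Build a throwaway log so the sketch is self-contained.
# (File contents are invented; substitute your own log file if you have one.)
log=$(mktemp)
i=0
while [ "$i" -lt 20000 ]; do
    echo "line $i USAGE user=demo"
    echo "line $i LOGIN user=demo"
    i=$((i + 1))
done > "$log"

# Warm-up pass: read the file once so both timed runs hit the page cache.
cat "$log" > /dev/null

# Sanity check: both tools should report the same number of matching lines.
grep_count=$(grep 'USAGE' "$log" | wc -l)
awk_count=$(awk '/USAGE/' "$log" | wc -l)
echo "grep: $grep_count  awk: $awk_count"

# Timed runs, using the same pipelines as in the transcript above.
time grep 'USAGE' "$log" | wc -l > /dev/null
time awk '/USAGE/' "$log" | wc -l > /dev/null

rm -f "$log"
```

On a file this small the absolute times will be tiny, so point `log` at a large real file to see a meaningful difference.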
Use the most specific and expressive tool. The tool that best fits your use case is likely to be the fastest.
As a rough guide:
- searching for lines matching a substring or regexp? Use grep.
- selecting certain columns from a simply-delimited file? Use cut.
- performing pattern-based substitutions or ... other stuff sed can reasonably do? Use sed.
- need some combination of the above 3, or printf formatting, or general purpose loops and branches? Use awk.
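To make the guide above concrete, here is a small sketch that applies each tool to the same input. The colon-delimited sample data and the variable names are hypothetical, invented only for this illustration.

```shell
# Hypothetical colon-delimited sample data: user:protocol:event:count
data=$(mktemp)
printf '%s\n' 'alice:imap:USAGE:120' 'bob:pop3:LOGIN:0' 'carol:imap:USAGE:45' > "$data"

# grep: select lines matching a substring or regexp.
matches=$(grep 'USAGE' "$data")

# cut: select a column from a simply-delimited file (field 1, ':' delimiter).
users=$(cut -d: -f1 "$data")

# sed: pattern-based substitution.
renamed=$(sed 's/imap/IMAP/' "$data")

# awk: filtering, arithmetic, and printf formatting in a single pass.
report=$(awk -F: '
    $3 == "USAGE" { total += $4; printf "%-6s %4d\n", $1, $4 }
    END           { printf "total  %4d\n", total }
' "$data")

printf '%s\n' "$matches" "$users" "$renamed" "$report"
rm -f "$data"
```

The awk step is the only one that needs a combination of filtering, summing, and formatting, which is why it falls to awk rather than to one of the simpler tools.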
When only searching for strings, and speed matters, you should almost always use grep. It's orders of magnitude faster than awk when it comes to just gross searching.
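For a plain string search, a minimal sketch might look like the following. The sample file and its contents are invented for illustration. `-F` (treat the pattern as a fixed string) and `-c` (count matching lines) are standard grep options. Whether `-F` is actually faster than a regexp search depends on the grep implementation, so treat this as a correctness example rather than a performance claim.

```shell
# Invented sample data for illustration.
sample=$(mktemp)
printf '%s\n' 'quota 1.5 USAGE' 'quota 1x5 USAGE' 'quota 2.0 LOGIN' > "$sample"

# As a regexp, '.' matches any character, so both "1.5" and "1x5" match.
regex_hits=$(grep -c '1.5' "$sample")

# With -F the pattern is a literal string, so only "1.5" matches.
fixed_hits=$(grep -c -F '1.5' "$sample")

echo "regexp: $regex_hits  fixed: $fixed_hits"
rm -f "$sample"
```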
Source: The functional and performance differences of sed, awk and other Unix parsing utilities
UTILITY  OPERATION TYPE    EXECUTION TIME    CHARACTERS PROCESSED
                           (10 ITERATIONS)   PER SECOND
-------  ----------------  ----------------  --------------------
grep     search only       41 sec.           489.3 million
sed      search & replace  4 min. 4 sec.     82.1 million
awk      search & replace  4 min. 46 sec.    69.8 million
Python   search & replace  4 min. 50 sec.    69.0 million
PHP      search & replace  15 min. 44 sec.   21.2 million