How to combine find and grep for a complex search? (GNU/Linux, find, grep)
Try
find /srv/www/*/htdocs/system/application/ -name "*.php" -exec grep "debug (" {} \; -print
This should recursively search the folders under application for files with the .php extension and pass them to grep.
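To see what the -exec ... \; -print combination does, here is a small sketch in a scratch directory. The directory layout and file names are made up for illustration and only stand in for the real application tree. It relies on find running grep once per file and evaluating -print only when that grep exits successfully, i.e. when the file contains a match.

```shell
# Scratch tree (hypothetical names) standing in for the application directory.
demo=$(mktemp -d)
mkdir -p "$demo/app/sub"
printf 'debug ("x");\n' > "$demo/app/a.php"
printf 'echo "ok";\n'   > "$demo/app/sub/b.php"
printf 'debug ("y");\n' > "$demo/app/notes.txt"

# One grep process per .php file; -print fires only for files where grep
# found a match, so the matching line is followed by the file's path.
per_file=$(find "$demo/app" -name "*.php" -exec grep "debug (" {} \; -print)
printf '%s\n' "$per_file"

# With -q grep prints nothing itself, leaving just the list of matching files.
matching=$(find "$demo/app" -name "*.php" -exec grep -q "debug (" {} \; -print)
printf '%s\n' "$matching"
```

Only a.php is reported: b.php has no match, and notes.txt is filtered out by -name before grep ever sees it.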
An optimization on this would be to execute:
find /srv/www/*/htdocs/system/application/ -name "*.php" -print0 | xargs -0 grep -H "debug ("
This uses xargs to pass all the .php files output by find as arguments to a single grep command; e.g., grep "debug (" file1 file2 file3. (If the list is too long for one command line, xargs splits it across a few grep invocations, which is still far fewer than one per file.) The -print0 option of find and the -0 option of xargs ensure that spaces in file and directory names are handled correctly. The -H option passed to grep ensures that the filename is printed in all situations. (By default, grep prints the filename only when multiple arguments are passed in.)
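The effect of -print0 and -0 is easiest to see with a path that contains spaces. The following is a minimal sketch using a made-up directory and file name in a temporary location; it compares the NUL-delimited pipeline with a plain whitespace-delimited one.

```shell
# Scratch tree with spaces in both the directory and the file name
# (names are hypothetical, chosen only to trigger the problem).
demo=$(mktemp -d)
mkdir -p "$demo/my app"
printf 'debug ("x");\n' > "$demo/my app/front page.php"

# NUL-delimited handoff: the whole path reaches grep as one argument.
safe=$(cd "$demo" && find . -name "*.php" -print0 | xargs -0 grep -H "debug (")
printf 'safe:   [%s]\n' "$safe"

# Whitespace-delimited handoff: xargs splits the path into "./my",
# "app/front" and "page.php", none of which exist, so grep finds nothing.
unsafe=$(cd "$demo" && find . -name "*.php" -print | xargs grep -H "debug (" 2>/dev/null) || true
printf 'unsafe: [%s]\n' "$unsafe"
```

The second pipeline silently misses the file here only because the error messages are sent to /dev/null; without that redirection grep would complain about three nonexistent files.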
From man xargs:
-0 Input items are terminated by a null character instead of by whitespace, and the quotes and backslash are not special (every character is taken literally). Disables the end of file string, which is treated like any other argument. Useful when input items might contain white space, quote marks, or backslashes. The GNU find -print0 option produces input suitable for this mode.
find is not even needed for this example; one can use grep directly (at least with GNU grep):
grep -RH --include='*.php' "debug (" /srv/www/*/htdocs/system/application/
and we are down to a single process fork.
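As a quick check that --include restricts a recursive grep the same way -name "*.php" restricts find, here is a sketch on a throwaway tree. The file names are invented for the example, and it assumes a grep that supports -R and --include, as GNU grep does.

```shell
# Scratch tree (hypothetical names): one .php file and one non-.php file,
# both containing the search string.
demo=$(mktemp -d)
mkdir -p "$demo/app/sub"
printf 'debug ("x");\n' > "$demo/app/sub/a.php"
printf 'debug ("y");\n' > "$demo/app/sub/notes.txt"

# Recursive search limited to files whose base name matches *.php.
hits=$(grep -RH --include='*.php' "debug (" "$demo/app")
printf '%s\n' "$hits"
```

Only the line from a.php is printed, prefixed with its path; notes.txt is never opened.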
Options:
-R, --dereference-recursive Read all files under each directory, recursively. Follow all symbolic links, unlike -r.
-H, --with-filename Print the file name for each match. This is the default when there is more than one file to search.
--include=GLOB Search only files whose base name matches GLOB (using wildcard matching as described under --exclude).
--exclude=GLOB Skip any command-line file with a name suffix that matches the pattern GLOB, using wildcard matching; a name suffix is either the whole name, or any suffix starting after a / and before a +non-/. When searching recursively, skip any subfile whose base name matches GLOB; the base name is the part after the last /. A pattern can use *, ?, and [...] as wildcards, and \ to quote a wildcard or backslash character literally.
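The --exclude option works in the opposite direction. A small sketch, again with made-up file names in a temporary directory, that skips backup copies during a recursive search:

```shell
# Scratch tree (hypothetical names): a source file and a backup copy of it.
demo=$(mktemp -d)
printf 'debug ("x");\n' > "$demo/a.php"
printf 'debug ("y");\n' > "$demo/a.php.bak"

# -l lists only the names of matching files; *.bak files are skipped.
kept=$(grep -Rl --exclude='*.bak' "debug (" "$demo")
printf '%s\n' "$kept"
```

Quoting the glob keeps the shell from expanding it before grep sees it.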