locate vs find: usage, pros and cons of each other
locate(1) has only one big advantage over find(1): speed.

find(1), though, has many advantages over locate(1):

- find(1) is primordial, going back to the very first version of AT&T Unix. You will even find it in cut-down embedded Linuxes via Busybox. It is all but universal.

  locate(1) is much younger than find(1). The earliest ancestor of locate(1) didn't appear until 1983, and it wasn't widely available as "locate" until 1994, when it was adopted into GNU findutils and into 4.4BSD.

  locate(1) is also nonstandard, thus it is not installed by default everywhere. Some POSIX type OSes don't even offer it as an option, and where it is available, the implementation may be lacking features you want because there is no independent standard specifying the minimum feature set that must be available.

  There is a de facto standard, being BSD locate(1), but that is only because the other two main flavors of locate implement all of its options: -0, -c, -d, -i, -l, -m, -s, and -S. mlocate implements 6 additional options not in BSD locate: -b, -e, -P, -q, --regex and -w. GNU locate implements those six plus another four: -A, -D, -E, and -p. (I'm ignoring aliases and minor differences like -? vs -h vs --help.)

  - The BSDs and Mac OS X ship BSD locate.
  - Most Linuxes ship GNU locate, but Red Hat Linuxes and Arch ship mlocate instead. Debian doesn't install either in its base install, but offers both versions in its default package repositories; if both are installed at once, "locate" runs mlocate.
  - Oracle has been shipping mlocate in Solaris since 11.2, released in December 2014. Prior to that, locate was not installed by default on Solaris. (Presumably, this was done to reduce Solaris' command incompatibility with Oracle Linux, which is based on Red Hat Enterprise Linux, which also uses mlocate.)
  - IBM AIX still doesn't ship any version of locate, at least as of AIX 7.2, unless you install GNU findutils from the AIX Toolbox for Linux Applications.
  - HP-UX also appears to lack locate in the base system.
  - Older "real" Unixes generally did not include an implementation of locate.

- find(1) has a powerful expression syntax, with many functions, Boolean operators, etc.

- find(1) can select files by more than just name. It can select by:
  - age
  - size
  - owner
  - file type
  - timestamp
  - permissions
  - depth within the subtree...

  When finding files by name, you can search using file globbing syntax in all versions of find(1), or in GNU or BSD versions, using regular expressions.

  Current versions of locate(1) accept glob patterns as find does, but BSD locate doesn't do regexes at all. If you're like me and have to use a variety of machine types, you find yourself preferring grep filtering to developing a dependence on -r or --regex.

  locate needs strong filtering more than find does because...

- find(1) doesn't necessarily search the entire filesystem. You typically point it at a subdirectory, a parent containing all the files you want it to operate on. The typical behavior for a locate(1) implementation is to spew up all files matching your pattern, leaving it to grep filtering and such to cut its eruption down to size.

  (Evil tip: locate / will probably get you a list of all files on the system!)

  There are variants of locate(1) like slocate(1) which restrict output based on user permissions, but this is not the default version of locate in any major operating system.

- find(1) can do things to files it finds, in addition to just finding them. The most powerful and widely supported such operator is -exec, but there are others. In recent GNU and BSD find implementations, for example, you have the -delete and -execdir operators.

- find(1) runs in real time, so its output is always up to date.

  Because locate(1) relies on a database updated hours or days in the past, its output can be outdated. (This is the stale cache problem.) This coin has two sides:

  - locate can name files that no longer exist.

    GNU locate and mlocate have the -e flag to make it check for file existence before printing out the name of each file it discovered in the past, but this eats away some of the locate speed advantage, and isn't available in BSD locate besides.

  - locate will fail to name files that were created since the last database update.

  You learn to be somewhat distrustful of locate output, knowing it may be wrong.

  There are ways to solve this problem, but I am not aware of any implementation in widespread use. For example, there is rlocate, but it appears to not work against any modern Linux kernel.

- find(1) never has any more privilege than the user running it.

  Because locate provides a global service to all users on a system, it wants to have its updatedb process run as root so it can see the entire filesystem. This leads to a choice of security problems:

  - Run updatedb as root, but make its output file world-readable so locate can run without special privileges. This effectively exposes the names of all files in the system to all users. This may be enough of a security breach to cause a real problem.

    BSD locate is configured this way on Mac OS X and FreeBSD.

  - Write the database as readable only by root, and make locate setuid root so it can read the database. This means locate effectively has to reimplement the OS's permission system so it doesn't show you files you can't normally see. It also increases the attack surface of your system, specifically risking a root escalation attack.

  - Create a special "locate" user or group to own the database file, and mark the locate binary as setuid/setgid for that user/group so it can read the database. This doesn't prevent privilege escalation attacks by itself, but it greatly mitigates the damage one could cause.

    mlocate is configured this way on Red Hat Enterprise Linux.

    You still have a problem, though, because if you can use a debugger on locate or cause it to dump core you can get at privileged parts of the database.

  I don't see a way to create a truly "secure" locate command, short of running it separately for each user on the system, which negates much of its advantage over find(1).
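
To make the points above about pointing find at a subtree, combining tests, and -exec a bit more concrete, here is a minimal sketch. The scratch directory and every file name in it are made up for illustration, and I've stuck to options that POSIX find specifies, so it should behave the same on GNU, BSD and Busybox builds.

```shell
# Build a small scratch tree so the example is self-contained.
# (All directory and file names here are illustrative only.)
demo=$(mktemp -d)
mkdir -p "$demo/src/deep" "$demo/docs"
printf 'int main(void) { return 0; }\n' > "$demo/src/main.c"
printf 'int util;\n' > "$demo/src/deep/util.c"
printf 'int example;\n' > "$demo/docs/example.c"
: > "$demo/src/empty.c"

# Point find at just the subtree of interest and AND several tests together:
# regular files, named *.c, occupying more than zero 512-byte blocks
# (sizes are rounded up, so "+0" means "not empty").
nonempty_c=$(find "$demo/src" -type f -name '*.c' -size +0 -print | sort)

# Act on the matches instead of only listing them: "-exec ... {} +" hands
# the file names to cat in batches, and wc counts the combined lines.
c_lines=$(find "$demo/src" -type f -name '*.c' -exec cat {} + | wc -l | tr -d ' ')

printf '%s\n' "$nonempty_c"
printf 'lines of C under src: %s\n' "$c_lines"

rm -rf "$demo"
```

Note that docs/example.c never shows up, because find was only asked to walk src. Getting the same result from locate would mean filtering its whole-system output afterwards.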

Bottom line, both are very useful. locate(1) is better when you're just trying to find a particular file by name, which you know exists, but you just don't remember where it is exactly. find(1) is better when you have a focused area to examine, or when you need any of its many advantages.

locate uses a prebuilt database, which should be regularly updated, while find iterates over a filesystem to locate files.

Thus, locate is much faster than find, but it can be inaccurate if the database (which can be seen as a cache) is not up to date (see the updatedb command).
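
One portable way to cope with half of that staleness is to re-check locate's output yourself. The sketch below is only an illustration: filter_existing is a helper name I made up, not part of any locate package, and it assumes file names contain no newlines. It drops names that no longer exist, but it cannot conjure up files created after the last updatedb run.

```shell
# Hypothetical helper: read one path per line on stdin and print only the
# paths that still exist on disk. Assumes no newlines inside file names.
filter_existing() {
    while IFS= read -r path; do
        # -e follows symlinks; the extra -L test keeps dangling symlinks,
        # which exist as directory entries even if their target is gone.
        if [ -e "$path" ] || [ -L "$path" ]; then
            printf '%s\n' "$path"
        fi
    done
}

# Typical use with whichever locate is installed (pattern is illustrative):
#   locate notes.txt | filter_existing
```

Each surviving name costs an extra filesystem lookup, so this gives back some of the speed that made locate attractive in the first place.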

Also, find offers more granularity, as you can filter files by any of their attributes, while locate only matches a pattern against file names.
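
As a rough illustration of filtering on attributes rather than names, the following sketch selects files by age and by permission bits. The scratch directory, the file names and the 30-day cutoff are all arbitrary choices for the example; -mtime and -perm themselves are standard find primaries.

```shell
# Scratch directory with three files (names are illustrative only).
demo=$(mktemp -d)
printf 'old\n' > "$demo/old.log"
printf 'new\n' > "$demo/new.log"
printf '#!/bin/sh\n' > "$demo/run.sh"
chmod 644 "$demo/old.log" "$demo/new.log"
chmod 755 "$demo/run.sh"

# Backdate one file to 1 January 2020 so the age test has something to find.
touch -t 202001010000 "$demo/old.log"

# By age: regular files last modified more than 30 days ago.
stale=$(find "$demo" -type f -mtime +30 -print)

# By permissions: regular files with at least the owner-execute bit (0100).
executables=$(find "$demo" -type f -perm -100 -print)

printf 'stale: %s\n' "$stale"
printf 'executable: %s\n' "$executables"

rm -rf "$demo"
```

Neither query has a locate equivalent, since the locate database records names only.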

find is hard for a novice or occasional user of Unix to use successfully without careful perusal of the man page. Historically, some versions of find didn't even default to the -print action, adding to the user-hostility.
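
A small example of the kind of surprise that sends people back to the man page: once you write -print yourself, it only applies to the part of the expression it is attached to. This is a sketch with made-up file names in a scratch directory.

```shell
# Scratch directory with one file of each kind (names are illustrative).
demo=$(mktemp -d)
: > "$demo/a.txt"
: > "$demo/b.log"

# Spelling out -print is harmless on current finds and was required on
# some historical ones.
txt=$(find "$demo" -name '*.txt' -print)

# Gotcha: -o binds more loosely than the implied "and", so this parses as
#   -name '*.txt'  OR  ( -name '*.log' AND -print )
# and only the *.log match is printed.
ungrouped=$(find "$demo" -name '*.txt' -o -name '*.log' -print)

# Escaped parentheses group the alternatives so -print covers both.
grouped=$(find "$demo" \( -name '*.txt' -o -name '*.log' \) -print | sort)

printf 'ungrouped: %s\n' "$ungrouped"
printf 'grouped:\n%s\n' "$grouped"

rm -rf "$demo"
```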

locate is less flexible, but far more intuitive to use in the common case.