Search for a word in a String
Method one should be faster because it has less overhead. If the concern is performance when searching huge files, a specialized algorithm such as Boyer-Moore pattern matching could lead to further improvements.
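To give an idea of what such a specialized search looks like, here is a minimal sketch of the Boyer-Moore-Horspool variant, a simplified relative of Boyer-Moore that uses only a bad-character shift table. The class and method names are made up for illustration, and whether it actually beats indexOf on your data is something you would have to measure.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the JDK: Boyer-Moore-Horspool substring search.
final class HorspoolSearch {

    // Returns the index of the first occurrence of pattern in text, or -1.
    static int indexOf(String text, String pattern) {
        int n = text.length();
        int m = pattern.length();
        if (m == 0) return 0;
        if (m > n) return -1;

        // For every pattern character except the last one, remember how far
        // its last occurrence is from the end of the pattern.
        Map<Character, Integer> shift = new HashMap<>();
        for (int i = 0; i < m - 1; i++) {
            shift.put(pattern.charAt(i), m - 1 - i);
        }

        int pos = 0;
        while (pos <= n - m) {
            // Compare right to left within the current window.
            int j = m - 1;
            while (j >= 0 && text.charAt(pos + j) == pattern.charAt(j)) {
                j--;
            }
            if (j < 0) return pos;
            // Skip ahead based on the text character under the window's last position.
            pos += shift.getOrDefault(text.charAt(pos + m - 1), m);
        }
        return -1;
    }

    public static void main(String[] args) {
        String text = "here is a simple example";
        System.out.println(indexOf(text, "example") == text.indexOf("example"));
    }
}
```

The shift table lets the search jump over several characters at once when the last character of the window does not fit, which is where the potential gain on long texts comes from.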
If you are looking for a fixed string, not a pattern, as in the example in your question, indexOf will be better (simpler) and faster, since it does not need to use regular expressions.
Also, if the string you are searching for contains characters that have a special meaning in regular expressions, with indexOf you don't need to worry about escaping these characters.
In general, use indexOf where possible, and match for pattern matching, where indexOf cannot do what you need.
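As a small illustration of the escaping point, the sketch below searches for a literal that contains regex metacharacters. The class and method names are invented for the example; Pattern.quote is the standard way to make a regex treat a string literally, shown here only for comparison.

```java
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
final class FixedStringSearch {

    // Plain substring search: no regex involved, so nothing to escape.
    static boolean containsLiteral(String text, String needle) {
        return text.indexOf(needle) >= 0;
    }

    // Regex-based equivalent: the needle must be quoted to be taken literally.
    static boolean containsLiteralViaRegex(String text, String needle) {
        return Pattern.compile(Pattern.quote(needle)).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "Total: 3.50 (approx.)";
        System.out.println(containsLiteral(text, "(approx.)"));          // true
        System.out.println(containsLiteralViaRegex(text, "(approx.)"));  // true
        // Unquoted, the dot matches any character, so "3.50" also finds "3x50".
        System.out.println(Pattern.compile("3.50").matcher("3x50").find()); // true
        System.out.println(containsLiteral("3x50", "3.50"));             // false
    }
}
```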
If you don't care whether it's actually the entire word you're matching, then indexOf() will be a lot faster.
If, on the other hand, you need to be able to differentiate between are, harebrained, aren't etc., then you need a regex: \bare\b will only match are as an entire word (\\bare\\b in Java).
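A minimal sketch of the difference, using the words from above (class and method names are made up for the example):

```java
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
final class WholeWordSearch {

    // Compiled once; "\\bare\\b" in Java source is the regex \bare\b.
    private static final Pattern ARE = Pattern.compile("\\bare\\b");

    // Substring check: also true when "are" is part of a longer word.
    static boolean containsSubstring(String text) {
        return text.indexOf("are") >= 0;
    }

    // Whole-word check: true only when "are" stands on its own.
    static boolean containsWholeWord(String text) {
        return ARE.matcher(text).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"How are you?", "harebrained", "aren't"}) {
            System.out.println(s + " -> substring: " + containsSubstring(s)
                    + ", whole word: " + containsWholeWord(s));
        }
    }
}
```

All three strings contain are as a substring, but only the first one contains it as a whole word.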
\b is a word boundary anchor: it matches the zero-width position between a word character (letter, digit, or underscore) and a non-word character, or between a word character and the start or end of the string.
Caveat: This also means that if your search term isn't actually a word (let's say you're looking for ###), then these word boundary anchors will only match in a string like aaa###zzz, but not in +++###+++.
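The following sketch reproduces that caveat. It also shows one possible alternative for non-word search terms, namely requiring whitespace (or the string edge) around the term via lookarounds. That alternative is my own assumption about what "a whole token" might mean for you, and the names are invented for the example.

```java
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
final class NonWordTermSearch {

    // Word-boundary version: \b needs a word character on exactly one side.
    static boolean findWithWordBoundaries(String text, String term) {
        return Pattern.compile("\\b" + Pattern.quote(term) + "\\b")
                .matcher(text).find();
    }

    // Assumed alternative: the term must not touch a non-whitespace character.
    static boolean findAsWhitespaceDelimitedToken(String text, String term) {
        return Pattern.compile("(?<!\\S)" + Pattern.quote(term) + "(?!\\S)")
                .matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(findWithWordBoundaries("aaa###zzz", "###"));            // true
        System.out.println(findWithWordBoundaries("+++###+++", "###"));            // false
        System.out.println(findAsWhitespaceDelimitedToken("see ### here", "###")); // true
        System.out.println(findAsWhitespaceDelimitedToken("+++###+++", "###"));    // false
    }
}
```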
Further caveat: Java has by default a limited worldview on what constitutes an alphanumeric character. Only ASCII letters/digits (plus the underscore) count here, so word boundary anchors will fail on words like élève, relevé or ärgern. Read more about this (and how to solve this problem) here.
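One option I am aware of, which may or may not be what the linked page recommends, is to compile the pattern with Pattern.UNICODE_CHARACTER_CLASS (available since Java 7), which makes \b and \w use Unicode definitions of a word character. The sketch below only demonstrates the flagged behavior, because as far as I know the behavior of an unflagged \b on non-ASCII letters has not been the same across Java releases. Class and method names are made up, and the accented letters are written as \u escapes so the source file encoding does not matter.

```java
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
final class UnicodeWordSearch {

    // Whole-word search with Unicode-aware word boundaries.
    static boolean containsWholeWord(String text, String word) {
        Pattern p = Pattern.compile(
                "\\b" + Pattern.quote(word) + "\\b",
                Pattern.UNICODE_CHARACTER_CLASS);
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        String eleve = "\u00e9l\u00e8ve";   // élève
        String releve = "relev\u00e9";      // relevé

        System.out.println(containsWholeWord("un " + eleve + " studieux", eleve)); // true
        // Followed by another letter, so not a whole word.
        System.out.println(containsWholeWord(releve + "s", releve));               // false
        // Preceded by é, which counts as a word character under the flag.
        System.out.println(containsWholeWord("\u00e9are", "are"));                 // false
    }
}
```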