what is the fastest substring search method in Java

The accepted answer is not correct and not complete.

  • indexOf() does a naive string search using backtracking on mismatches. This is quite fast on small patterns/texts but shows very poor performance on large texts
  • contains("ja") should be comparable to indexOf (because it delegates to it)
  • matches("ja") will not deliver the correct result, because it searches for an exact match (only the string "ja" will match exactly)
  • Pattern p = Pattern.compile("ja"); Matcher m = p.matcher("jack"); m.find(); would be the correct way to find texts with regular expressions. In practice (using large texts) it will be the most efficient way using only the java api. This is because a constant pattern (like "ja") will not be processed by the regex engine (which is slow) but by an Boyer-Moore-Algorithm (which is fast)

As far as the three you asked about, a regular expression is going to be much slower because it requires putting together a full state machine when you have a much simpler target. For contains vs indexOf...

2114 public boolean contains(CharSequence s) {
2115     return indexOf(s.toString()) > -1;
2116 }

(i.e., contains just calls indexOf, but you might incur an extra String creation on each invocation. This is just one implementation of contains, but since the contract of contains is a simplification of indexOf, this is probably how every implementation will work.)


String[] names = new String[]{"jack", "jackson", "jason", "dijafu"};
long start = 0;
long stop = 0;

//Contains
start = System.nanoTime();
for (int i = 0; i < names.length; i++){
    names[i].contains("ja");
}
stop = System.nanoTime();
System.out.println("Contains: " + (stop-start));

//IndexOf
start = System.nanoTime();
for (int i = 0; i < names.length; i++){
    names[i].indexOf("ja");
}
stop = System.nanoTime();
System.out.println("IndexOf: " + (stop-start));

//Matches
start = System.nanoTime();
for (int i = 0; i < names.length; i++){
    names[i].matches("ja");
}
stop = System.nanoTime();
System.out.println("Matches: " + (stop-start));

Output:

Contains: 16677
IndexOf: 4491
Matches: 864018