Efficient data structure that checks for existence of String

It depends on many factors, including the number of strings you have to feed into that data structure (do you know the number by advance, or have a basic idea?), and what you expect the hit/miss ratio to be.

A very efficient data structure to use is a trie or a radix tree; they are basically made for that. For an explanation of how they work, see the wikipedia entry (a followup to the radix tree definition is in this page). There are Java implementations (one of them is here; however I have a fixed set of strings to inject, which is why I use a builder).

If your number of strings is really huge and you don't expect a minimal miss ratio then you might also consider using a bloom filter; the problem however is that it is probabilistic; but you can get very quick answers to "not there". Here also, there are implementations in Java (Guava has an implementation for instance).

Otherwise, well, a HashSet...


If I were to use an ArrayList I believe checking for the existence of some specified string would iterate through all items until a matching string is found

Correct, checking a list for an item is linear in the number of entries of the list.

However, I am not keen on filling a HashMap where the value is completely arbitrary

You don't have to: Java provides a HashSet<T> class, which is very much like a HashMap without the value part.

You can put all your strings there, and then check for presence or absence of other strings in constant time;

Set<String> knownStrings = new HashSet<String>();
... // Fill the set with strings

if (knownString.contains(myString)) {
    ...
}

A HashSet is probably the right answer, but if you choose (for simplicity, eg) to search a list it's probably more efficient to concatenate your words into a String with separators:

String wordList = "$word1$word2$word3$word4$...";

Then create a search argument with your word between the separators:

String searchArg = "$" + searchWord + "$";

Then search with, say, contains:

bool wordFound = wordList.contains(searchArg);

You can maybe make this a tiny bit more efficient by using StringBuilder to build the searchArg.