What is an efficient way to replace many characters in a string?
My implementation is based on a lookup table.
public static String convertWord(String str) {
    char[] words = str.toCharArray();
    // Characters to look for. The last three ('-', '.', '/') have no entry
    // in replace[], so they are dropped from the output.
    char[] find = {'á','é','ú','ý','ð','ó','ö','æ','þ','-','.','/'};
    String[] replace = {"a","e","u","y","d","o","o","ae","th"};
    StringBuilder out = new StringBuilder(str.length());
    for (int i = 0; i < words.length; i++) {
        boolean matchFailed = true;
        for (int w = 0; w < find.length; w++) {
            if (words[i] == find[w]) {
                if (w < replace.length) {
                    out.append(replace[w]);
                }
                matchFailed = false;
                break;
            }
        }
        // No match: copy the character through unchanged.
        if (matchFailed) out.append(words[i]);
    }
    return out.toString();
}
My suggestion would be:
- Convert the String to a char[] array
- Run through the array, testing each character one by one (e.g. with a switch statement) and replacing it if needed
- Convert the char[] array back to a String
I think this is probably about as fast as you will get in pure Java.
EDIT: I notice that some of your replacements change the length of the string. The same principle applies, but you need to keep two arrays and increment a source index and a destination index separately. You might also need to resize the destination array if you run out of space (i.e. allocate a larger array and arraycopy the existing destination array into it).
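The two-index approach described above can be sketched roughly as follows. This is only an illustration under a few assumptions: the class name CharReplacer and the helper replacementFor are made up for this example, the mappings are copied from the question, and the destination array is grown with Arrays.copyOf (which allocates a larger array and copies the old contents) rather than a hand-written arraycopy. Unicode escapes are used for the accented characters so the source compiles regardless of file encoding.

```java
import java.util.Arrays;

class CharReplacer {

    // Hypothetical helper: returns the replacement for c, an empty string
    // if c should be removed, or null if c should be kept unchanged.
    static String replacementFor(char c) {
        switch (c) {
            case '\u00e1': return "a";   // a-acute
            case '\u00e9': return "e";   // e-acute
            case '\u00fa': return "u";   // u-acute
            case '\u00fd': return "y";   // y-acute
            case '\u00f0': return "d";   // eth
            case '\u00f3': return "o";   // o-acute
            case '\u00f6': return "o";   // o-umlaut
            case '\u00e6': return "ae";  // ae ligature
            case '\u00fe': return "th";  // thorn
            case '-':
            case '.':
            case '/':      return "";    // removed from the output
            default:       return null;  // keep as-is
        }
    }

    static String convertWord(String str) {
        char[] src = str.toCharArray();
        char[] dst = new char[src.length];
        int d = 0; // destination index, advanced independently of s

        for (int s = 0; s < src.length; s++) {
            String r = replacementFor(src[s]);
            int needed = (r == null) ? 1 : r.length();

            // Grow the destination array if the next write would not fit.
            if (d + needed > dst.length) {
                dst = Arrays.copyOf(dst, Math.max(dst.length * 2, d + needed));
            }

            if (r == null) {
                dst[d++] = src[s];
            } else {
                for (int k = 0; k < r.length(); k++) {
                    dst[d++] = r.charAt(k);
                }
            }
        }
        // Only the first d chars of dst hold output.
        return new String(dst, 0, d);
    }
}
```

With these mappings, CharReplacer.convertWord("þórr") gives "thorr". Whether this actually beats the StringBuilder version is something to measure rather than assume, since StringBuilder does much the same array management internally.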
You could create a String[] lookup table with Character.MAX_VALUE + 1 entries, one per possible char value, which also includes the mapping to lower case.
Each character then costs a single array lookup, so the time per character stays the same as more replacements are added.
private static final String[] REPLACEMENT = new String[Character.MAX_VALUE + 1];
static {
    for (int i = Character.MIN_VALUE; i <= Character.MAX_VALUE; i++) {
        REPLACEMENT[i] = Character.toString(Character.toLowerCase((char) i));
    }
    // substitute
    REPLACEMENT['á'] = "a";
    // remove
    REPLACEMENT['-'] = "";
    // expand
    REPLACEMENT['æ'] = "ae";
}

public String convertWord(String word) {
    StringBuilder sb = new StringBuilder(word.length());
    for (int i = 0; i < word.length(); i++) {
        sb.append(REPLACEMENT[word.charAt(i)]);
    }
    return sb.toString();
}