Ordering an array with special characters like accents
If you just want to sort the strings as if they didn't have the accents, you could use the following:
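Collections.sort(strs, new Comparator<String>() {
    @Override
    public int compare(String o1, String o2) {
        // Decompose accented characters (NFD), then drop the combining marks,
        // so that "é" compares as plain "e".
        o1 = Normalizer.normalize(o1, Normalizer.Form.NFD).replaceAll("\\p{M}", "");
        o2 = Normalizer.normalize(o2, Normalizer.Form.NFD).replaceAll("\\p{M}", "");
        return o1.compareTo(o2);
    }
});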
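For reference, here is a self-contained sketch of the same idea that you can compile and run. The class and method names (AccentInsensitiveSort, stripAccents, sorted) are made up for this illustration. It assumes that removing every combining mark (`\p{M}`) after NFD decomposition is acceptable for your data, which may not hold for scripts where marks change the letter itself.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper class, named only for this example.
class AccentInsensitiveSort {

    // Decompose to NFD, then remove combining marks (Unicode category M).
    static String stripAccents(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Returns a sorted copy; the original strings (with accents) are kept.
    static List<String> sorted(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        Collections.sort(copy, new Comparator<String>() {
            @Override
            public int compare(String o1, String o2) {
                return stripAccents(o1).compareTo(stripAccents(o2));
            }
        });
        return copy;
    }

    public static void main(String[] args) {
        // "\u00e9clair" is "éclair", written with an escape to avoid encoding issues.
        List<String> strs = Arrays.asList("zebra", "\u00e9clair", "apple", "eagle");
        System.out.println(sorted(strs)); // prints [apple, eagle, éclair, zebra]
    }
}
```

With a plain `compareTo` and no normalization, "éclair" would land after "zebra", because U+00E9 has a higher code point than "z".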
Related question:
- Remove diacritical marks (ń ǹ ň ñ ṅ ņ ṇ ṋ ṉ ̈ ɲ ƞ ᶇ ɳ ȵ) from Unicode chars
For more sophisticated use cases you will want to read up on java.text.Collator. Here's an example:
// Create the collator once instead of on every comparison.
final Collator usCollator = Collator.getInstance(Locale.US);
Collections.sort(strs, new Comparator<String>() {
    @Override
    public int compare(String o1, String o2) {
        return usCollator.compare(o1, o2);
    }
});
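If you want the collator to treat accented and unaccented letters as equal, you can lower its strength. The following is a minimal sketch, not the only way to do it. The class and method names (CollatorStrengthDemo, accentInsensitive, sorted) are invented for the example, and enabling canonical decomposition is an extra precaution I am assuming you want so that precomposed and decomposed forms are handled alike.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, named only for this example.
class CollatorStrengthDemo {

    // Builds a collator that only looks at primary differences (base letters),
    // so accents and case are ignored when comparing.
    static Collator accentInsensitive() {
        Collator collator = Collator.getInstance(Locale.US);
        collator.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
        collator.setStrength(Collator.PRIMARY);
        return collator;
    }

    // Collator implements Comparator<Object>, so it can be passed to sort directly.
    static List<String> sorted(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        Collections.sort(copy, accentInsensitive());
        return copy;
    }

    public static void main(String[] args) {
        // "\u00e9clair" is "éclair", written with an escape to avoid encoding issues.
        List<String> strs = Arrays.asList("zebra", "\u00e9clair", "apple", "eagle");
        System.out.println(sorted(strs));

        // At PRIMARY strength these two compare as equal (result 0).
        System.out.println(accentInsensitive().compare("resume", "r\u00e9sum\u00e9"));
    }
}
```

Keep in mind that PRIMARY strength also ignores case. If you still need "a" and "A" to be distinguished, the default strength already orders accented words sensibly without making them equal.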
If none of the predefined collation rules meet your needs, you can try using the java.text.RuleBasedCollator.
You should take a look at RuleBasedCollator:
The RuleBasedCollator class is a concrete subclass of Collator that provides a simple, data-driven, table collator. With this class you can create a customized table-based Collator. RuleBasedCollator maps characters to sort keys.
RuleBasedCollator has the following restrictions for efficiency (other subclasses may be used for more complex languages):
1. If a special collation rule controlled by a <modifier> is specified, it applies to the whole collator object.
2. All non-mentioned characters are at the end of the collation order.
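To give an idea of what a custom rule table looks like, here is a small sketch. The rule string "< c < b < a" and the names CustomRuleDemo and sorted are made up for the illustration; a real rule set for a natural language would be much longer. It deliberately reverses the first three letters so the effect is easy to see.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class, named only for this example.
class CustomRuleDemo {

    // Illustrative rules: "c" sorts first, then "b", then "a".
    // Characters that the rules do not mention go to the end of the order.
    static final String RULES = "< c < b < a";

    static List<String> sorted(List<String> input) {
        RuleBasedCollator collator;
        try {
            collator = new RuleBasedCollator(RULES);
        } catch (ParseException e) {
            // The constructor rejects malformed rule strings.
            throw new IllegalArgumentException("Invalid collation rules", e);
        }
        List<String> copy = new ArrayList<>(input);
        Collections.sort(copy, collator);
        return copy;
    }

    public static void main(String[] args) {
        // "d" is not mentioned in RULES, so it should come after c, b and a.
        System.out.println(sorted(Arrays.asList("a", "b", "c", "d")));
    }
}
```

In practice you would more likely start from an existing collator's rules (RuleBasedCollator.getRules()) and append your own, rather than write the whole table by hand.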