Remove duplicate values from a string in java

Create array of string by spliting by - and then create a hashSet from it.

String s="Bangalore-Chennai-NewYork-Bangalore-Chennai"; 
String[] strArr = s.split("-");
Set<String> set = new HashSet<String>(Arrays.asList(strArr));

If you want back it as string array then do following:

String[] result = new String[set.size()];
set.toArray(result);

Here is a sample code to do this:

String s="Bangalore-Chennai-NewYork-Bangalore-Chennai"; 
String[] strArr = s.split("-");
Set<String> set = new LinkedHashSet<String>(Arrays.asList(strArr));
String[] result = new String[set.size()];
set.toArray(result);
StringBuilder res = new StringBuilder();
for (int i = 0; i < result.length; i++) {
    String string = result[i];
    if(i==result.length-1)
        res.append(string);
    else
        res.append(string).append("-");
}
System.out.println(res.toString());

Output:-

Bangalore-Chennai-NewYork

public static String removeDuplicates(String txt, String splitterRegex)
{
    List<String> values = new ArrayList<String>();
    String[] splitted = txt.split(splitterRegex);
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < splitted.length; ++i)
    {
        if (!values.contains(splitted[i]))
        {
            values.add(splitted[i]);
            sb.append('-');
            sb.append(splitted[i]);
        }
    }
    return sb.substring(1);

}

Usage:

String s = "Bangalore-Chennai-NewYork-Bangalore-Chennai";
s = removeDuplicates(s, "\\-");
System.out.println(s);

Prints:

Bangalore-Chennai-NewYork

You could add your strings to a HashSet.

  1. Split the strings on a "-".
  2. Store the individual words in Array. i.e arr[]

Sinppet :

Set<String> set = new HashSet<String>();

    for(int i=0; i < arr.length; i++){
      if(set.contains(arr[i])){
        System.out.println("Duplicate string found at index " + i);
      } else {
        set.add(arr[i]);
      }

This does it in one line:

public String deDup(String s) {
    return new LinkedHashSet<String>(Arrays.asList(s.split("-"))).toString().replaceAll("(^\\[|\\]$)", "").replace(", ", "-");
}

public static void main(String[] args) {
    System.out.println(deDup("Bangalore-Chennai-NewYork-Bangalore-Chennai"));
}

Output:

Bangalore-Chennai-NewYork

Notice that the order is preserved :)

Key points are:

  • split("-") gives us the different values as an array
  • Arrays.asList() turns the array into a List
  • LinkedHashSet preserves uniqueness and insertion order - it does all the work of giving us the unique values, which are passed via the constructor
  • the toString() of a List is [element1, element2, ...]
  • the final replace commands remove the "punctuation" from the toString()

This solution requires the values to not contain the character sequence ", " - a reasonable requirement for such terse code.

Java 8 Update!

Of course it's 1 line:

public String deDup(String s) {
    return Arrays.stream(s.split("-")).distinct().collect(Collectors.joining("-"));
}

Regex update!

If you don't care about preserving order (ie it's OK to delete the first occurrence of a duplicate):

public String deDup(String s) {
    return s.replaceAll("(\\b\\w+\\b)-(?=.*\\b\\1\\b)", "");
}