Java, find intersection of two arrays
The simplest solution would be to use sets, as long as you don't care that the elements in the result will have a different order, and that duplicates will be removed. The input arrays array1 and array2 are the Integer[] subarrays of the given int[] arrays corresponding to the number of elements that you intend to process:
Set<Integer> s1 = new HashSet<Integer>(Arrays.asList(array1));
Set<Integer> s2 = new HashSet<Integer>(Arrays.asList(array2));
s1.retainAll(s2);
Integer[] result = s1.toArray(new Integer[s1.size()]);
The above will return an Integer[]; if needed, it's simple to copy and convert its contents into an int[].
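The conversion between int[] and Integer[] mentioned above could look roughly like this. It is a minimal sketch assuming Java 8 or later; the class name SetIntersectionSketch and the method name intersect are illustrative and not part of the original answer.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SetIntersectionSketch {
    // Hypothetical helper: boxes both arrays, intersects them as sets, unboxes the result.
    static int[] intersect(int[] a, int[] b) {
        Integer[] array1 = Arrays.stream(a).boxed().toArray(Integer[]::new);
        Integer[] array2 = Arrays.stream(b).boxed().toArray(Integer[]::new);

        Set<Integer> s1 = new HashSet<>(Arrays.asList(array1));
        Set<Integer> s2 = new HashSet<>(Arrays.asList(array2));
        s1.retainAll(s2);

        // Unbox back to int[]; the order of the result is unspecified.
        return s1.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        int[] result = intersect(new int[] {1, 2, 2, 3}, new int[] {2, 3, 4});
        Arrays.sort(result); // sorted only to get a stable order for printing
        System.out.println(Arrays.toString(result)); // prints [2, 3]
    }
}
```

The sort in main is there only because a HashSet makes no promise about iteration order.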
To find the intersection when the arrays contain duplicate elements (keeping the duplicates), sort both arrays and walk them with two indices:
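int[] arr1 = {1, 2, 2, 2, 2, 2, 2, 3, 6, 6, 6, 6, 6, 6};
int[] arr2 = {7, 5, 3, 6, 6, 2, 2, 3, 6, 6, 6, 6, 6, 6, 6, 6};
Arrays.sort(arr1);
Arrays.sort(arr2);
// Typed list instead of a raw ArrayList
List<Integer> result = new ArrayList<>();
int i = 0;
int j = 0;
while (i < arr1.length && j < arr2.length) {
    if (arr1[i] > arr2[j]) {
        j++;            // arr2 is behind, advance it
    } else if (arr1[i] < arr2[j]) {
        i++;            // arr1 is behind, advance it
    } else {
        result.add(arr1[i]);  // common element, consume it from both arrays
        i++;
        j++;
    }
}
System.out.println(result); // prints [2, 2, 3, 6, 6, 6, 6, 6, 6]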
If you are fine with Java 8, then the simplest solution I can think of is to use streams and filter. An implementation is as follows:
public static int[] intersection(int[] a, int[] b) {
    return Arrays.stream(a)
                 .distinct()
                 .filter(x -> Arrays.stream(b).anyMatch(y -> y == x))
                 .toArray();
}
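The filter above scans b once for every distinct element of a. For larger inputs, one possible variation is to collect b into a HashSet first, so that each lookup is a hash probe instead of a scan. This is only a sketch, assuming Java 8 or later; the class name StreamIntersectionSketch is hypothetical.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

class StreamIntersectionSketch {
    // Variation of the stream approach: look elements up in a set built from b.
    static int[] intersection(int[] a, int[] b) {
        Set<Integer> inB = Arrays.stream(b).boxed().collect(Collectors.toSet());
        return Arrays.stream(a)
                .distinct()                   // drop duplicates, keep first-seen order of a
                .filter(x -> inB.contains(x)) // x is boxed for the lookup
                .toArray();
    }
}
```

As with the original version, duplicates are removed and the result follows the order in which elements first appear in a.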
General test
The answers provide several solutions, so I decided to measure which one is the fastest.
Solutions
- HashSet based, by Óscar López
- Stream based, by Bilesh Ganguly
- Foreach based, by Ruchira Gayan Ranaweera
- HashMap based, by ikarayel
What we have
- Two String arrays that have 50% of their elements in common.
- Every element in each array is unique, so there are no duplicates.
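The original post does not show how the input arrays were built. One way to produce data matching the description above might be the following sketch; the class name TestDataSketch, the method generate, and the "s" + index naming scheme are all assumptions made for illustration.

```java
class TestDataSketch {
    // Builds two arrays of `size` unique strings each.
    // arr1 holds s0 .. s(size-1); arr2 is shifted by size/2, so half of the elements overlap.
    static String[][] generate(int size) {
        String[] arr1 = new String[size];
        String[] arr2 = new String[size];
        int half = size / 2;
        for (int i = 0; i < size; i++) {
            arr1[i] = "s" + i;
            arr2[i] = "s" + (i + half);
        }
        return new String[][] {arr1, arr2};
    }
}
```

For example, generate(20) returns two 20-element arrays that share the ten strings s10 to s19.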
Testing code
public static void startTest(String name, Runnable test) {
    long start = System.nanoTime();
    test.run();
    long end = System.nanoTime();
    // Convert nanoseconds to milliseconds
    System.out.println(name + ": " + (end - start) / 1_000_000.0 + " ms");
}
It is used like this:
startTest("HashMap", () -> intersectHashMap(arr1, arr2));
startTest("HashSet", () -> intersectHashSet(arr1, arr2));
startTest("Foreach", () -> intersectForeach(arr1, arr2));
startTest("Stream ", () -> intersectStream(arr1, arr2));
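Each method is timed once here, so a measurement may include one-off costs such as class loading and JIT compilation. A possible variation, shown only as a sketch, runs the task a few times untimed and then averages several timed runs. The class WarmTimerSketch, the method timeMs, and its parameters are hypothetical and were not used to produce the numbers in this post.

```java
class WarmTimerSketch {
    // Hypothetical variant of startTest: untimed warm-up runs, then an average over timed runs.
    static double timeMs(Runnable test, int warmupRuns, int timedRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            test.run(); // result discarded, gives the JVM a chance to load and compile the code
        }
        long start = System.nanoTime();
        for (int i = 0; i < timedRuns; i++) {
            test.run();
        }
        long end = System.nanoTime();
        // Average time per run in milliseconds
        return (end - start) / 1_000_000.0 / timedRuns;
    }
}
```

It could be called as timeMs(() -> intersectHashMap(arr1, arr2), 5, 20), for example. For serious measurements a dedicated benchmarking harness is usually a better fit.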
Solutions code:
HashSet
public static String[] intersectHashSet(String[] arr1, String[] arr2){
    HashSet<String> set = new HashSet<>(Arrays.asList(arr1));
    set.retainAll(Arrays.asList(arr2));
    return set.toArray(new String[0]);
}
Stream
public static String[] intersectStream(String[] arr1, String[] arr2){
    return Arrays.stream(arr1)
                 .distinct()
                 .filter(x -> Arrays.asList(arr2).contains(x))
                 .toArray(String[]::new);
}
Foreach
public static String[] intersectForeach(String[] arr1, String[] arr2){
    ArrayList<String> result = new ArrayList<>();
    for (int i = 0; i < arr1.length; i++) {
        for (int r = 0; r < arr2.length; r++) {
            if (arr1[i].equals(arr2[r])) {
                result.add(arr1[i]);
            }
        }
    }
    return result.toArray(new String[0]);
}
HashMap
public static String[] intersectHashMap(String[] arr1, String[] arr2){
    HashMap<String, Integer> map = new HashMap<>();
    for (int i = 0; i < arr1.length; i++) {
        map.put(arr1[i], 1);
    }
    ArrayList<String> result = new ArrayList<>();
    for (int i = 0; i < arr2.length; i++) {
        if (map.containsKey(arr2[i])) {
            result.add(arr2[i]);
        }
    }
    return result.toArray(new String[0]);
}
Testing process
Let's see what happens if we give the methods arrays of 20 elements:
HashMap: 0.105 ms
HashSet: 0.2185 ms
Foreach: 0.041 ms
Stream : 7.3629 ms
As we can see, the Foreach method does the best job, while the Stream method is almost 180 times slower.
Let's continue the test with 500 elements:
HashMap: 0.7147 ms
HashSet: 4.882 ms
Foreach: 7.8314 ms
Stream : 10.6681 ms
In this case, the results change dramatically: the HashMap method is now the most efficient.
Next, a test with 10 000 elements:
HashMap: 4.875 ms
HashSet: 316.2864 ms
Foreach: 505.6547 ms
Stream : 292.6572 ms
The HashMap method is still the fastest, and the Foreach method has become quite slow.
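One likely reason the HashSet variant falls so far behind HashMap at this size is that retainAll is given a List: for every element of the set it calls contains on that list, which is a linear scan. A variation that wraps arr2 in a HashSet as well might look like the sketch below. It is not one of the benchmarked methods, the name intersectHashSetBoth is hypothetical, and no timing is claimed for it.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class HashSetBothSketch {
    // Variation of intersectHashSet: both sides are hash sets, so each contains() is a hash lookup.
    static String[] intersectHashSetBoth(String[] arr1, String[] arr2) {
        Set<String> set = new HashSet<>(Arrays.asList(arr1));
        set.retainAll(new HashSet<>(Arrays.asList(arr2)));
        return set.toArray(new String[0]); // order of the result is unspecified
    }
}
```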
Results
If there are < 50 elements, then it is best to use the Foreach method. It is clearly the fastest in this category.
In this case, the ranking looks like this:
Foreach
HashMap
HashSet
Stream (better not to use in this case)
But if you need to process large arrays, then the best option is the HashMap based method.
So the ranking looks like this:
HashMap
HashSet
Stream
Foreach