Remove duplicates from array in linear time and without extra arrays

If the integers are limited 0 to n, you can move through the array, placing numbers by their indices. Every time you replace a number, take the value that used to be there and move it to where it should be. For instance, let's say we have an array of size 8:

-----------------
|3|6|3|4|5|1|7|7|
-----------------
 S

Where S is our starting point, and we'll use C to keep track of our "current" index below. We start with index 0, and move 3 to the 3 index spot, where 4 is. Save 4 in a temp var.

-----------------
|X|6|3|3|5|1|7|7|   Saved 4 
-----------------  
 S     C

We then put 4 in the index 4, saving what used to be there, 5.

-----------------
|X|6|3|3|4|1|7|7|   Saved 5
-----------------
 S       C

Keep going

-----------------
|X|6|3|3|4|5|7|7|   Saved 1
-----------------
 S         C

-----------------
|X|1|3|3|4|5|7|7|   Saved 6
-----------------
 S C

-----------------
|X|1|3|3|4|5|6|7|   Saved 7    
-----------------
 S           C 

When we try to replace 7, we see a conflict, so we simply don't place it. We then continue from the starting index S, increment it by 1:

-----------------
|X|1|3|3|4|5|6|7| 
-----------------  
   S           

1 is fine here, 3 needs to move

-----------------
|X|1|X|3|4|5|6|7|
-----------------
     S

But 3 is a duplicate, so we throw it away and keep iterating through the rest of the array.

So basically, we move each entry at most 1 time, and iterate through the entire array. That's O(2n) = O(n)


Assume int a[n] is an array of integers in the range [0,n-1]. Note that this differs slightly from the stated problem, but I make this assumption to make clear how the algorithm works. The algorithm can be patched up to work for integers in the range [0,n].

for (int i=0; i<n; i++)
{
    if (a[i] != i)
    {
         j = a[i];
         k = a[j];
         a[j] = j;  // Swap a[j] and a[i]
         a[i] = k;
     }
 }

 for (int i=0; i<n; i++)
 {
     if (a[i] == i)
     {
        printf("%d\n", i);
     }
 }

    void printRepeating(int arr[], int size)
{
  int i;
  printf("The repeating elements are: \n");
  for(i = 0; i < size; i++)
  {
    if(arr[abs(arr[i])] >= 0)
      arr[abs(arr[i])] = -arr[abs(arr[i])];
    else
      printf(" %d ", abs(arr[i]));
  }
}

Tags:

Arrays