Time complexity for merging two sorted arrays of size n and m
Just going through this problem myself, and the Big O comes out to just O(n), not O(m+n). (The two describe the same bound here, since n is the larger size, but O(n) is the simpler way to write it.)
Here's why in pseudocode: (NOTE: I wrote this code to handle the cases of m > n or m == n)
Merging sorted arrays A and B into C:

let ints i = 0, j = 0, k = 0
while (i < A.length && j < B.length) {
    if (A[i] < B[j]) {
        C[k] = A[i]
        i++
    } else {
        C[k] = B[j]
        j++
    }
    k++
}
// Copy the rest of A into C if B was exhausted first
while (i < A.length) {
    C[k] = A[i]
    k++
    i++
}
// Copy the rest of B into C if A was exhausted first
while (j < B.length) {
    C[k] = B[j]
    k++
    j++
}
return C
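For anyone who wants to run it, here's a direct Python translation of the pseudocode above (the function name merge is mine):

```python
def merge(a, b):
    """Merge two sorted lists into one sorted list in O(len(a) + len(b)) time."""
    c = []
    i = j = 0
    # Interleave until one input is exhausted
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            c.append(a[i])
            i += 1
        else:
            c.append(b[j])
            j += 1
    # Copy whichever tail remains (at most one of these is non-empty)
    c.extend(a[i:])
    c.extend(b[j:])
    return c

print(merge([2, 3, 4, 5, 6], [7, 8, 9]))  # [2, 3, 4, 5, 6, 7, 8, 9]
```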
Now, all we know is that the array of length n is at least as large as the array of length m. That means every element in the length-m array could still be greater than every element in the length-n array, even though n > m (such as A = [2, 3, 4, 5, 6] and B = [7, 8, 9]); we don't know, and it doesn't matter. The first loop iterates until i or j reaches the length of its respective array, and then exactly one of the next two while loops runs until the rest of A or B has been added to C.
So you can see how you end up with O(m+n). However, since n ≥ m regardless of the makeup of the elements inside the arrays, we have m + n ≤ 2n. Big O drops constant factors, so O(m+n) = O(2n) = O(n). In other words, O(m+n) isn't wrong; it's just that once you know n is the larger size, O(n) is the tighter, simpler way to express the same bound.
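If you want to convince yourself that the total work really is m + n iterations, here's a small Python sketch (the function name and the steps counter are mine) that counts every iteration across all three loops; every element is copied into c exactly once, so the count always equals len(a) + len(b):

```python
def merge_counting(a, b):
    """Merge two sorted lists, counting one step per element copied into c."""
    c, i, j, steps = [], 0, 0, 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            c.append(a[i])
            i += 1
        else:
            c.append(b[j])
            j += 1
        steps += 1
    # Only one of these tail loops actually runs
    while i < len(a):
        c.append(a[i])
        i += 1
        steps += 1
    while j < len(b):
        c.append(b[j])
        j += 1
        steps += 1
    return c, steps

c, steps = merge_counting([2, 3, 4, 5, 6], [7, 8, 9])
print(steps)  # 8, i.e. len(a) + len(b)
```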
Obviously I'm several years late to the party, lol, but I hope that clears the air for anyone else who comes across this in the future :)
The time to merge two sorted lists is definitely not O(m log n). It is O(n+m).
The code looks something like this:
allocate c with length n + m
i = 1
j = 1
while i <= n or j <= m
    if i > n
        copy rest of b into c
        break
    if j > m
        copy rest of a into c
        break
    if a[i] < b[j]
        copy a[i] into c
        i = i + 1
    else  # b[j] <= a[i]
        copy b[j] into c
        j = j + 1
Now, if we don't have enough memory to allocate c, this can be modified to remain O(n+m) time on hardware that supports block operations (most RAM and hard disks, for instance, allow them): when we need to insert a single item into the middle of the array, it is a single operation to move the tail end of the block over by one to make room. If you were on hardware that didn't allow this, then each insert would cost O(n), giving O(n + mn) complexity overall. Since we only need to insert elements of the smaller array into the larger one, we never shift pieces of the larger array when its element is already in the right place. Hence the n term stays the same and only the cost attributable to m increases. The worst case for this is when all of array b (length m) belongs in front of array a (length n).
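As a sketch of this insert-into-the-larger-array approach (the function name is mine), Python's bisect.insort does exactly this: a binary search to find the spot, then a tail-shifting insert, so without constant-time block moves each of the m inserts costs O(n) shifting in the worst case:

```python
import bisect

def merge_by_insertion(a, b):
    """Insert each element of the smaller sorted list b into the larger
    sorted list a. Each insort shifts the tail of a, so this is O(m*n)
    shifting in the worst case, on top of the O(m log n) binary searches."""
    a = list(a)  # work on a copy so the caller's list is untouched
    for x in b:
        bisect.insort(a, x)
    return a
```

The worst case described above is easy to see here: if every element of b sorts before every element of a, each insert shifts all n elements of a.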
Edit: changed the last if to an else, turned the original if logic into a comment, and added an equals sign to cover that case. Pseudocode not intended to be optimized.