Efficient way to find missing elements in an integer sequence
missingItems = [x for x in complete_list if not x in L]
Assuming that L is a list of integers with no duplicates, you can infer that the part of the list between start and index is completely consecutive if and only if L[index] == L[start] + (index - start)
and similarly with index and end is completely consecutive if and only if L[index] == L[end] - (end - index)
. This combined with splitting the list into two recursively gives a sublinear solution.
# python 3.3 and up, in older versions, replace "yield from" with yield loop
def missing_elements(L, start, end):
if end - start <= 1:
if L[end] - L[start] > 1:
yield from range(L[start] + 1, L[end])
return
index = start + (end - start) // 2
# is the lower half consecutive?
consecutive_low = L[index] == L[start] + (index - start)
if not consecutive_low:
yield from missing_elements(L, start, index)
# is the upper part consecutive?
consecutive_high = L[index] == L[end] - (end - index)
if not consecutive_high:
yield from missing_elements(L, index, end)
def main():
L = [10,11,13,14,15,16,17,18,20]
print(list(missing_elements(L,0,len(L)-1)))
L = range(10, 21)
print(list(missing_elements(L,0,len(L)-1)))
main()
a=[1,2,3,7,5,11,20]
b=[]
def miss(a,b):
for x in range (a[0],a[-1]):
if x not in a:
b.append(x)
return b
print (miss(a,b))
ANS:[4, 6, 8, 9, 10, 12, 13, 14, 15, 16, 17, 18, 19]
works for sorted
,unsorted
, with duplicates
too
If the input sequence is sorted, you could use sets here. Take the start and end values from the input list:
def missing_elements(L):
start, end = L[0], L[-1]
return sorted(set(range(start, end + 1)).difference(L))
This assumes Python 3; for Python 2, use xrange()
to avoid building a list first.
The sorted()
call is optional; without it a set()
is returned of the missing values, with it you get a sorted list.
Demo:
>>> L = [10,11,13,14,15,16,17,18,20]
>>> missing_elements(L)
[12, 19]
Another approach is by detecting gaps between subsequent numbers; using an older itertools
library sliding window recipe:
from itertools import islice, chain
def window(seq, n=2):
"Returns a sliding window (of width n) over data from the iterable"
" s -> (s0,s1,...s[n-1]), (s1,s2,...,sn), ... "
it = iter(seq)
result = tuple(islice(it, n))
if len(result) == n:
yield result
for elem in it:
result = result[1:] + (elem,)
yield result
def missing_elements(L):
missing = chain.from_iterable(range(x + 1, y) for x, y in window(L) if (y - x) > 1)
return list(missing)
This is a pure O(n) operation, and if you know the number of missing items, you can make sure it only produces those and then stops:
def missing_elements(L, count):
missing = chain.from_iterable(range(x + 1, y) for x, y in window(L) if (y - x) > 1)
return list(islice(missing, 0, count))
This will handle larger gaps too; if you are missing 2 items at 11 and 12, it'll still work:
>>> missing_elements([10, 13, 14, 15], 2)
[11, 12]
and the above sample only had to iterate over [10, 13]
to figure this out.