Remove adjacent duplicate elements from a list
Here's the traditional way, deleting adjacent duplicates in situ, while traversing the list backwards:
Python 1.5.2 (#0, Apr 13 1999, 10:51:12) [MSC 32 bit (Intel)] on win32
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> def dedupe_adjacent(alist):
...     for i in xrange(len(alist) - 1, 0, -1):
...         if alist[i] == alist[i-1]:
...             del alist[i]
...
>>> data = [1,2,2,3,2,2,4]; dedupe_adjacent(data); print data
[1, 2, 3, 2, 4]
>>> data = []; dedupe_adjacent(data); print data
[]
>>> data = [2]; dedupe_adjacent(data); print data
[2]
>>> data = [2,2]; dedupe_adjacent(data); print data
[2]
>>> data = [2,3]; dedupe_adjacent(data); print data
[2, 3]
>>> data = [2,2,2,2,2]; dedupe_adjacent(data); print data
[2]
>>>
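For anyone on Python 3, where xrange and the print statement are gone, here is a rough port of the same backward traversal. The comments spell out why going backwards is safe; the docstring and sample data are mine, not part of the original session.

```python
def dedupe_adjacent(alist):
    """Delete adjacent duplicates from alist in place; returns None."""
    # Walk from the last index down to 1. Deleting alist[i] only shifts
    # the elements at higher indexes, and those have already been examined,
    # so the indexes still to be visited stay valid.
    for i in range(len(alist) - 1, 0, -1):
        if alist[i] == alist[i - 1]:
            del alist[i]


data = [1, 2, 2, 3, 2, 2, 4]
dedupe_adjacent(data)
print(data)  # [1, 2, 3, 2, 4]
```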
Update: If you want a generator but (don't have itertools.groupby or (you can type faster than you can read its docs and understand its default behaviour)), here's a six-liner that does the job:
Python 2.3.5 (#62, Feb 8 2005, 16:23:02) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> def dedupe_adjacent(iterable):
...     prev = object()
...     for item in iterable:
...         if item != prev:
...             prev = item
...             yield item
...
>>> data = [1,2,2,3,2,2,4]; print list(dedupe_adjacent(data))
[1, 2, 3, 2, 4]
>>>
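One thing the generator buys you over the in-situ version is that it never needs the whole input at once. A small Python 3 sketch of that, feeding it an endless iterator; the use of itertools.cycle and itertools.islice is my choice for illustration only:

```python
from itertools import cycle, islice


def dedupe_adjacent(iterable):
    prev = object()  # sentinel: compares unequal to any ordinary item
    for item in iterable:
        if item != prev:
            prev = item
            yield item


# cycle() never ends, so building a list first is not an option here.
endless = cycle([1, 1, 2])
first_four = list(islice(dedupe_adjacent(endless), 4))
print(first_four)  # [1, 2, 1, 2]
```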
Update 2: Concerning the baroque itertools.groupby() and the minimalist object() ...
To get the dedupe_adjacent effect out of itertools.groupby(), you need to wrap a list comprehension around it to throw away the unwanted groupers:
>>> import itertools
>>> [k for k, g in itertools.groupby([1,2,2,3,2,2,4])]
[1, 2, 3, 2, 4]
>>>
... or muck about with itertools.imap and/or operator.itemgetter, as seen in another answer.
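If you do go the groupby route, its optional key argument lets you decide what "duplicate" means. A hedged Python 3 sketch: the wrapper name dedupe_adjacent_by is hypothetical, and it keeps the first item of each run rather than the key, which matters once a key function is involved.

```python
from itertools import groupby


def dedupe_adjacent_by(iterable, key=None):
    """Yield the first item of each run of adjacent items with equal keys."""
    for _, group in groupby(iterable, key):
        # Every group produced by groupby holds at least one item.
        yield next(group)


print(list(dedupe_adjacent_by([1, 2, 2, 3, 2, 2, 4])))    # [1, 2, 3, 2, 4]
print(list(dedupe_adjacent_by("aAbBBa", key=str.lower)))  # ['a', 'b', 'a']
```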
Expected behaviour with object instances is that none of them compares equal to any other instance of any class, including object itself (barring a class whose __eq__ deliberately claims otherwise). Consequently they are extremely useful as sentinels.
>>> object() == object()
False
It's worth noting that the Python reference code for itertools.groupby uses object() as a sentinel:
self.tgtkey = self.currkey = self.currvalue = object()
and that code does the right thing when you run it:
>>> data = [object(), object()]
>>> data
[<object object at 0x00BBF098>, <object object at 0x00BBF050>]
>>> from itertools import groupby
>>> [k for k, g in groupby(data)]
[<object object at 0x00BBF098>, <object object at 0x00BBF050>]
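To see why a fresh object() beats the more obvious None as the starting value, compare the two on data that happens to begin with None. This is a Python 3 sketch; both function names are made up for the comparison.

```python
def dedupe_starting_from_none(iterable):
    prev = None  # flawed: a leading None in the data looks like a repeat
    for item in iterable:
        if item != prev:
            prev = item
            yield item


def dedupe_starting_from_sentinel(iterable):
    prev = object()  # a brand-new object that the data cannot contain
    for item in iterable:
        if item != prev:
            prev = item
            yield item


data = [None, None, 1, 1]
print(list(dedupe_starting_from_none(data)))      # [1]
print(list(dedupe_starting_from_sentinel(data)))  # [None, 1]
```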
Update 3: Remarks on forward-index in-situ operation
The OP's revised code:
def remove_adjacent(nums):
    i = 1
    while i < len(nums):
        if nums[i] == nums[i-1]:
            nums.pop(i)
            i -= 1
        i += 1
    return nums
is better written as:
def remove_adjacent(seq): # works on any sequence, not just on numbers
    i = 1
    n = len(seq)
    while i < n: # avoid calling len(seq) each time around
        if seq[i] == seq[i-1]:
            del seq[i]
            # value returned by seq.pop(i) is ignored; slower than del seq[i]
            n -= 1
        else:
            i += 1
    #### return seq #### don't do this
    # function acts in situ; should follow convention and return None
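The return-None convention is the same one list.sort() follows, with sorted() as the non-mutating counterpart. A Python 3 sketch of how the pair might look here; without_adjacent is a hypothetical companion, not something from the OP's code:

```python
def remove_adjacent(seq):
    """Remove adjacent duplicates from a mutable sequence in place."""
    i = 1
    n = len(seq)
    while i < n:
        if seq[i] == seq[i - 1]:
            del seq[i]
            n -= 1
        else:
            i += 1
    # No return statement: the caller's sequence has been changed in place.


def without_adjacent(seq):
    """Return a new list with adjacent duplicates removed; seq is untouched."""
    result = list(seq)
    remove_adjacent(result)
    return result


data = [1, 2, 2, 3]
print(remove_adjacent(data))       # None
print(data)                        # [1, 2, 3]
print(without_adjacent([1, 1, 2])) # [1, 2]
```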
Use a generator to iterate over the elements of the list, and yield an element only when it differs from the previous one. itertools.groupby does exactly this.
You can modify the passed-in list if you iterate over a copy:
for elt in theList[:]:
    ...
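The loop body is left open above. One possible way to fill it in, as a Python 3 sketch under my own assumptions (the name dedupe_in_place and the write-index approach are not from the answer): read from the copy, write the survivors back into the original, then trim the tail.

```python
def dedupe_in_place(the_list):
    """Remove adjacent duplicates from the_list in place; returns None."""
    prev = object()  # sentinel that matches nothing in the data
    write = 0        # next slot in the_list to receive a kept element
    for elt in the_list[:]:  # iterate over a shallow copy
        if elt != prev:
            the_list[write] = elt
            write += 1
            prev = elt
    del the_list[write:]  # drop the leftover tail in one go


data = [1, 2, 2, 3, 2, 2, 4]
dedupe_in_place(data)
print(data)  # [1, 2, 3, 2, 4]
```

Because each element is written at most once and the tail is deleted in a single slice, this avoids the repeated shifting that del seq[i] inside a loop incurs.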