Remove adjacent duplicate elements from a list

Here's the traditional way, deleting adjacent duplicates in situ, while traversing the list backwards:

Python 1.5.2 (#0, Apr 13 1999, 10:51:12) [MSC 32 bit (Intel)] on win32
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> def dedupe_adjacent(alist):
...     for i in xrange(len(alist) - 1, 0, -1):
...         if alist[i] == alist[i-1]:
...             del alist[i]
...
>>> data = [1,2,2,3,2,2,4]; dedupe_adjacent(data); print data
[1, 2, 3, 2, 4]
>>> data = []; dedupe_adjacent(data); print data
[]
>>> data = [2]; dedupe_adjacent(data); print data
[2]
>>> data = [2,2]; dedupe_adjacent(data); print data
[2]
>>> data = [2,3]; dedupe_adjacent(data); print data
[2, 3]
>>> data = [2,2,2,2,2]; dedupe_adjacent(data); print data
[2]
>>>

Update: If you want a generator but (don't have itertools.groupby or (you can type faster than you can read its docs and understand its default behaviour)), here's a six-liner that does the job:

Python 2.3.5 (#62, Feb  8 2005, 16:23:02) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> def dedupe_adjacent(iterable):
...     prev = object()
...     for item in iterable:
...         if item != prev:
...             prev = item
...             yield item
...
>>> data = [1,2,2,3,2,2,4]; print list(dedupe_adjacent(data))
[1, 2, 3, 2, 4]
>>>

Update 2: Concerning the baroque itertools.groupby() and the minimalist object() ...

To get the dedupe_adjacent effect out of itertools.groupby(), you need to wrap a list comprehension around it to throw away the unwanted groupers:

>>> [k for k, g in itertools.groupby([1,2,2,3,2,2,4])]
[1, 2, 3, 2, 4]
>>>

... or muck about with itertools.imap and/or operators.itemgetter, as seen in another answer.

Expected behaviour with object instances is that none of them compares equal to any other instance of any class, including object itself. Consequently they are extremely useful as sentinels.

>>> object() == object()
False

It's worth noting that the Python reference code for itertools.groupby uses object() as a sentinel:

self.tgtkey = self.currkey = self.currvalue = object()

and that code does the right thing when you run it:

>>> data = [object(), object()]
>>> data
[<object object at 0x00BBF098>, <object object at 0x00BBF050>]
>>> [k for k, g in groupby(data)]
[<object object at 0x00BBF098>, <object object at 0x00BBF050>]

Update 3: Remarks on forward-index in-situ operation

The OP's revised code:

def remove_adjacent(nums):
  i = 1
  while i < len(nums):    
    if nums[i] == nums[i-1]:
      nums.pop(i)
      i -= 1  
    i += 1
  return nums

is better written as:

def remove_adjacent(seq): # works on any sequence, not just on numbers
  i = 1
  n = len(seq)
  while i < n: # avoid calling len(seq) each time around
    if seq[i] == seq[i-1]:
      del seq[i]
      # value returned by seq.pop(i) is ignored; slower than del seq[i]
      n -= 1
    else:
      i += 1
  #### return seq #### don't do this
  # function acts in situ; should follow convention and return None

Use a generator to iterate over the elements of the list, and yield a new one only when it has changed.

itertools.groupby does exactly this.

You can modify the passed-in list if you iterate over a copy:

for elt in theList[ : ]:
    ...

Tags:

Python