Find duplicates of item endings in a list

One approach would be to use itertools.groupby, specifying that we want to group based on the last n characters using the key argument.

Then we can drop the sublists with only 1 item and keep the first name from each remaining group (to get every name instead, flatten the remaining sublists with itertools.chain.from_iterable, and take a set if you want to remove repeats):

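from itertools import groupby

# Group key: the last three characters of each name.
k = lambda x: x[-3:]
# groupby only merges adjacent items, so sort by the same key first.
groups = [list(v) for _, v in groupby(sorted(names, key=k), key=k)]
# [['tamara', 'sara'], ['john'], ['tom', 'tom']]
# Keep the first name of every group that has more than one member.
[g[0] for g in groups if len(g) > 1]
# ['tamara', 'tom']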
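
For the variant that returns every name in a duplicated group rather than one representative, a minimal sketch could look like the following. The function name names_sharing_suffix, the parameter n, and the sample list are assumptions made for illustration; they do not come from the original question.

```python
from itertools import chain, groupby

def names_sharing_suffix(names, n=3):
    """Return every name whose last n characters also end another entry."""
    key = lambda name: name[-n:]
    # groupby only merges adjacent items, so sort by the same key first.
    groups = (list(g) for _, g in groupby(sorted(names, key=key), key=key))
    # Drop singleton groups, then flatten what is left into one list.
    return list(chain.from_iterable(g for g in groups if len(g) > 1))

# Hypothetical sample data, not taken from the original question.
sample = ["tamara", "john", "tom", "sara", "tom"]
shared = names_sharing_suffix(sample)
print(shared)       # ['tamara', 'sara', 'tom', 'tom']
print(set(shared))  # unique names; set ordering is arbitrary
```

Wrapping the result in set() collapses exact repeats such as the two 'tom' entries, at the cost of losing the order.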

Accumulate names per-suffix using a dict, and then gather the results:

>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> for name in names:
...     suffix = name[-3:]  # last three characters
...     d[suffix].append(name)
...
>>> for suffix, group in d.items():  # 'group' avoids rebinding 'names'
...     print("-", suffix, ":", *group)
...
- tom : tom tom
- ohn : john
- ara : sara tamara

You can now partition d.items() into singles and dupes by checking the length of each list of names.

This is an O(n) time-complexity solution, as opposed to groupby-based approaches which require pre-sorting the data at O(n log n).
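
The partition into singles and dupes can be sketched as below. The helper name partition_by_suffix, the parameter n, and the sample list are hypothetical, chosen only to make the example self-contained.

```python
from collections import defaultdict

def partition_by_suffix(names, n=3):
    """Split names into (singles, dupes) dicts keyed by their last n characters."""
    by_suffix = defaultdict(list)
    for name in names:  # one pass over the input, O(n) overall
        by_suffix[name[-n:]].append(name)
    # A suffix seen once is a single; a suffix seen more than once is a dupe.
    singles = {s: g for s, g in by_suffix.items() if len(g) == 1}
    dupes = {s: g for s, g in by_suffix.items() if len(g) > 1}
    return singles, dupes

# Hypothetical sample data, not taken from the original question.
singles, dupes = partition_by_suffix(["tom", "john", "sara", "tamara", "tom"])
print(dupes)    # {'tom': ['tom', 'tom'], 'ara': ['sara', 'tamara']}
print(singles)  # {'ohn': ['john']}
```

Because dicts preserve insertion order in Python 3.7+, the suffixes appear in the order they were first seen in the input.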
