Sort a list by the number of occurrences of the elements in the list
The built-in sorted
creates a list out of the sequence provided and then sorts that based on the key argument (omitting error checking):
/* copy sequence provided */
newlist = PySequence_List(seq);
/* get list.sort for the list object */
callable = _PyObject_GetAttrId(newlist, &PyId_sort);
/* call it and then return later on */
v = _PyObject_FastCallKeywords(callable, args + 1, nargs - 1, kwnames);
This essentially translates to something like what Jean provided in his answer:
B = list(A)
B.sort(key=lambda x: A.count(x))
By making that copy B
and referencing A
in the key
function, this removes the restriction imposed by A.sort
which can't peek in itself.
It seems that A
is changed during the in-place sort process, so you cannot rely on the value of A
during the sort process.
Making a copy also works.
A=[2,1,3,4,2,2,3]
B=A[:]
A.sort(key=lambda x:B.count(x))
print(A)
Confirmed by this line in python documentation
CPython implementation detail: While a list is being sorted, the effect of attempting to mutate, or even inspect, the list is undefined. The C implementation of Python makes the list appear empty for the duration, and raises ValueError if it can detect that the list has been mutated during a sort.
I believe it's because A.sort
is modifying the list in place underneath while computing. sorted()
doesn't modify the list and returns therefore a correct result.
This is by design and intentional. CPython temporarily "disallows" access to the list while the list is being sorted in place, the behavior is documented here:
CPython implementation detail: While a list is being sorted, the effect of attempting to mutate, or even inspect, the list is undefined. The C implementation of Python makes the list appear empty for the duration, and raises ValueError if it can detect that the list has been mutated during a sort.
You can inspect that by printing A
inside the key function - you'll get an empty list:
In [2]: def key_function(x):
...: print(A, x)
...: return A.count(x)
...:
In [3]: A.sort(key=key_function)
([], 2)
([], 1)
([], 3)
([], 4)
([], 2)
([], 2)
([], 3)
But, if you do that for sorted()
:
In [4]: sorted(A, key=key_function)
([2, 1, 3, 4, 2, 2, 3], 2)
([2, 1, 3, 4, 2, 2, 3], 1)
([2, 1, 3, 4, 2, 2, 3], 3)
([2, 1, 3, 4, 2, 2, 3], 4)
([2, 1, 3, 4, 2, 2, 3], 2)
([2, 1, 3, 4, 2, 2, 3], 2)
([2, 1, 3, 4, 2, 2, 3], 3)
Out[4]: [1, 4, 3, 3, 2, 2, 2]
It is also documented inside the sort()
implementation:
/* The list is temporarily made empty, so that mutations performed
* by comparison functions can't affect the slice of memory we're
* sorting (allowing mutations during sorting is a core-dump
* factory, since ob_item may change).
*/.