python max function using 'key' and lambda expression
How does the max function work?
It looks for the "largest" item in an iterable. I'll assume that you can look up what that is, but if not, it's something you can loop over, e.g. a list or a string.
What is the use of the keyword key in the max function? I know it is also used in the context of the sort function
key is a function (often written as a lambda) that max calls on every item in the iterable; the values it returns decide which items count as larger than others. It is useful when you are comparing objects that you created yourself, and not something obvious, like integers.
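As a small sketch of that idea (the word list and variable names are made up for illustration), the key function is called once per item; max compares the returned values but still hands back the original item:

```python
# Hypothetical data, for illustration only.
words = ["fig", "banana", "kiwi", "cherry"]

# Without key: strings are compared lexicographically.
largest_alphabetically = max(words)

# With key: each word is ranked by its length instead.
# "banana" and "cherry" tie at 6 letters; max returns the first one it saw.
longest = max(words, key=len)

# A lambda works the same way; here words are ranked by their last letter.
latest_last_letter = max(words, key=lambda word: word[-1])

print(largest_alphabetically)  # kiwi
print(longest)                 # banana
print(latest_last_letter)      # cherry
```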
Meaning of the lambda expression? How to read them? How do they work?
That's sort of a larger question. In simple terms, a lambda is a function you can pass around, and have other pieces of code use it. Take this for example:
def sum(a, b, f):
    return (f(a) + f(b))
This takes two objects, a and b, and a function f. It calls f() on each object, then adds the results together. So look at this call:
>>> sum(2, 2, lambda a: a * 2)
8
sum() takes 2, and calls the lambda expression on it. So f(a) becomes 2 * 2, which becomes 4. It then does the same for b, and adds the two together.
In not so simple terms, lambdas come from lambda calculus, a formal system that expresses computation purely through defining and applying functions; a very cool math concept.
It's probably better to read about this a little more, as lambdas can be confusing, and it's not immediately obvious how useful they are.
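Strongly simplified version of max:
def max(items, key=lambda x: x):
    # Start with the first item as the best candidate so far.
    current = items[0]
    for item in items:
        # Compare the key values, but remember the item itself.
        if key(item) > key(current):
            current = item
    return current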
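The simplified version above indexes into its argument, so it only handles sequences. A slightly more careful sketch (still not the real implementation; the name simple_max is hypothetical) accepts any iterable, calls key once per item, and raises ValueError on empty input like the built-in does:

```python
def simple_max(items, key=lambda x: x):
    """Simplified stand-in for the built-in max(); illustration only."""
    iterator = iter(items)
    try:
        current = next(iterator)
    except StopIteration:
        raise ValueError("simple_max() arg is an empty iterable") from None
    current_key = key(current)
    for item in iterator:
        item_key = key(item)
        # Strict '>' means the first of several equal maxima is kept.
        if item_key > current_key:
            current, current_key = item, item_key
    return current


print(simple_max([3, 1, 2]))                        # 3
print(simple_max(iter(["a", "bbb", "cc"]), key=len))  # bbb
```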
Regarding lambda:
>>> ident = lambda x: x
>>> ident(3)
3
>>> ident(5)
5
>>> times_two = lambda x: 2*x
>>> times_two(2)
4
The max function is used to get the maximum out of an iterable.
The iterables may be lists, tuples, dict objects, etc., or even collections of custom objects as in the example you provided.
max(iterable[, key=func]) -> value
max(a, b, c, ...[, key=func]) -> value
With a single iterable argument, return its largest item.
With two or more arguments, return the largest argument.
So, key=func lets us pass an optional key function. max calls it on every item (or argument), compares the results, and returns the item whose result is largest. Nothing is actually sorted.
lambda is a Python keyword that creates a small anonymous function. So, when you pass a player object to it, it returns player.totalScore. Thus max compares the player objects in the iterable by their totalScore and returns the player who has the maximum totalScore.
If no key argument is provided, the maximum is returned according to default Python orderings.
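To make the player case concrete, here is a minimal sketch. The Player class, the names, and the scores are all hypothetical stand-ins for the code in the question; only the totalScore attribute and the key lambda come from the discussion above:

```python
class Player:
    # Hypothetical class standing in for the one in the question.
    def __init__(self, name, totalScore):
        self.name = name
        self.totalScore = totalScore


players = [Player("ann", 40), Player("bob", 75), Player("cy", 60)]

# The lambda is called on each player; max compares the returned scores.
best = max(players, key=lambda p: p.totalScore)
print(best.name, best.totalScore)  # bob 75

# Without key, Python 3 has no default ordering for these objects.
try:
    max(players)
    has_default_ordering = True
except TypeError:
    has_default_ordering = False
print(has_default_ordering)  # False
```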
Examples -
>>> max(1, 3, 5, 7)
7
>>> max([1, 3, 5, 7])
7
>>> people = [('Barack', 'Obama'), ('Oprah', 'Winfrey'), ('Mahatma', 'Gandhi')]
>>> max(people, key=lambda x: x[1])
('Oprah', 'Winfrey')
lambda is an anonymous function; it is equivalent to:
def func(p):
    return p.totalScore
Now max becomes:
max(players, key=func)
But as def statements are compound statements, they can't be used where an expression is required; that's why lambdas are sometimes used.
Note that the body of a lambda is equivalent to what you'd put in the return statement of a def. Thus, you can't use statements inside a lambda; only expressions are allowed.
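A short sketch of both points, using made-up score records: a def and a lambda are interchangeable as key, an if/else expression is allowed inside a lambda, and a statement such as return is rejected by the compiler (checked here with the built-in compile so the example itself stays runnable):

```python
# Hypothetical records, for illustration only.
scores = [("ann", 10), ("bob", 75), ("cy", 60)]


def second_item(pair):
    return pair[1]


# The named function and the lambda behave identically as key.
by_def = max(scores, key=second_item)
by_lambda = max(scores, key=lambda pair: pair[1])

# A conditional *expression* is fine inside a lambda.
distance_from_50 = lambda pair: pair[1] - 50 if pair[1] >= 50 else 50 - pair[1]
farthest = max(scores, key=distance_from_50)

# A *statement* such as return is a syntax error inside a lambda.
try:
    compile("f = lambda x: return x", "<example>", "exec")
    statement_allowed = True
except SyntaxError:
    statement_allowed = False

print(by_def, by_lambda)   # ('bob', 75) ('bob', 75)
print(farthest)            # ('ann', 10)
print(statement_allowed)   # False
```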
What does max do?
max(a, b, c, ...[, key=func]) -> value
With a single iterable argument, return its largest item. With two or more arguments, return the largest argument.
So, it simply returns the object that is the largest.
How does key work?
By default (with no key), Python 2 compares items using a set of rules based on the type of the objects (for example, a string is always greater than an integer).
To modify the object before comparison, or to compare based on a particular attribute/index, you have to use the key argument.
Example 1:
A simple example, suppose you have a list of numbers in string form, but you want to compare those items by their integer value.
>>> lis = ['1', '100', '111', '2']
Here max compares the items using their original values (strings are compared lexicographically, so you'd get '2' as output):
>>> max(lis)
'2'
To compare the items by their integer value, use key with a simple lambda:
>>> max(lis, key=lambda x:int(x)) # compare `int` version of each item
'111'
Example 2: Applying max to a list of tuples.
>>> lis = [(1,'a'), (3,'c'), (4,'e'), (-1,'z')]
By default max will compare the items by the element at index 0. If those are equal, it'll compare the element at index 1. As in my example all items have a unique first element, you'd get this as the answer:
>>> max(lis)
(4, 'e')
But what if you wanted to compare each item by the value at index 1? Simple: use lambda:
>>> max(lis, key = lambda x: x[1])
(-1, 'z')
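The tuple example above has no ties, so here is a sketch with a made-up list that does. It shows the default element-by-element comparison and the fact that, with a key, max returns the first item that reaches the largest key value:

```python
# Hypothetical data chosen so that both positions contain ties.
pairs = [(1, 'a'), (2, 'c'), (3, 'b'), (3, 'c')]

# Default: compare index 0 first, then index 1 to break the tie on 3.
default_max = max(pairs)

# Only index 0 is compared: (3, 'b') and (3, 'c') tie, the first one wins.
by_first = max(pairs, key=lambda x: x[0])

# Only index 1 is compared: (2, 'c') and (3, 'c') tie, the first one wins.
by_second = max(pairs, key=lambda x: x[1])

print(default_max)  # (3, 'c')
print(by_first)     # (3, 'b')
print(by_second)    # (2, 'c')
```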
Comparing items in an iterable that contains objects of different type:
List with mixed items:
lis = ['1','100','111','2', 2, 2.57]
In Python 2 it is possible to compare items of two different types:
>>> max(lis) # works in Python 2
'2'
>>> max(lis, key=lambda x: int(x)) # compare integer version of each item
'111'
But in Python 3 you can't do that any more:
>>> lis = ['1', '100', '111', '2', 2, 2.57]
>>> max(lis)
Traceback (most recent call last):
File "<ipython-input-2-0ce0a02693e4>", line 1, in <module>
max(lis)
TypeError: unorderable types: int() > str()
But this works, as we are comparing the integer version of each object:
>>> max(lis, key=lambda x: int(x)) # or simply `max(lis, key=int)`
'111'
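The Python 3 behaviour can be checked with a small sketch like the one below. It reuses the list from above; the second, shorter list is made up to show one caveat, namely that int truncates 2.57 to 2, so key=int and key=float can pick different items:

```python
lis = ['1', '100', '111', '2', 2, 2.57]

# Python 3: comparing str with int/float raises TypeError.
try:
    max(lis)
    mixed_comparison_failed = False
except TypeError:
    mixed_comparison_failed = True

# The key is only used for comparing; the original item is returned.
largest = max(lis, key=int)

# Hypothetical shorter list: with key=int all three items map to 2,
# so the first one is returned; with key=float, 2.57 is the largest.
small = ['2', 2, 2.57]
by_int = max(small, key=int)
by_float = max(small, key=float)

print(mixed_comparison_failed)  # True
print(repr(largest))            # '111'
print(repr(by_int), by_float)   # '2' 2.57
```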