Difference between len() and .__len__()?
len is a function to get the length of a collection. It works by calling an object's __len__ method. __something__ attributes are special and usually more than meets the eye, and generally should not be called directly.
It was decided at some point long ago that getting the length of something should be a function and not a method, on the reasoning that len(a)'s meaning would be clear to beginners but a.len() would not be as clear. When Python started, __len__ didn't even exist and len was a special thing that worked with a few types of objects. Whether or not the situation this leaves us in makes total sense, it's here to stay.
You can think of len() as being roughly equivalent to
def len(x):
    return x.__len__()
One advantage is that it allows you to write things like
somelist = [[1], [2, 3], [4, 5, 6]]
map(len, somelist)
instead of
map(list.__len__, somelist)
or
map(operator.methodcaller('__len__'), somelist)
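For reference, here is a runnable sketch of the three spellings above. It assumes Python 3, where map returns a lazy iterator, so each result is wrapped in list() to compare them; the variable names are only illustrative.

```python
import operator

somelist = [[1], [2, 3], [4, 5, 6]]

# The built-in works on any object that defines __len__.
via_builtin = list(map(len, somelist))

# The unbound method only works when every element is a list.
via_unbound = list(map(list.__len__, somelist))

# methodcaller looks the method up on each element by name.
via_methodcaller = list(map(operator.methodcaller('__len__'), somelist))

print(via_builtin)  # [1, 2, 3]
```

All three produce the same list here, but only the len version keeps working unchanged if somelist later holds tuples, strings, or dicts alongside lists (list.__len__ would raise a TypeError on those).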
There is slightly different behaviour though. For example, in the case of ints:
>>> (1).__len__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'int' object has no attribute '__len__'
>>> len(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'int' has no len()
It's often the case that the "typical" behavior of a built-in or operator is to call (with different and nicer syntax) suitable magic methods (ones with names like __whatever__) on the objects involved. Often the built-in or operator has "added value" (it's able to take different paths depending on the objects involved) -- in the case of len vs __len__, it's just a bit of sanity checking on the built-in that is missing from the magic method:
>>> class bah(object):
...     def __len__(self): return "an inch"
...
>>> bah().__len__()
'an inch'
>>> len(bah())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object cannot be interpreted as an integer
When you see a call to the len built-in, you're sure that, if the program continues after that rather than raising an exception, the call has returned an integer, non-negative, and <= sys.maxsize -- when you see a call to xxx.__len__(), you have no certainty (except that the code's author is either unfamiliar with Python or up to no good;-).
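To make those guarantees concrete, here is a rough pure-Python sketch of the checks described above. The name checked_len and the Negative class are hypothetical, and the real built-in is implemented in C, so treat this as an approximation of its observable behaviour rather than its actual code.

```python
import operator
import sys


def checked_len(obj):
    """Hypothetical approximation of the checks the built-in len() performs."""
    cls = type(obj)
    # Special methods are looked up on the type, not on the instance.
    if not hasattr(cls, "__len__"):
        raise TypeError(f"object of type '{cls.__name__}' has no len()")
    # operator.index() rejects non-integer results such as "an inch".
    result = operator.index(cls.__len__(obj))
    if result < 0:
        raise ValueError("__len__() should return >= 0")
    if result > sys.maxsize:
        raise OverflowError("length is larger than sys.maxsize")
    return result


class Negative:
    """Hypothetical class whose __len__ returns an invalid value."""

    def __len__(self):
        return -1
```

Calling Negative().__len__() happily returns -1, whereas both checked_len(Negative()) and the real len(Negative()) raise ValueError.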
Other built-ins provide even more added value beyond simple sanity checks and readability. By uniformly designing all of Python to work via calls to built-ins and use of operators, never through calls to magic methods, programmers are spared from the burden of remembering which case is which. (Sometimes an error slips in: until 2.5, you had to call foo.next() -- in 2.6, while that still works for backwards compatibility, you should call next(foo), and in 3.*, the magic method is correctly named __next__ instead of the "oops-ey" next!-).
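As a small illustration of the next(foo) spelling in Python 3, here is a sketch using a hypothetical Countdown iterator. It also shows one piece of "added value" in the built-in: next accepts an optional default that is returned instead of raising StopIteration, which a direct __next__() call cannot do.

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


counter = Countdown(2)
first = next(counter)           # calls Countdown.__next__ under the hood
second = next(counter)
fallback = next(counter, None)  # exhausted: the default is returned instead
```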
So the general rule should be to never call a magic method directly (but always indirectly through a built-in) unless you know exactly why you need to do that (e.g., when you're overriding such a method in a subclass, if the subclass needs to defer to the superclass that must be done through explicit call to the magic method).
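The subclass exception might look like the following sketch. The Inventory and InventoryWithReserve classes are hypothetical; the point is only that the override has to reach the superclass implementation through super().__len__(), since calling len(self) inside the override would recurse back into it.

```python
class Inventory:
    """Hypothetical container that reports how many items it holds."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)


class InventoryWithReserve(Inventory):
    """Hypothetical subclass that also counts a reserve stock."""

    def __init__(self, items, reserve):
        super().__init__(items)
        self._reserve = list(reserve)

    def __len__(self):
        # Defer to the superclass through an explicit magic-method call.
        return super().__len__() + len(self._reserve)


stock = InventoryWithReserve(["bolt", "nut"], ["washer"])
total = len(stock)  # callers still go through the built-in
```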