Difference between len() and .__len__()?

len is a function to get the length of a collection. It works by calling an object's __len__ method. __something__ attributes are special and usually more than meets the eye, and generally should not be called directly.

It was decided at some point long ago getting the length of something should be a function and not a method code, reasoning that len(a)'s meaning would be clear to beginners but a.len() would not be as clear. When Python started __len__ didn't even exist and len was a special thing that worked with a few types of objects. Whether or not the situation this leaves us makes total sense, it's here to stay.


You can think of len() as being roughly equivalent to

def len(x):
    return x.__len__()

One advantage is that it allows you to write things like

somelist = [[1], [2, 3], [4, 5, 6]]
map(len, somelist) 

instead of

map(list.__len__, somelist)

or

map(operator.methodcaller('__len__'), somelist)

There is slightly different behaviour though. For example in the case of ints

>>> (1).__len__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'int' object has no attribute '__len__'
>>> len(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'int' has no len()

It's often the case that the "typical" behavior of a built-in or operator is to call (with different and nicer syntax) suitable magic methods (ones with names like __whatever__) on the objects involved. Often the built-in or operator has "added value" (it's able to take different paths depending on the objects involved) -- in the case of len vs __len__, it's just a bit of sanity checking on the built-in that is missing from the magic method:

>>> class bah(object):
...   def __len__(self): return "an inch"
... 
>>> bah().__len__()
'an inch'
>>> len(bah())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object cannot be interpreted as an integer

When you see a call to the len built-in, you're sure that, if the program continues after that rather than raising an exception, the call has returned an integer, non-negative, and <= sys.maxsize -- when you see a call to xxx.__len__(), you have no certainty (except that the code's author is either unfamiliar with Python or up to no good;-).

Other built-ins provide even more added value beyond simple sanity checks and readability. By uniformly designing all of Python to work via calls to builtins and use of operators, never through calls to magic methods, programmers are spared from the burden of remembering which case is which. (Sometimes an error slips in: until 2.5, you had to call foo.next() -- in 2.6, while that still works for backwards compatibility, you should call next(foo), and in 3.*, the magic method is correctly named __next__ instead of the "oops-ey" next!-).

So the general rule should be to never call a magic method directly (but always indirectly through a built-in) unless you know exactly why you need to do that (e.g., when you're overriding such a method in a subclass, if the subclass needs to defer to the superclass that must be done through explicit call to the magic method).

Tags:

Python