Python: which types support weak references?
First: this is all CPython-specific. Weakrefs work differently on different Python implementations.
Most built-in types don't support weak references because Python's weak reference mechanism adds some overhead to every object that supports it, and the Python dev team decided they didn't want most built-in types to pay that overhead. The simplest way this overhead manifests is that any weakly referenceable object needs space for an extra pointer for weakref management, and most built-in objects don't reserve space for that pointer.
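For example, trying to take a weak reference to an int fails outright, while an instance of a plain user-defined class (which does reserve that pointer) works fine. This is only an illustrative interactive session; the exact traceback wording can vary between CPython versions:
>>> import weakref
>>> weakref.ref(42)
Traceback (most recent call last):
  ...
TypeError: cannot create weak reference to 'int' object
>>> class Spam:
...     pass
...
>>> s = Spam()
>>> r = weakref.ref(s)   # works: Spam instances reserve the weakref pointer
>>> r() is s
True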
Attempting to compile a complete list of all types with weak reference support is about as fruitful as trying to compile a complete list of all humans with red hair. If you want to determine whether a type has weak reference support, you can check its __weakrefoffset__, which is nonzero for types with weak reference support:
>>> int.__weakrefoffset__
0
>>> type.__weakrefoffset__
368
>>> tuple.__weakrefoffset__
0
>>> class Foo(object):
...     pass
...
>>> class Bar(tuple):
...     pass
...
>>> Foo.__weakrefoffset__
24
>>> Bar.__weakrefoffset__
0
A type's __weakrefoffset__ is the offset in bytes from the start of an instance to the weakref pointer, or 0 if instances have no weakref pointer. It corresponds to the type struct's tp_weaklistoffset at C level. As of the time of this writing, __weakrefoffset__ is completely undocumented, but tp_weaklistoffset is documented, because people implementing extension types in C need to know about it.
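Since __weakrefoffset__ is undocumented, here is a small sketch of how you might use it from Python. The helper name supports_weakref is my own, not a standard API, and the dict/set results reflect current CPython and could differ elsewhere. Continuing with the Foo class defined above:
>>> import weakref
>>> def supports_weakref(cls):
...     # nonzero means instances reserve space for the weakref pointer
...     return getattr(cls, '__weakrefoffset__', 0) != 0
...
>>> supports_weakref(int), supports_weakref(dict), supports_weakref(set)
(False, False, True)
>>> f = Foo()
>>> r = weakref.ref(f)
>>> f.__weakref__ is r   # the pointer at that offset is what __weakref__ exposes
True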
There are two things that aren’t covered by the excellent answer above.
First, weakref was added to Python in version 2.1.
For everything added after 2.1 (and that includes object and type), the default was to add weakref support unless there was a good reason not to.
But for everything that already existed, especially pretty small ones like int, adding another 4 bytes (most Python implementations were 32-bit at the time, so let’s just call a pointer 4 bytes) could cause a noticeable performance regression for all of the Python code out there that had been written for 1.6/2.0 or earlier. So, there was a higher bar to pass for adding weakref support to those types.
Second, Python allows the implementation to merge values of builtin types that it can prove are immutable, and for a few of those builtin types, CPython takes advantage of that. For example (the details vary across versions, so take this only as an example):
- Integers from -5 to 256, the empty string, single-character printable ASCII strings, the empty bytes, single-byte bytes, and the empty tuple get singleton instances created at startup, and most attempts to construct a new value equal to one of these singletons instead get a reference to the singleton.
- Many strings are cached in a string intern table, and many attempts to construct a string with the same value as an interned string instead get a reference to the existing one.
- Within a single compilation unit, the compiler will merge separate constants that are equal ints, strings, tuples of ints and strings, etc., so they all become references to the same constant object.
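A quick interactive session illustrates the caching described above (a sketch only; exact results depend on the CPython version and on how the code was compiled):
>>> x, y = 200, 56
>>> (x + y) is (y + x)   # both sums are 256, which comes from the small-int cache
True
>>> x, y = 200, 57
>>> (x + y) is (y + x)   # 257 is outside the cache, so each sum is a fresh object
False
>>> a, b = 'weakref', 'weak' + 'ref'   # folded and interned at compile time
>>> a is b
True
>>> ''.join(['weak', 'ref']) is a      # built at run time: equal but a distinct object
False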
So, weakrefs to these types wouldn’t be as useful as you’d initially think. Many values are just never going to go away, because they’re singletons, module-level constants, or interned strings. And even for the values that aren’t effectively immortal, you probably have more references to them than you expect.
Sure, there are some cases where weakrefs would be useful anyway. If I calculate a billion large integers, most of those won’t be immortal, or shared. But it means weakrefs are useful less often for these types, which has to be a factor when weighing the tradeoff of making every int 4 bytes larger just so memory can be reclaimed by safely releasing values in some relatively uncommon cases.