How does Python know the values already stored in its memory?
If you take a look at Objects/longobject.c, which implements the int type for CPython, you will see that the numbers between -5 (-NSMALLNEGINTS) and 256 (NSMALLPOSINTS - 1) are pre-allocated and cached. This avoids the cost of allocating separate objects for the most commonly used integers. It works because integers are immutable: a single shared object can safely stand in for every occurrence of the same number.
Python (CPython, to be precise) shares small integers to make access quick. Integers in the range [-5, 256] already exist in memory, so two names bound to the same small value have the same address. For larger integers, this is not guaranteed.
a = 100000
b = 100000
a is b # False (when entered line by line in the CPython REPL)
Wait, what? If you check the address of the numbers, you'll find something interesting:
a = 1
b = 1
id(a) # 4463034512
id(b) # 4463034512
a = 257
b = 257
id(a) # 4642585200
id(b) # 4642585712
This is called the integer cache. You can read more about the integer cache here.
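To see where the cache starts and stops, you can probe it yourself. This is only a rough sketch that relies on CPython implementation details: the helper name is_cached is made up for illustration, and it builds each value at runtime with int(str(n)) so that the compiler cannot merge the two operands into one shared constant.

```python
def is_cached(n):
    """Return True if two independently built ints equal to n are the same object."""
    # int(str(n)) constructs the value at runtime; for values outside the
    # small-integer cache CPython allocates a fresh object each time.
    return int(str(n)) is int(str(n))

# Probe a window around the expected boundaries.
cached = [n for n in range(-10, 300) if is_cached(n)]
lowest, highest = cached[0], cached[-1]
print(lowest, highest)  # -5 256 on CPython with the default cache size
```

Other Python implementations are free to behave differently, so treat the result as an observation about CPython, not a language guarantee.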
Thanks to @KlausD and @user2357112 for pointing out in the comments that direct references to small integers use the integer cache, but the result of a calculation, even if it equals a number in the range [-5, 256], is not necessarily the cached object. For example:
pow(3, 47159012670, 47159012671) is 1 # False
pow(3, 47159012670, 47159012671) == 1 # True
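The practical takeaway is to compare numbers with == and keep is for identity checks such as None. A minimal sketch of the difference, assuming CPython (the variable names are made up for illustration, and the values are built at runtime so they are not shared compile-time constants):

```python
# Two equal integers outside the small-integer cache, built at runtime.
big_a = int("1000")
big_b = int("1000")

value_equal = big_a == big_b     # compares values: True
identity_equal = big_a is big_b  # compares object identity: False on CPython

print(value_equal, identity_equal)
```

Whether two equal integers happen to be the same object is an implementation detail, so code should never depend on the result of is for numbers.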
“The current implementation keeps an array of integer objects for all integers between -5 and 256, when you create an int in that range you actually just get back a reference to the existing object.”
Why? Because small integers are used so frequently, for example as loop counters. Returning a reference to an existing object instead of creating a new one saves the allocation overhead.