What is the difference between shallow copy, deepcopy and normal assignment operation?
Normal assignment operations will simply point the new variable towards the existing object. The docs explain the difference between shallow and deep copies:
The difference between shallow and deep copying is only relevant for compound objects (objects that contain other objects, like lists or class instances):
A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.
A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.
Here's a little demonstration:
import copy
a = [1, 2, 3]
b = [4, 5, 6]
c = [a, b]
Using normal assignment operatings to copy:
d = c
print id(c) == id(d) # True - d is the same object as c
print id(c[0]) == id(d[0]) # True - d[0] is the same object as c[0]
Using a shallow copy:
d = copy.copy(c)
print id(c) == id(d) # False - d is now a new object
print id(c[0]) == id(d[0]) # True - d[0] is the same object as c[0]
Using a deep copy:
d = copy.deepcopy(c)
print id(c) == id(d) # False - d is now a new object
print id(c[0]) == id(d[0]) # False - d[0] is now a new object
For immutable objects, there is no need for copying because the data will never change, so Python uses the same data; ids are always the same. For mutable objects, since they can potentially change, [shallow] copy creates a new object.
Deep copy is related to nested structures. If you have list of lists, then deepcopy copies
the nested lists also, so it is a recursive copy. With just copy, you have a new outer list, but inner lists are references.
Assignment does not copy. It simply sets the reference to the old data. So you need copy to create a new list with the same contents.
For immutable objects, creating a copy don't make much sense since they are not going to change. For mutable objects assignment
,copy
and deepcopy
behaves differently. Lets talk about each of them with examples.
An assignment operation simply assigns the reference of source to destination e.g:
>>> i = [1,2,3]
>>> j=i
>>> hex(id(i)), hex(id(j))
>>> ('0x10296f908', '0x10296f908') #Both addresses are identical
Now i
and j
technically refers to same list. Both i
and j
have same memory address. Any updation to either
of them will be reflected to the other. e.g:
>>> i.append(4)
>>> j
>>> [1,2,3,4] #Destination is updated
>>> j.append(5)
>>> i
>>> [1,2,3,4,5] #Source is updated
On the other hand copy
and deepcopy
creates a new copy of variable. So now changes to original variable will not be reflected
to the copy variable and vice versa. However copy(shallow copy)
, don't creates a copy of nested objects, instead it just
copies the reference of nested objects. Deepcopy copies all the nested objects recursively.
Some examples to demonstrate behaviour of copy
and deepcopy
:
Flat list example using copy
:
>>> import copy
>>> i = [1,2,3]
>>> j = copy.copy(i)
>>> hex(id(i)), hex(id(j))
>>> ('0x102b9b7c8', '0x102971cc8') #Both addresses are different
>>> i.append(4)
>>> j
>>> [1,2,3] #Updation of original list didn't affected copied variable
Nested list example using copy
:
>>> import copy
>>> i = [1,2,3,[4,5]]
>>> j = copy.copy(i)
>>> hex(id(i)), hex(id(j))
>>> ('0x102b9b7c8', '0x102971cc8') #Both addresses are still different
>>> hex(id(i[3])), hex(id(j[3]))
>>> ('0x10296f908', '0x10296f908') #Nested lists have same address
>>> i[3].append(6)
>>> j
>>> [1,2,3,[4,5,6]] #Updation of original nested list updated the copy as well
Flat list example using deepcopy
:
>>> import copy
>>> i = [1,2,3]
>>> j = copy.deepcopy(i)
>>> hex(id(i)), hex(id(j))
>>> ('0x102b9b7c8', '0x102971cc8') #Both addresses are different
>>> i.append(4)
>>> j
>>> [1,2,3] #Updation of original list didn't affected copied variable
Nested list example using deepcopy
:
>>> import copy
>>> i = [1,2,3,[4,5]]
>>> j = copy.deepcopy(i)
>>> hex(id(i)), hex(id(j))
>>> ('0x102b9b7c8', '0x102971cc8') #Both addresses are still different
>>> hex(id(i[3])), hex(id(j[3]))
>>> ('0x10296f908', '0x102b9b7c8') #Nested lists have different addresses
>>> i[3].append(6)
>>> j
>>> [1,2,3,[4,5]] #Updation of original nested list didn't affected the copied variable