How does flattening a nested list using `sum(iterable,[])` work?
This is just a result of how Python interprets addition of lists. From the docs:
sum(iterable[, start])
Sums start and the items of an iterable from left to right and returns the total.
Since `sum` starts by adding the first element of the iterable to the `start` argument, you have:
[] + [1, 2] = [1, 2]
Then it continues adding items from the iterable:
[1, 2] + [3, 4] = [1, 2, 3, 4]
[1, 2, 3, 4] + [5, 6] = [1, 2, 3, 4, 5, 6]
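The steps above can be mimicked with an explicit loop. This is only a rough sketch of the left-to-right accumulation for this input, not CPython's actual implementation; the helper name `sum_like` is made up for illustration.

```python
def sum_like(iterable, start):
    # Hypothetical helper: mirrors the left-to-right accumulation of sum().
    total = start
    for item in iterable:
        # For lists, + builds a brand-new list on every step.
        total = total + item
    return total


nested = [[1, 2], [3, 4], [5, 6]]
flat = sum_like(nested, [])
print(flat)  # [1, 2, 3, 4, 5, 6]
```

Because `total + item` creates a new list each time, the `[]` passed as `start` is never modified.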
`sum([a, b, c], d)` produces `d + a + b + c`.
In your example, `a`, `b`, `c`, and `d` are `[1, 2]`, `[3, 4]`, `[5, 6]`, and `[]`.
`sum([[1, 2], [3, 4], [5, 6]], [])` produces `[] + [1, 2] + [3, 4] + [5, 6]`, which is `[1, 2, 3, 4, 5, 6]` because `+` is concatenation for lists.
This is absurdly inefficient, because every `+` operation involved requires copying all the data from each of its arguments:
In [7]: x = [[i] for i in range(30000)]
In [8]: %timeit sum(x, [])
1 loop, best of 3: 2.06 s per loop
In [9]: %timeit [elem for sublist in x for elem in sublist]
1000 loops, best of 3: 1.91 ms per loop
`sum(x, [])` takes quadratic time, whereas a more efficient implementation takes linear time. Never do `sum(x, [])`.
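As a sketch of a linear-time alternative, `itertools.chain.from_iterable` from the standard library visits each element once. The variable names here are illustrative, and the input matches the timing example above.

```python
from itertools import chain

x = [[i] for i in range(30000)]

# chain.from_iterable walks each sublist once, so no intermediate
# lists are copied along the way.
flat_chain = list(chain.from_iterable(x))

# The nested list comprehension from the timing above gives the same result.
flat_comp = [elem for sublist in x for elem in sublist]
```

Either form avoids the repeated copying that makes `sum(x, [])` quadratic.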
As the `sum(iterable[, start])` documentation says:
Sums `start` and the items of an `iterable` from left to right and returns the total. `start` defaults to 0. The `iterable`’s items are normally numbers, and the start value is not allowed to be a string.
So, in the example you shared:
sum(a,[])
Here, `iterable` is `a` (which is `[[1, 2], [3, 4], [5, 6]]`) and `start` is `[]`. Hence, the result is equivalent to:
[] + [1, 2] + [3, 4] + [5, 6]
# i.e. the list is flattened --> [1, 2, 3, 4, 5, 6]
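The two details in the quoted documentation (the default `start` of 0 and the ban on string `start` values) can be checked with a short sketch. The variable names are illustrative, and only the exception type is shown, since the exact error messages may vary between Python versions.

```python
a = [[1, 2], [3, 4], [5, 6]]

# Without a start value, sum() begins from 0, and 0 + [1, 2] is not defined.
try:
    sum(a)
    default_start_error = None
except TypeError as exc:
    default_start_error = exc

# Passing [] as start makes every step a list concatenation.
flattened = sum(a, [])

# A string start value is rejected outright; str.join is the usual tool instead.
words = ["ab", "cd", "ef"]
try:
    sum(words, "")
    string_start_error = None
except TypeError as exc:
    string_start_error = exc

joined = "".join(words)
```

This is why the explicit `[]` argument is needed for the flattening trick to work at all.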