Pythonic way to chain Python generator functions to form a pipeline
I sometimes like to use a left fold (called reduce in Python) for this type of situation:
from functools import reduce
def pipeline(*steps):
    # steps[0] is the initial iterable; the remaining steps are generator functions
    return reduce(lambda x, y: y(x), steps)
res = pipeline(range(0, 5), foo1, foo2, foo3)
Or even better:
def compose(*funcs):
    # thread x through each function in order: funcs[-1](...funcs[0](x)...)
    return lambda x: reduce(lambda acc, f: f(acc), funcs, x)
p = compose(foo1, foo2, foo3)
res = p(range(0, 5))
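To make these sketches self-contained, here is one possible set of stage definitions and a quick run; the bodies of foo1, foo2, and foo3 are hypothetical, since the question does not define them:

def foo1(it):
    # hypothetical stage: double each item
    for x in it:
        yield x * 2

def foo2(it):
    # hypothetical stage: keep only items greater than 2
    for x in it:
        if x > 2:
            yield x

def foo3(it):
    # hypothetical stage: add one to each item
    for x in it:
        yield x + 1

p = compose(foo1, foo2, foo3)
print(list(p(range(0, 5))))  # [5, 7, 9]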
I do not think foo3(foo2(foo1(range(0, 5)))) is a Pythonic way to achieve my pipeline goal, especially when the number of stages in the pipeline is large.
There is a fairly trivial, and in my opinion clear, way of chaining generators: assigning the result of each to a variable, where each can have a descriptive name.
range_iter = range(0, 5)
foo1_iter = foo1(range_iter)
foo2_iter = foo2(foo1_iter)
foo3_iter = foo3(foo2_iter)
for i in foo3_iter:
    print(i)
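Note that this chaining stays lazy: each assignment just wraps the previous iterator, and no stage does any work until the final loop pulls items through. A small sketch with a hypothetical noisy stage makes that visible:

def noisy_double(it):
    # the print runs only when an item is actually pulled through
    for x in it:
        print("processing", x)
        yield x * 2

doubled = noisy_double(range(3))  # nothing printed yet
for y in doubled:                 # "processing 0" etc. appear only now
    print(y)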
I prefer this to something that uses a higher-order function such as a reduce, for a few reasons:

- In my real cases, each foo* generator function often needs its own extra parameters, which is tricky when using a reduce (see the sketch after this list).
- In my real cases, the steps in the pipeline are not dynamic at runtime: it seems a bit odd/unexpected (to me) to reach for a pattern that fits a dynamic case better.
- It's a bit inconsistent with how regular functions are typically written, where each is called explicitly and the result of each is passed to the call of the next. Yes, there's a bit of duplication, but I'm happy with "calling a function" being duplicated, since (to me) it's really clear.
- No import is needed: it uses only core language features.
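To illustrate the first point: with explicit chaining, per-step parameters are just ordinary arguments, whereas a reduce-based pipeline needs something like functools.partial to bind them first. The scale and at_least stages here are hypothetical:

from functools import partial, reduce

def scale(it, factor):
    # hypothetical stage: multiply each item by factor
    for x in it:
        yield x * factor

def at_least(it, threshold):
    # hypothetical stage: keep items >= threshold
    for x in it:
        if x >= threshold:
            yield x

# Explicit chaining: extra parameters are passed directly.
scaled = scale(range(0, 5), factor=3)
filtered = at_least(scaled, threshold=6)
print(list(filtered))  # [6, 9, 12]

# reduce-based: each step must first be bound to its parameters.
steps = [partial(scale, factor=3), partial(at_least, threshold=6)]
res = reduce(lambda acc, step: step(acc), steps, range(0, 5))
print(list(res))  # [6, 9, 12]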