Parallelise nested for-loop in IPython
To parallelize every call, you just need to get a list for each argument. You can use itertools.product
+ zip
to get this:
allzeniths, allazimuths = zip(*itertools.product(zeniths, azimuths))
Then you can use map:
amr = dview.map(f, allzeniths, allazimuths)
To go a bit deeper into the steps, here's an example:
zeniths = range(1,4)
azimuths = range(6,8)
product = list(itertools.product(zeniths, azimuths))
# [(1, 6), (1, 7), (2, 6), (2, 7), (3, 6), (3, 7)]
So we have a "list of pairs", but what we really want is a single list for each argument, i.e. a "pair of lists". This is exactly what the slightly weird zip(*product)
syntax gets us:
allzeniths, allazimuths = zip(*itertools.product(zeniths, azimuths))
print allzeniths
# (1, 1, 2, 2, 3, 3)
print allazimuths
# (6, 7, 6, 7, 6, 7)
Now we just map our function onto those two lists, to parallelize nested for loops:
def f(z,a):
return z*a
view.map(f, allzeniths, allazimuths)
And there's nothing special about there being only two - this method should extend to an arbitrary number of nested loops.
I assume you are using IPython 0.11 or later. First of all define a simple function.
def foo(azimuth, zenith):
# Do various bits of stuff
# Eventually get a result
return result
then use IPython's fine parallel suite to parallelize your problem. first start a controller with 5 engines attached (#CPUs + 1) by starting a cluster in a terminal window (if you installed IPython 0.11 or later this program should be present):
ipcluster start -n 5
In your script connect to the controller and transmit all your tasks. The controller will take care of everything.
from IPython.parallel import Client
c = Client() # here is where the client establishes the connection
lv = c.load_balanced_view() # this object represents the engines (workers)
tasks = []
for azimuth in azimuths:
for zenith in zeniths:
tasks.append(lv.apply(foo, azimuth, zenith))
result = [task.get() for task in tasks] # blocks until all results are back