Snakemake using a rule in a loop
I think this is a nice opportunity to use recursion. Rather than explicitly including conditionals for every iteration, write a single rule that transitions from iteration (n-1) to n. So, something along these lines:

SAMPLES = ["SampleA", "SampleB"]

rule all:
    input:
        expand("loop3/{sample}.txt", sample=SAMPLES)

def recurse_sample(wcs):
    n = int(wcs.n)
    if n == 1:
        return "test/%s.txt" % wcs.sample
    elif n > 1:
        return "loop%d/%s.txt" % (n-1, wcs.sample)
    else:
        raise ValueError("loop numbers must be 1 or greater: received %s" % wcs.n)

rule loop_n:
    input: recurse_sample
    output: "loop{n}/{sample}.txt"
    wildcard_constraints:
        sample="[^/]+",
        n="[0-9]+"
    shell:
        """
        awk -v loop='loop{wildcards.n}' '{{print $0, loop}}' {input} > {output}
        """

As @RussHyde said, you need to be proactive about ensuring no infinite loops are triggered. To this end, we ensure all cases are covered in recurse_sample and use wildcard_constraints to make sure the matching is precise.
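
To see why the recursion terminates, here is a minimal plain-Python sketch (run outside Snakemake) that follows the chain of inputs for one target. It reuses recurse_sample from above; the regular expression only approximates how the output pattern and its wildcard constraints are matched, and trace_chain and the SimpleNamespace stand-in for the wildcards object are hypothetical helpers for illustration.

```python
import re
from types import SimpleNamespace

# Approximation of the "loop{n}/{sample}.txt" output pattern with the
# wildcard constraints from the rule substituted in (an assumption about
# matching, not Snakemake's actual internals).
LOOP_PATTERN = re.compile(r"loop(?P<n>[0-9]+)/(?P<sample>[^/]+)\.txt")


def recurse_sample(wcs):
    # Same input function as in the Snakefile above.
    n = int(wcs.n)
    if n == 1:
        return "test/%s.txt" % wcs.sample
    elif n > 1:
        return "loop%d/%s.txt" % (n - 1, wcs.sample)
    else:
        raise ValueError("loop numbers must be 1 or greater: received %s" % wcs.n)


def trace_chain(target):
    """Hypothetical helper: list the files needed, from target back to the source."""
    chain = [target]
    match = LOOP_PATTERN.fullmatch(target)
    while match:
        # Stand-in for the wildcards object passed to an input function.
        wcs = SimpleNamespace(**match.groupdict())
        target = recurse_sample(wcs)
        chain.append(target)
        match = LOOP_PATTERN.fullmatch(target)
    return chain


print(trace_chain("loop3/SampleA.txt"))
```

Each step lowers n by one, and the chain stops at test/SampleA.txt because that path no longer matches the loop pattern. A request for loop0/... raises the ValueError instead of recursing further.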
My understanding is that your rules are converted to Python code before they are run, and that all the raw Python code present in your Snakefile is run sequentially during this process. Think of it as your Snakemake rules being evaluated as Python functions.
But there's a constraint that any rule can only be evaluated to a function once.
You can have if/else expressions and differentially evaluate a rule (once) based on config values etc, but you can't evaluate a rule multiple times.
I'm not really sure how to rewrite your Snakefile to achieve what you want. Is there a real example that you could give where looping constructs appear to be required?
--- Edit
For a fixed number of iterations, it may be possible to use an input function to run the rule several times. (I would caution against doing this, though; be extremely careful to disallow infinite loops.)

SAMPLES = ["SampleA", "SampleB"]

rule all:
    input:
        # Output of the final loop
        expand("loop3/{sample}.txt", sample=SAMPLES)

def looper_input(wildcards):
    # could be written more cleanly with a dictionary
    if wildcards["prefix"] == "loop0":
        source = "test/{}.txt".format(wildcards["sample"])
    elif wildcards["prefix"] == "loop1":
        source = "loop0/{}.txt".format(wildcards["sample"])
    ...
    return source

rule looper:
    input:
        looper_input
    output:
        "{prefix}/{sample}.txt"
    params:
        # pass the wildcard value through; a bare `prefix` is undefined here
        add=lambda wildcards: wildcards.prefix
    shell:
        # quote the value so awk prints it as a string, not as an unset variable
        "awk '{{print $0, \"{params.add}\"}}' {input} > {output}"