How to run parallel processes and combine their outputs once both have finished
Use wait. For example:
Data1 ... > Data1Res.csv &
Data2 ... > Data2Res.csv &
wait
AnalysisProg
will:
- run the Data1 and Data2 pipes as background jobs
- wait for them both to finish
- run AnalysisProg.
See, e.g., this question.
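If you also need to know whether each background job succeeded, you can save each PID from $! and pass it to wait, which then returns that job's exit status. Here is a minimal sketch of that idea. The subshells with sleep and echo are hypothetical stand-ins for the Data1 and Data2 pipelines, and cat stands in for AnalysisProg; workdir and combined.csv are made-up names for illustration.

```shell
#!/bin/sh
# Scratch directory so the sketch does not touch real files (assumption).
workdir=$(mktemp -d)

# Hypothetical stand-ins for the Data1 and Data2 pipelines.
( sleep 1; echo "a,1" ) > "$workdir/Data1Res.csv" &
pid1=$!
( sleep 1; echo "b,2" ) > "$workdir/Data2Res.csv" &
pid2=$!

# "wait PID" blocks until that job ends and returns its exit status.
wait "$pid1"; status1=$?
wait "$pid2"; status2=$?

if [ "$status1" -eq 0 ] && [ "$status2" -eq 0 ]; then
    # Stand-in for AnalysisProg: combine both result files.
    cat "$workdir/Data1Res.csv" "$workdir/Data2Res.csv" > "$workdir/combined.csv"
else
    echo "a background job failed" >&2
fi
```

Note that for a real pipeline the status reported is that of its last command, unless your shell supports and enables pipefail.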
cxw's answer is no doubt the preferable solution if you only have 2 files. If the 2 files are just examples and in reality you have 10000 files, the '&' solution will not work: starting every job at once will overload your server. For that you need a tool like GNU Parallel, which runs only a limited number of jobs at a time:
ls Data* | parallel 'cat {} | this | that | theother | grep | sed | awk | whatever > {}res.csv'
AnalysisProg -i *res.csv
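If GNU Parallel is not installed, the same "only a few jobs at a time" idea can be sketched with xargs -P, which is a different tool and a GNU/BSD extension rather than POSIX. This is only an illustration under assumptions: the six small input files, the tr command standing in for the real pipeline, and the cap of 2 jobs are all made up here. GNU Parallel's -j option plays the same role as -P.

```shell
#!/bin/sh
# Hypothetical input files standing in for the real Data* files.
workdir=$(mktemp -d)
for i in 1 2 3 4 5 6; do
    echo "row$i" > "$workdir/Data$i"
done

# Run at most 2 jobs at once; tr stands in for the real pipeline.
# Each input name arrives as $1 inside the small sh -c script.
printf '%s\n' "$workdir"/Data* |
    xargs -P 2 -I {} sh -c 'tr a-z A-Z < "$1" > "$1res.csv"' _ {}

# xargs returns only after every job has finished, so the results
# are complete here. sort stands in for AnalysisProg -i *res.csv.
cat "$workdir"/*res.csv | sort > "$workdir/combined.txt"
```

Unlike a loop that appends & to every command, this never has more than 2 pipelines running at the same time, however many input files there are.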
To learn more about GNU Parallel:
- Watch the intro video for a quick introduction: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
- Walk through the tutorial (man parallel_tutorial). Your command line will love you for it.