How to run EMR Cluster Steps concurrently?
Each step is processed concurrently across the cluster, i.e. the work inside a step is distributed over the nodes. So if you have work that can run in parallel, consider putting it all into the same step (a step can contain one or more Hadoop jobs).
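As a rough sketch (assuming boto3; the cluster ID, bucket, and `run_parallel_jobs.sh` wrapper script are hypothetical), one step can kick off several Hadoop jobs that then share the cluster's capacity:

```python
# Sketch: a single EMR step whose wrapper script starts several Hadoop jobs
# in parallel (e.g. launching each `hadoop jar ...` in the background and
# waiting on all of them), so their tasks run concurrently across the cluster.
import boto3

emr = boto3.client("emr", region_name="us-east-1")

response = emr.add_job_flow_steps(
    JobFlowId="j-XXXXXXXXXXXXX",  # your cluster ID
    Steps=[
        {
            "Name": "run-parallel-jobs",     # one step wrapping several jobs
            "ActionOnFailure": "CONTINUE",
            "HadoopJarStep": {
                # script-runner.jar downloads and executes a script from S3;
                # run_parallel_jobs.sh is a placeholder for your own launcher.
                "Jar": "s3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar",
                "Args": ["s3://my-bucket/scripts/run_parallel_jobs.sh"],
            },
        }
    ],
)
print(response["StepIds"])
```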
Typically you use separate steps when you need ALL of the processing in one step to finish before the next step starts. A good example is working with encrypted data: one step to decrypt the data, one step to process it, and a final step to re-encrypt it before persisting it.
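A minimal sketch of that pipeline with boto3 (the cluster ID, bucket, and jar names are placeholders): EMR runs the steps in the order they are submitted, so each stage only begins once the previous one has completed.

```python
# Sketch: decrypt -> process -> re-encrypt as three sequential EMR steps.
import boto3

emr = boto3.client("emr", region_name="us-east-1")

def jar_step(name, jar, args):
    """Build a step definition; CANCEL_AND_WAIT stops later steps on failure."""
    return {
        "Name": name,
        "ActionOnFailure": "CANCEL_AND_WAIT",
        "HadoopJarStep": {"Jar": jar, "Args": args},
    }

emr.add_job_flow_steps(
    JobFlowId="j-XXXXXXXXXXXXX",  # your cluster ID
    Steps=[
        # decrypt.jar / process.jar / encrypt.jar stand in for your own jobs
        jar_step("decrypt-input",  "s3://my-bucket/jars/decrypt.jar",
                 ["s3://my-bucket/raw/", "s3://my-bucket/plain/"]),
        jar_step("process-data",   "s3://my-bucket/jars/process.jar",
                 ["s3://my-bucket/plain/", "s3://my-bucket/results-plain/"]),
        jar_step("encrypt-output", "s3://my-bucket/jars/encrypt.jar",
                 ["s3://my-bucket/results-plain/", "s3://my-bucket/results/"]),
    ],
)
```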