What's the difference between Petri Nets and Finite State Machines?
A standard finite state machine has only a single current state, whereas in a Petri net multiple locations (usually called places, and more or less comparable with states in a finite state machine) can each contain one or more tokens. A finite state machine is single-threaded, while a Petri net is concurrent.
In a finite state machine the active state changes in response to an event. In a Petri net a transition can be executed (it is enabled to fire) as soon as all of its input locations contain at least one token.
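To make the two update rules concrete, here is a minimal sketch in Python. The state names, the event names, and the `assemble` transition are made up for illustration, and every arc is assumed to carry exactly one token.

```python
from collections import Counter

# FSM: one current state, changed by events via a lookup table.
# (The states and events here are hypothetical.)
FSM_TABLE = {("idle", "start"): "running", ("running", "stop"): "idle"}

def fsm_step(state, event):
    """Return the next state; stay in the same state if the event is not handled."""
    return FSM_TABLE.get((state, event), state)

# Petri net: a marking maps each place to a token count.
# Each transition lists its input places and its output places.
TRANSITIONS = {"assemble": (["frame", "wheels"], ["bike"])}

def enabled(marking, name):
    """A transition is enabled when every input place holds at least one token."""
    inputs, _ = TRANSITIONS[name]
    return all(marking[place] >= 1 for place in inputs)

def fire(marking, name):
    """Consume one token from each input place, produce one in each output place."""
    if not enabled(marking, name):
        raise ValueError(f"transition {name!r} is not enabled")
    inputs, outputs = TRANSITIONS[name]
    new = Counter(marking)
    new.subtract(inputs)
    new.update(outputs)
    return +new  # unary plus drops places whose count fell to zero

marking = Counter({"frame": 1, "wheels": 2})
after = fire(marking, "assemble")
```

Note that `fsm_step` needs an event as input, while `fire` only looks at the marking: whether a Petri net transition may run is decided by the tokens alone.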
A finite state machine can be considered as a special case of a Petri net.
In general I would recommend using a finite state machine if your process, or the part you wish to represent, is single-threaded: fellow software engineers are probably more familiar with finite state machines, and there are more tools to convert a finite state machine to an implementation.
Use a Petri net only when you need the concurrency or extra expressivity, when you are modeling a factory plant where semi-finished goods are transformed into products, or when your audience is more familiar with this image.
Perhaps Petri nets can also be used to model and visualize running, massively concurrent systems such as microservice architectures, Azure Service Fabric Reliable Services and Reliable Actors, services running on Kubernetes, Azure Functions, and AWS Lambda.
In addition, there is more theoretical research about, and using, Petri nets than there is about finite state machines (note that, as I said earlier, finite state machines are reducible to Petri nets).
In State Machines, the state is global. Given two states, all you can say is "these states are different". In Petri Nets, the state is structured by places. The state is a marking, which says how many tokens are in each place. Given two markings, you can compare them and say "they are the same in places X,Y,Z but differ in places U,V,W".
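As a small illustration of comparing markings place by place, the sketch below represents a marking as a plain dictionary from place name to token count. The function name `compare_markings` and the token counts are my own choices for the example.

```python
def compare_markings(m1, m2):
    """Split the places of two markings into (same, different)."""
    places = sorted(set(m1) | set(m2))
    same = [p for p in places if m1.get(p, 0) == m2.get(p, 0)]
    different = [p for p in places if m1.get(p, 0) != m2.get(p, 0)]
    return same, different

a = {"X": 1, "Y": 0, "Z": 2, "U": 1, "V": 0, "W": 3}
b = {"X": 1, "Y": 0, "Z": 2, "U": 0, "V": 1, "W": 1}
same, different = compare_markings(a, b)
```

With opaque FSM state labels there is nothing comparable to do beyond testing the two labels for equality.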
When defining an FSM, you have to look at each state individually and determine the possible transitions to other states. Each transition in a Petri Net represents a whole group of transitions in the underlying reachability graph. For example, a Petri Net transition might say: From every marking that has a token in P1 and a token in P2, this model can reach a marking that has one token less in P1, and one token less in P2, but one token more in P3. If the reachability graph has 8, or 800, markings with that property, the single Petri Net transition represents 8, or 800, transitions in the reachability graph.
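The following sketch shows this effect on a toy net. It assumes a transition `t` that moves a token from each of P1 and P2 into P3, plus an unrelated transition `u` (from A to B) that I added only so that there are more markings to count. It builds the reachability graph by breadth-first search and then counts how many edges each Petri Net transition contributes.

```python
from collections import deque

# Each transition: (tokens consumed per place, tokens produced per place).
TRANSITIONS = {
    "t": ({"P1": 1, "P2": 1}, {"P3": 1}),
    "u": ({"A": 1}, {"B": 1}),
}

def freeze(marking):
    """Hashable form of a marking; places with zero tokens are left out."""
    return frozenset((p, n) for p, n in marking.items() if n)

def successors(marking):
    """Yield (transition name, next marking) for every enabled transition."""
    for name, (pre, post) in TRANSITIONS.items():
        if all(marking.get(p, 0) >= n for p, n in pre.items()):
            new = dict(marking)
            for p, n in pre.items():
                new[p] -= n
            for p, n in post.items():
                new[p] = new.get(p, 0) + n
            yield name, new

def reachability_graph(initial):
    """Breadth-first exploration of all markings reachable from `initial`."""
    seen, edges, queue = {freeze(initial)}, [], deque([initial])
    while queue:
        marking = queue.popleft()
        for name, nxt in successors(marking):
            edges.append((freeze(marking), name, freeze(nxt)))
            if freeze(nxt) not in seen:
                seen.add(freeze(nxt))
                queue.append(nxt)
    return seen, edges

markings, edges = reachability_graph({"P1": 2, "P2": 2, "A": 2})
edges_for_t = sum(1 for _, name, _ in edges if name == "t")
```

In this small net the single transition `t` accounts for several edges of the graph, one for every reachable marking that has tokens in both P1 and P2.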
In Petri Net models, you can create transition invariants. Those are cycles in the reachability graph. Then you can put more tokens into the initial marking of the model, and the number of states in the reachability graph explodes. Its structure is still given by the same cycles as in a model with fewer tokens though, and the Petri Net model remains understandable. For example, think of a Client/Server system. You have places for the Clients, places for the Servers, places for the messages flowing back and forth. Then you just put in tokens for the numbers of Clients and Servers you want to model. They are easily changed.
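A rough sketch of such a Client/Server model is given below. The place and transition names are invented for the example, and the net is deliberately simplistic (one request per client, one request per server at a time). The net itself never changes; only the token counts passed to `count_markings` do.

```python
from collections import deque

# Each transition: (tokens consumed per place, tokens produced per place).
# send -> accept -> reply -> receive is one cycle (a transition invariant).
NET = {
    "send":    ({"client_idle": 1}, {"client_waiting": 1, "request": 1}),
    "accept":  ({"request": 1, "server_idle": 1}, {"server_busy": 1}),
    "reply":   ({"server_busy": 1}, {"server_idle": 1, "response": 1}),
    "receive": ({"response": 1, "client_waiting": 1}, {"client_idle": 1}),
}

def freeze(marking):
    """Hashable form of a marking; places with zero tokens are left out."""
    return frozenset((p, n) for p, n in marking.items() if n)

def fire(marking, pre, post):
    """Return the marking reached by consuming `pre` and producing `post`."""
    new = dict(marking)
    for p, n in pre.items():
        new[p] -= n
    for p, n in post.items():
        new[p] = new.get(p, 0) + n
    return new

def count_markings(clients, servers):
    """Count the reachable markings for the given numbers of tokens."""
    start = {"client_idle": clients, "server_idle": servers}
    seen, queue = {freeze(start)}, deque([start])
    while queue:
        marking = queue.popleft()
        for pre, post in NET.values():
            if all(marking.get(p, 0) >= n for p, n in pre.items()):
                nxt = fire(marking, pre, post)
                if freeze(nxt) not in seen:
                    seen.add(freeze(nxt))
                    queue.append(nxt)
    return len(seen)

sizes = [count_markings(clients, 1) for clients in (1, 2, 3)]
```

The values in `sizes` grow as clients are added, while `NET` stays at four transitions and six places.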
As for when to use what, I agree with Kasper van den Berg.
- If you have a problem that's small enough to be handled with an FSM, then use an FSM. Maybe up to two dozen states?
- If you have a problem that naturally maps to an FSM, then use an FSM. You'll probably use an algorithm to construct the FSM in such cases. For example, parsing input with regular expressions. (Btw, many regular expression libraries have extensions that require at least a stack machine for processing.)
- If you need to create a model for distinguishable subsystems that interact with each other, use Petri Nets. Then you can have a set of places for each subsystem, whereas an FSM would require you to create a new state for every possible combination of every substate in each subsystem.
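The last point can be checked with a quick count. The three subsystems and their substates below are hypothetical. The sketch compares the number of combined states a single FSM would need with the number of places a Petri Net needs when each subsystem gets its own set of places.

```python
from itertools import product

# Hypothetical subsystems and their substates.
subsystems = {
    "door":   ["open", "closed", "locked"],
    "heater": ["off", "warming", "hot"],
    "alarm":  ["armed", "disarmed"],
}

# One FSM for the whole system: one state per combination of substates.
fsm_states = list(product(*subsystems.values()))

# Petri Net: one place per substate of each subsystem.
places = [f"{name}.{state}" for name, states in subsystems.items() for state in states]

summary = (len(fsm_states), len(places))
```

The FSM count is the product of the subsystem sizes, while the place count is their sum, so the gap widens with every subsystem you add.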