sequence detection, why use SM?
Just because you can find a different, and what seems to you easier, way to detect a toy sequence doesn't mean that state machines are dead, or that you shouldn't do the exercise you've been set. Celebrate your creativity; it's good to think of alternatives. But work on the state machine solution as well.
It's frequently not clear to students that any given exercise is merely scratching the surface of a method. To be able to use a method well, we often have to do exercises that seem trivial, or that would be 'far simpler' if done in some other, usually later-taught, way.
State machines can do much more than just detect sequences, and often faster and at lower power than FPGAs.
Due disclosure: when I was a physics undergrad, our year-long lab exercise was to build a fairly comprehensive sine/square signal generator. It was suggested to us that we use emitter-coupled logic (ECL) to implement the logic in the squarer. I went 'nah, ECL is old hat, I am sure that at 20MHz I can squeeze the speed out of simpler common-emitter stages by reducing voltage swings and impedances.' And I could. Years later, I found myself learning ECL anyway to design a 3GHz ASIC, when it was the only viable route.
You're right, you can implement any sequence detector that way.
But that means that if your sequence is \$N\$ bits long, you'll be doing
- \$N\$ shifts
- \$N\$ 1-bit comparisons
- a logical AND of \$N\$ comparisons

every time step. The first point means you need \$N\$ flip-flops; the second and third mean you need \$N\$ gates for the reference comparison and an \$N\$-input AND. That's a lot of resources for sequences a bit longer than your cute little 4 bits. Worse still, assuming you only have 2-input ANDs, an \$N\$-input AND takes \$N-1\$ 2-input AND gates in \$\lceil \log_2 N \rceil\$ combinatorial layers.
With the FSM, you'll need only two comparisons per time step, and a single layer of combinatorial depth.
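To make the resource count for the shift-and-compare approach concrete, here is a minimal Python sketch of it. It is a software model under stated assumptions, not a hardware description: the names `shift_register_detect` and `and_tree_cost` are made up for illustration, the register is assumed to reset to all zeros, and the AND is assumed to be a balanced tree of 2-input gates.

```python
from math import ceil, log2

def shift_register_detect(bits, pattern):
    """Return the indices at which the last len(pattern) bits equal pattern."""
    n = len(pattern)
    window = [0] * n                      # models N flip-flops, reset to 0
    hits = []
    for i, b in enumerate(bits):
        window = window[1:] + [b]         # N shifts per time step
        # N one-bit comparisons against the reference pattern
        matches = [w == p for w, p in zip(window, pattern)]
        # N-input AND; the i >= n - 1 guard ignores the reset contents
        if i >= n - 1 and all(matches):
            hits.append(i)
    return hits

def and_tree_cost(n):
    """(2-input AND gates, logic depth) of a balanced n-input AND, n >= 1."""
    return n - 1, ceil(log2(n))

print(shift_register_detect([1, 0, 1, 1, 0, 1, 1], [1, 0, 1, 1]))  # [3, 6]
print(and_tree_cost(4))   # (3, 2)
print(and_tree_cost(64))  # (63, 6)
```

The per-step work in the loop grows linearly with the pattern length, which is the point being made above: every extra pattern bit costs another flip-flop, another comparator, and another AND gate.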
The other answers are correct, but what they haven't mentioned is that your solution is a state machine. It has one state for every possible combination of the most recent \$N\$ bits, which means there are a total of \$2^N\$ states.
Explicitly constructing a state machine to match only the particular sequence of interest results in \$N\$ states, which can be encoded in \$\lceil \log_2 N \rceil\$ bits. For larger values of \$N\$ this is a pretty dramatic reduction.
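As a rough illustration of the \$N\$-state machine, the following Python sketch builds a Mealy-style detector for one specific pattern. Everything here is an assumption made for the example: the function names are hypothetical, state \$k\$ is taken to mean "the last \$k\$ input bits match the first \$k\$ pattern bits", overlapping matches are allowed, and the fallback transitions are computed by brute force rather than by any particular textbook procedure.

```python
from math import ceil, log2

def build_mealy_fsm(pattern):
    """Return (next_state, output) tables indexed as table[state][bit]."""
    pattern = list(pattern)
    n = len(pattern)
    next_state, output = [], []
    for state in range(n):                # exactly N states: 0 .. N-1
        ns_row, out_row = [], []
        for bit in (0, 1):
            seen = pattern[:state] + [bit]
            # Output fires on the transition that completes the pattern.
            out_row.append(int(seen == pattern))
            # Next state: longest suffix of `seen` (shorter than N)
            # that is also a prefix of the pattern.
            k = min(len(seen), n - 1)
            while k > 0 and seen[len(seen) - k:] != pattern[:k]:
                k -= 1
            ns_row.append(k)
        next_state.append(ns_row)
        output.append(out_row)
    return next_state, output

def fsm_detect(bits, pattern):
    """Return the indices at which the pattern has just been completed."""
    next_state, output = build_mealy_fsm(pattern)
    state, hits = 0, []
    for i, b in enumerate(bits):
        if output[state][b]:
            hits.append(i)
        state = next_state[state][b]
    return hits

def state_counts(n):
    """(states of the N-bit window view, states of the explicit FSM, FSM state bits)."""
    return 2 ** n, n, ceil(log2(n))

print(fsm_detect([1, 0, 1, 1, 0, 1, 1], [1, 0, 1, 1]))  # [3, 6]
print(state_counts(16))                                  # (65536, 16, 4)
```

For a 16-bit pattern the window view has 65536 states held in 16 flip-flops, while the explicit machine has 16 states that fit in 4 flip-flops, at the price of working out the transition table once at design time.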