Understand the timing of Shift Register
That Wikipedia link had an incorrect description; it was for a cascaded divide-by-2 counter. I have now fixed up the Wikipedia entry so that it describes a shift register. It may not be the best description (done in a minute), but at least it isn't plain wrong!
My question is: since the clock is shared by all the DFFs, when the rising edge arrives, all 4 DFFs must be in the triggered/transparent state. So what ensures that the data propagates through only 1 stage of DFF rather than 2 or more stages?
Consider the timing requirements of a typical D Flip Flop.
As you can see, there are a number of parameters; of most importance here are setup time, hold time and propagation delay.
The input (at D) must be stable across the period shown (from \$ t_{su}\$ to \$t_h\$).
For this particular part, the minimum hold time required is 3 ns. This is the minimum amount of time the input must remain stable after the clock edge for guaranteed performance (i.e. D is transferred to Q).
Any transition of the input after this is effectively ignored by the device. Provided the clock period is greater than \$t_{su}\$ + \$t_h\$ it will operate properly.
Now let's look at the propagation delays (\$t_{plh}\$ and \$t_{phl}\$).
From the datasheet, these are typically 14 nsec:
As the output change occurs after the hold time requirement has been met, the resulting transition at the next D input has no effect: by then the input is effectively locked out by the internal feedback mechanism.
Note that the maximum rate a shift register can go is \$ \frac{1}{t_{su(min)} + t_{prop(max)}} \$ because, once the previous Q output has settled, the next D input must be stable for at least the setup time before the following clock edge.
Provided the propagation delay is greater than the hold time, the hold time can be ignored when working out the maximum clock rate.
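As a rough illustration of the maximum-rate expression above, here is a minimal sketch. The function name and the two timing values are placeholders chosen for the example, not figures from the datasheet discussed here; substitute the worst-case numbers for your own part.

```python
def max_shift_clock_hz(t_su_min_s: float, t_prop_max_s: float) -> float:
    """Upper bound on the shift clock rate: 1 / (t_su(min) + t_prop(max))."""
    return 1.0 / (t_su_min_s + t_prop_max_s)


# Placeholder timings for illustration only (NOT taken from any datasheet).
T_SU_MIN = 10e-9    # assumed minimum setup time, in seconds
T_PROP_MAX = 40e-9  # assumed worst-case clock-to-Q delay, in seconds

f_max = max_shift_clock_hz(T_SU_MIN, T_PROP_MAX)
print(f"f_max = {f_max / 1e6:.1f} MHz")
```

With these assumed values the bound comes out at about 20 MHz; clock skew and wiring delay, which are ignored here, would lower it further.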
So the bottom line is that provided the propagation delay from D to Q is greater than the required hold time, a single clock event cannot propagate across more than one flip flop.
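The bottom line above can be sketched as a small behavioural model. This is an idealised illustration, not a circuit simulation: the function name `clock_edge` is made up for the example, and the model simply refuses to run when the hold condition is violated, because real behaviour in that case is not guaranteed.

```python
def clock_edge(q, d_in, t_pd_ns, t_h_ns):
    """Return the register outputs after one rising clock edge.

    q       -- list of current Q outputs, first stage first
    d_in    -- bit presented to the D input of the first stage
    t_pd_ns -- clock-to-Q propagation delay of each flip-flop
    t_h_ns  -- required hold time at each D input
    """
    if t_pd_ns <= t_h_ns:
        raise ValueError("hold violation: Q may change inside the hold window")
    # Each stage captures the value its D input had at the clock edge, i.e.
    # the OLD output of the previous stage. The new outputs only appear
    # t_pd later, after every hold window has already closed.
    return [d_in] + q[:-1]


state = [0, 0, 0, 0]
for bit in [1, 0, 0, 0]:
    # 14 ns and 3 ns are the propagation delay and hold time quoted above.
    state = clock_edge(state, bit, t_pd_ns=14, t_h_ns=3)
    print(state)
```

The single 1 moves exactly one stage per clock edge, ending in the last stage after four edges.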
Using your diagram, the clock edge happens at some time \$t_0\$. The output \$Q_1\$ changes only after the propagation delay of the first flip flop. Provided that delay is greater than the input hold time of the second flip flop (it always is, in my experience), the transition at \$Q_1\$ (the D input of the second flip flop) arrives after the second flip flop has already captured its input, so it has no effect for this clock event.
For the timings to work correctly, the hold time on the D input must be smaller than the propagation delay of the flip flop minus the clock uncertainty. As long as this condition holds, the data presented by the previous flip flop will not change until after the next stage has latched it.
For an NXP 74HC74 we see from the datasheet that the hold time is 3 ns worst case and the propagation delay is typically 14 ns, so a clock skew of up to 11 ns will not cause a problem (at 5 V) with these timings. Note, however, that the propagation delay figure is typical, not minimum, so the margin on a particular part may be much tighter. That said, for fairly obvious reasons most D flip flops are designed so that this sort of thing works timing-wise: a particularly fast part will likely also have a shorter hold time requirement.
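The skew margin quoted above is just the propagation delay minus the hold time. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the calculation uses the typical propagation delay exactly as the text does, so a real design check should use the minimum delay if the datasheet gives one.

```python
def max_tolerable_skew_ns(t_pd_ns: float, t_hold_ns: float) -> float:
    """Largest clock skew for which t_hold < t_pd - skew still holds."""
    return t_pd_ns - t_hold_ns


# Figures quoted above for the 74HC74 at 5 V:
# 3 ns worst-case hold time, 14 ns TYPICAL propagation delay.
margin = max_tolerable_skew_ns(t_pd_ns=14, t_hold_ns=3)
print(f"skew margin = {margin} ns")
```

A negative result would mean the hold requirement is violated even with no skew at all.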