Why is the serial connection faster than the parallel connection?
The problem is keeping the signals on a parallel bus clean and in sync at the target.
With serial, "all you have to do" is extract the clock and, from that, the data. You can help by creating lots of transitions: 8b/10b, bi-phase or Manchester encoding; there are lots of schemes (yes, this means you are adding even more bits). Yes, absolutely, that one serial interface has to run N times faster than an N-wide parallel bus in order to be "faster", but we reached that point a long time ago.
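To make the "lots of transitions" idea concrete, here is a minimal sketch of Manchester encoding in Python. It assumes the IEEE 802.3 convention (a 1 is sent as low-then-high, a 0 as high-then-low); the function names are made up for illustration. The point is that every bit carries a mid-bit edge the receiver can lock onto, at the cost of doubling the number of symbols on the wire.

```python
def manchester_encode(bits):
    """Encode bits as half-bit levels (assumed IEEE 802.3 convention)."""
    out = []
    for b in bits:
        # 1 -> low then high, 0 -> high then low: always a mid-bit edge
        out.extend((0, 1) if b else (1, 0))
    return out


def manchester_decode(halves):
    """Recover the original bits from pairs of half-bit levels."""
    if len(halves) % 2:
        raise ValueError("expected an even number of half-bit levels")
    bits = []
    for first, second in zip(halves[0::2], halves[1::2]):
        if first == second:
            raise ValueError("missing mid-bit transition")
        bits.append(1 if (first, second) == (0, 1) else 0)
    return bits


def longest_run(levels):
    """Length of the longest stretch with no transition."""
    best = run = 0
    prev = None
    for level in levels:
        run = run + 1 if level == prev else 1
        best = max(best, run)
        prev = level
    return best


raw = [0] * 8                      # worst case for clock recovery: no edges
line = manchester_encode(raw)
print(longest_run(raw), longest_run(line), len(line))
```

A raw run of eight zeros has no edges at all, while the encoded stream never sits at one level for more than two half-bits. 8b/10b plays the same game with 25% overhead instead of 100%.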
Interestingly, we now have parallel serial buses: your PCIe, your Ethernet (40GigE is 4 × 10 gig lanes, 100Gig is 10 × 10 gig, or the new thing coming is 4 × 25 gig lanes). Each of these lanes is an independent serial interface taking advantage of the "serial speed", but the overall transmission of data is split up, load balanced across the separate serial interfaces, then combined as needed on the other side.
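The split-and-recombine idea can be sketched as a toy round-robin byte striper. This is only an illustration: real PCIe and Ethernet stripe encoded symbols or blocks and use alignment markers to deskew the lanes, none of which is modelled here, and `stripe`/`unstripe` are hypothetical names.

```python
def stripe(data, lanes):
    """Deal bytes out round-robin, one independent stream per lane."""
    return [data[i::lanes] for i in range(lanes)]


def unstripe(lane_data):
    """Interleave the per-lane streams back into the original order."""
    lanes = len(lane_data)
    out = bytearray(sum(len(chunk) for chunk in lane_data))
    for i, chunk in enumerate(lane_data):
        out[i::lanes] = chunk      # lane i owns bytes i, i+lanes, ...
    return bytes(out)


payload = b"ABCDEFGHIJ"
per_lane = stripe(payload, 4)
print(per_lane)
print(unstripe(per_lane) == payload)
```

Each lane only has to recover its own clock and data; the lanes do not have to arrive bit-aligned with each other the way the wires of a classic parallel bus do, because the receiver reorders after each lane has been decoded.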
Obviously one serial interface can go no faster than one bit lane of a parallel bus, all other things held constant. The key is that with speed, routing, cables, connectors, etc., keeping the bits parallel and meeting setup and hold times at the far end is the problem. You can easily run N times faster using one serial interface. Then there is the real estate, from pins to PC board to connectors. Recently there has been a movement: instead of moving up from 10 gig Ethernet to 40 gig using 4 × 10 gig lanes, go to 25 gig per lane, so one 25 gig pair, or two 25 gig pairs to get 50 gig, rather than four 10 gig pairs. That costs half-ish the copper or fiber in the cables and elsewhere. That marginal cost in server farms was enough to abandon the traditional path for industry standards and go off and whip one up on the side and roll it out in a hurry.
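The "half-ish the copper" arithmetic is just lane counting. A small sketch, using nominal rates only (encoding overhead and actual line rates are ignored, and `lanes_needed` is a made-up helper):

```python
import math


def lanes_needed(target_gbps, lane_gbps):
    """Number of serial lanes (pairs) needed to reach a nominal rate."""
    return math.ceil(target_gbps / lane_gbps)


for target, lane in [(40, 10), (50, 25), (100, 10), (100, 25)]:
    print(f"{target}G over {lane}G lanes: {lanes_needed(target, lane)} lanes")
```

So 50 gig over 25 gig lanes takes two pairs where 40 gig over 10 gig lanes takes four: more bandwidth on half the pairs.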
PCIe likewise started with one or more serial interfaces with the data load balanced. It still uses serial lanes with the data load balanced and rejoined; the speed per serial interface increases each generation rather than adding more and more serial pairs.
SATA is the serial version of PATA, which is a direct descendant of IDE. It is not that serial was inherently faster, just that it is far easier to sync up with and extract a serial stream than it is to keep N parallel bits in sync from one end to the other. And it remains easier to transmit and extract even if the serial stream is 16, 32, 64, or more times faster per bit lane than the parallel bus.
Why is the serial connection faster than the parallel connection?
You're making wrong assumptions. Take any serial connection. Now place 10 in parallel and call that the parallel version. Which one is faster?
So how come the serial connection is considered the future while the parallel one is a thing of the past?
Says who?
Parallel connections are still everywhere. For a fast short distance connection parallel beats serial anytime. For example the interface between a CPU and its RAM.
For long distance most connections are serial because the cost of multiple wires is higher. But in optical fiber we can use different wavelength signals through the same fiber. You could call that parallel.
The shift is from "parallel on a single clock" to "multiple serial links", such as PCIe, where a card may have anywhere from x1 to x16 "lanes".
There are two factors involved, skew and size.
Adding more conductors makes the cable, its connectors and the receptacles on each device larger and more expensive. Look at how large things like Centronics printer cables and 50-pin SCSI cables were! You're not going to see a phone with a Centronics connector on it. So as the devices get smaller there's pressure for smaller interfaces with fewer wires.
The devices have also got faster with better signal processing. So it's now possible to have much higher bitrates. However, this has a disadvantage for traditional parallel links: skew.
Skew is the difference in arrival times between signals in a group. The traditional parallel link has a single clock for all signals. It assumes that the clock and signals all arrive at roughly the same time. As the signals get faster, the tiny differences in arrival times become more important. This means that a wide parallel connection is limited in speed: you have to go slowly enough that all bits arrive within the same time window and are not overwritten by the next bit coming along.
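A rough way to put numbers on that time window: the bit period has to be at least as long as the skew plus the receiver's setup and hold times, otherwise there is no instant at which every bit is valid. The sketch below uses that deliberately simplified model (no jitter, no clock-to-data margin) and hypothetical timing values.

```python
def max_parallel_rate_hz(skew_s, setup_s, hold_s):
    """Upper bound on the per-wire bit rate of a single-clock parallel bus.

    Simplified model: bit period >= skew + setup + hold.
    """
    window = skew_s + setup_s + hold_s
    if window <= 0:
        raise ValueError("timing budget must be positive")
    return 1.0 / window


# Hypothetical numbers: 200 ps setup, 200 ps hold, varying skew
for skew_ps in (100, 500, 2000):
    rate = max_parallel_rate_hz(skew_ps * 1e-12, 200e-12, 200e-12)
    print(f"skew {skew_ps:>4} ps -> at most {rate / 1e9:.2f} Gbit/s per wire")
```

Under this model, going from 100 ps to 2 ns of skew drops the ceiling from 2 Gbit/s to roughly 0.4 Gbit/s per wire. A serial lane sidesteps the problem because there is no second wire to be skewed against.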