Why is serial EEPROM preferred over parallel EEPROM?
It is very simple: number of pins and cost of packaging.
EEPROM devices are primarily used to store parametric data or characterization constants for a device. The typical scenario is to write very seldom and to read once each time the host device boots up. For this type of application the relatively slow write times of EEPROM are of little concern, and the time needed to load at most a few kilobytes of data from a serial device (SPI or I2C) does not normally add much to boot time.
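To put rough numbers on that boot-time read, here is a back-of-the-envelope sketch. The clock rates (400 kHz I2C, 10 MHz SPI) and the overhead byte counts are assumptions chosen for illustration, not figures from any particular datasheet, and start/stop conditions and clock stretching are ignored.

```python
def i2c_read_seconds(n_bytes, clock_hz=400_000, overhead_bytes=4):
    # Each I2C byte costs 9 clocks: 8 data bits plus one ACK/NACK bit.
    # overhead_bytes is an assumed allowance for the device address,
    # two word-address bytes and the repeated device address.
    return 9 * (n_bytes + overhead_bytes) / clock_hz


def spi_read_seconds(n_bytes, clock_hz=10_000_000, overhead_bytes=3):
    # SPI has no ACK bit, so each byte costs 8 clocks.
    # overhead_bytes is an assumed allowance for a read opcode
    # plus two address bytes.
    return 8 * (n_bytes + overhead_bytes) / clock_hz


if __name__ == "__main__":
    for size in (256, 2048, 8192):
        print(f"{size:5d} bytes: "
              f"I2C {i2c_read_seconds(size) * 1e3:8.2f} ms, "
              f"SPI {spi_read_seconds(size) * 1e3:8.2f} ms")
```

Under these assumptions a 2 KB read takes roughly 46 ms over I2C and under 2 ms over SPI, which is small next to a typical boot sequence.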
There is another factor that has played into the popularity of serial devices over parallel devices: the migration from older microprocessor units with parallel buses to the now much more prevalent MCUs that have all their program memory and data memory built right on the chip. Often there is no longer a parallel bus option directly available. And in most applications there is very little interest in using up scads of pins to bit-bang a parallel peripheral.
In the early days, wires were cheap and transistors were expensive. These days it's the reverse, which is why almost everything is done serially.
In the early days, chips weren't very sophisticated, and a CPU would power up and read the first thing it found on its memory bus at the starting address, so parallel EEPROMs effectively mimicked the DRAM that was hanging on the bus.
These days, DDR RAM is screaming away at gigahertz on huge wide buses. Making a flash chip that could hang on the same bus would be prohibitively expensive, and fairly pointless when modern CPUs have enough built-in intelligence (thanks to cheap small transistors) to boot from I²C/SPI flash.
With micros, these days the program flash and RAM are usually internal to the device. External storage like EEPROM can hang on an I²C bus, saving I/O pins for other functions whilst maintaining acceptable throughput. The fewer I/O pins you use, the smaller, cheaper and more energy-efficient the design gets. Plus it's far easier to track two wires around a board than two 8/16/32-bit wide buses (address and data), with the associated EMC issues, etc.
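The pin saving is easy to quantify. The sketch below counts the signals a byte-wide parallel memory needs (address lines, data lines, and an assumed three control lines such as chip enable, output enable and write enable) and compares that with the two wires of I2C or the four of a basic SPI connection. The function name and defaults are illustrative assumptions; power and ground pins are left out.

```python
def parallel_signal_count(size_bytes, data_width=8, control_lines=3):
    # Number of addressable words in the array.
    words = size_bytes * 8 // data_width
    # Address lines needed to select any one word (ceil(log2(words))).
    address_lines = (words - 1).bit_length()
    return address_lines + data_width + control_lines


# Signal counts for the serial alternatives (excluding power and ground).
SERIAL_SIGNALS = {
    "I2C": 2,  # SDA, SCL
    "SPI": 4,  # SCK, MOSI, MISO, CS
}

if __name__ == "__main__":
    for size in (2 * 1024, 32 * 1024):
        print(f"{size // 1024} KB parallel: "
              f"{parallel_signal_count(size)} signals")
    for bus, pins in SERIAL_SIGNALS.items():
        print(f"{bus}: {pins} signals")
```

For a 32 KB byte-wide part this gives 15 address + 8 data + 3 control = 26 signals, against 2 for I2C.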
Don't forget there is a "half-way house" called SQI (Serial Quad Interface): a serial interface that moves several bits in parallel.
From a protocol point of view it is just the same as working with a normal serial interface, but instead of just one bit being transferred every clock, 4 bits are transferred at once. Instead of a single data/clock, or din/dout/clock arrangement, it has 4 data pins and one clock. At the same clock rate this gives 4x the throughput of a normal serial interface and doesn't require many more pins. In fact many SPI flash chips can also run in SQI mode without requiring any more than the existing 8 pins they already have: a significant increase in speed without any increase in real estate.
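The 4x figure follows directly from the number of data lines. As a minimal sketch (command and address overhead are ignored, and the 50 MHz clock is just an assumed example value):

```python
def clocks_for_payload(n_bytes, data_lines):
    # Clock cycles needed to shift n_bytes when data_lines bits
    # move on every clock (ceiling division for odd cases).
    bits = n_bytes * 8
    return -(-bits // data_lines)


def payload_rate_bytes_per_s(clock_hz, data_lines):
    # Peak payload rate: data_lines bits per clock, 8 bits per byte.
    return clock_hz * data_lines / 8


if __name__ == "__main__":
    clock_hz = 50_000_000  # assumed example clock
    for name, lines in (("single-bit serial", 1), ("quad (SQI)", 4)):
        print(f"{name}: {clocks_for_payload(4096, lines)} clocks for 4 KB, "
              f"{payload_rate_bytes_per_s(clock_hz, lines) / 1e6:.2f} MB/s peak")
```

Moving 4 KB takes 32768 clocks on one data line and 8192 clocks on four, so the same clock delivers four times the payload.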
SQI is becoming a popular interface for faster loading of programs from external flash chips - not only used for simple microcontrollers, but also now often used for booting the BIOS of PCs, especially laptops, where space is a real concern.