Using a transistor as a switch: why is the load always on the collector?
A simple reason to have the load on the collector is that it keeps the base current independent of the load. That makes it much easier to reliably keep the transistor saturated.
If the load is on the emitter, then the base current depends on the load. If the load is an LED, then the voltage you have to apply to the transistor base to reach the needed current goes up by the forward voltage of the LED.
If the load is a motor and it is connected to the emitter, then the base current depends on the motor, and will vary all over the place as the motor turns.
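To put rough numbers on the LED case, here is a minimal Python sketch comparing the base voltage needed for the two load placements. The 0.7 V base-emitter drop, the 2.0 V LED forward voltage, and the function name are illustrative assumptions, not values from the question.

```python
V_BE = 0.7  # assumed base-emitter drop of a conducting silicon BJT, in volts


def base_voltage_needed(load_drop_on_emitter: float, v_be: float = V_BE) -> float:
    """Voltage required at the base (relative to ground) to turn the transistor on.

    load_drop_on_emitter is the voltage across whatever sits between the
    emitter and ground: 0 for a grounded emitter, or the LED forward voltage
    when the LED is on the emitter side.
    """
    return v_be + load_drop_on_emitter


collector_load = base_voltage_needed(0.0)  # load on collector, emitter grounded
emitter_load = base_voltage_needed(2.0)    # assumed 2.0 V LED on the emitter

print(f"load on collector: base needs about {collector_load:.1f} V")
print(f"load on emitter:   base needs about {emitter_load:.1f} V")
```

With the load on the collector, the required base voltage does not change when the load does; with the load on the emitter, it moves with whatever the load drops.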
It is not necessary to use a grounded emitter, but consider the alternative:
[Schematic created using CircuitLab]
A transistor used as a switch (in saturation) will typically have a collector-emitter voltage of about 0.2 volts. Since the base-emitter voltage will be about 0.7 volts, the base has to sit about 0.5 volts above the collector, so Vs must be at least 0.5 volts above Vcc, plus whatever voltage is required across R2 to get the base current up to the level required. And that base current will be significant. Regardless of its "ordinary" (linear-region) gain, an NPN transistor in saturation will display a much lower gain, with the typical rule of thumb being to design for a gain of 10 to ensure low Vce. So the circuit as shown cannot be used without a second, higher-voltage power supply, which is not what you'd call convenient.
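The arithmetic above can be sketched in a few lines of Python. The 5 V supply, 100 mA load current, and 100 ohm value for R2 are assumed purely for illustration, and the function name is hypothetical; only the 0.2 V, 0.7 V, and gain-of-10 figures come from the text.

```python
def required_drive_voltage(v_cc, i_load, r2, v_ce_sat=0.2, v_be=0.7, forced_gain=10):
    """Estimate the drive voltage Vs needed to saturate an NPN with the load on the emitter.

    Assumes the collector is tied to v_cc and the load sits between emitter
    and ground. The load current is treated as roughly equal to the collector
    current, which slightly overestimates the base current.
    """
    v_emitter = v_cc - v_ce_sat      # emitter sits one Vce(sat) below the supply
    v_base = v_emitter + v_be        # base sits one Vbe above the emitter
    i_base = i_load / forced_gain    # rule-of-thumb base current for saturation
    return v_base + i_base * r2      # add the drop across the base resistor R2


# Assumed example values: 5 V supply, 100 mA load, 100 ohm base resistor.
vs = required_drive_voltage(v_cc=5.0, i_load=0.1, r2=100.0)
print(f"Vs must be about {vs:.2f} V, i.e. {vs - 5.0:.2f} V above Vcc")
```

Whatever values are plugged in, the result is always above Vcc, which is the inconvenience being described.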
This, in turn, answers your third question. Since the transistor will be (by normal, linear standards) grossly overdriven, gain variations among transistors will typically have no obvious effect. In the circuit shown, even a 50% increase in the transistor's saturation voltage, from 0.2 volts to 0.3 volts, only drops the load voltage from 4.8 to 4.7 volts, and for displays, LEDs and the like this will be unnoticeable.
As to question 2, the answer is definitely yes. In many respects FETs and MOSFETs are easier to drive, since they require very little gate current (except during transitions). In fact, CMOS is the dominant technology for microprocessors and graphics chips, and high-end CPUs and graphics ICs nowadays run between 1 and 2 billion transistors per chip. Trying to do this with BJTs would simply be impossible due to the current requirements.
Not always. There is a circuit called the "emitter follower", in which the load is on the emitter. It doesn't amplify voltage, but it does amplify the input current.
Yes, FETs are used for switching as well: n-channel for a low-side switch, and p-channel for a high-side switch.
If you drive a BJT into saturation, differences in current gain do not matter, as long as you supply enough base current to keep the transistor saturated at the lowest gain the manufacturer specifies.
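As a rough sketch of that sizing rule for a grounded-emitter switch, the base resistor can be chosen from the minimum specified gain. The 5 V drive, 20 mA collector current, and minimum gain of 50 are made-up example numbers, and both function names are hypothetical.

```python
def max_base_resistor(v_drive, i_collector, min_gain, v_be=0.7):
    """Largest base resistor that still supplies i_collector / min_gain of base current.

    Assumes a grounded emitter, so the resistor sees v_drive minus one Vbe.
    In practice a smaller resistor is used to leave some margin.
    """
    i_base_min = i_collector / min_gain
    return (v_drive - v_be) / i_base_min


def stays_saturated(gain, r_base, v_drive, i_collector, v_be=0.7):
    """True if this gain can support the load current with the chosen resistor."""
    i_base = (v_drive - v_be) / r_base
    return gain * i_base >= i_collector


# Assumed example: 5 V logic drive, 20 mA load, datasheet minimum gain of 50.
r_base = max_base_resistor(v_drive=5.0, i_collector=0.02, min_gain=50)
print(f"base resistor must be no larger than about {r_base:.0f} ohms")

# Any part with more gain than the minimum stays saturated with the same resistor.
for gain in (100, 200, 400):
    print(gain, stays_saturated(gain, r_base, v_drive=5.0, i_collector=0.02))
```

Once the resistor is sized for the worst-case part, every higher-gain part is simply overdriven a little more, which is why the spread in gain stops mattering.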
If you drive a 7-segment LED display, you don't control the current by controlling the transistor. You control the current (and so the brightness) with a calculated current-limiting resistor and pulse-width modulation of saturated switches. This approach takes the transistor's variability out of the picture.
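A minimal Python sketch of that calculation follows. The 5 V supply, 2.0 V segment forward voltage, 20 mA peak current, and 25% duty cycle are assumptions chosen for illustration, and the function names are hypothetical.

```python
def led_resistor(v_supply, v_led, i_led, v_ce_sat=0.2):
    """Current-limiting resistor for one LED segment switched by a saturated BJT.

    The resistor drops whatever is left of the supply after the LED forward
    voltage and the transistor's saturation voltage.
    """
    return (v_supply - v_led - v_ce_sat) / i_led


def average_current(i_peak, duty_cycle):
    """Average LED current when the saturated switch is pulse-width modulated."""
    return i_peak * duty_cycle


# Assumed example values: 5 V supply, 2.0 V segment, 20 mA peak, 25% duty cycle.
r = led_resistor(v_supply=5.0, v_led=2.0, i_led=0.02)
i_avg = average_current(i_peak=0.02, duty_cycle=0.25)

print(f"resistor: about {r:.0f} ohms")
print(f"average current at 25% duty: {i_avg * 1000:.1f} mA")
```

The transistor's gain does not appear anywhere in the calculation, and its saturation voltage enters only as a small, nearly fixed 0.2 V term.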