Why would an op-amp use BJTs over MOSFETs?
The 741 is an old piece of junk, primarily used to teach basic electronics for cheap. I seem to remember reading somewhere that if every 741 ever made were to be collected, there would be enough to give every person on earth 6 or 8 of them.
Modern op amps fall into several categories.
General Purpose - These op amps are not very fast, have bad non-ideal characteristics (bias currents in the nanoamps), drift, have input impedances in the megaohms, and cost almost nothing. The 741 falls in this category.
FET Inputs - These are a bit faster, have significantly better non-ideal characteristics (bias currents in the picoamps), drift very little, have extremely high input impedances (gigaohms), but may cost a few dollars.
CMOS - CMOS op amps are slow, but have excellent non-ideal characteristics (bias currents in the FEMTOamps), extremely high input impedance (TERAohms), drift about as much as general purpose op amps, and may cost a few dollars. This is the type of op amp that can get its output within millivolts of the rails, but rail voltage is limited.
Chopper Stabilized - This is another form of the CMOS op amp. It drifts very little, and has very low offsets. Take a look at this article for more information.
There are other op amps out there that can handle RF signals or high output currents, but they don't really fall into these categories.
As you can see, each type of op amp has different non-ideal DC characteristics, and input impedance. How much current flows into the op amp inputs depends on the input impedance. For most modern op amps, these are very small currents, and can be considered negligible for the majority of applications. Which type of op amp you use is a design consideration, factoring in speed, cost, temperature range, and any precision concerns.
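To put rough numbers on "negligible", here is a minimal sketch of the DC error you get when the input bias current flows through the source resistance. The bias current values are assumptions chosen only to match the orders of magnitude listed above (nanoamps, picoamps, femtoamps); they are not taken from any datasheet, and the 1 MΩ source resistance is likewise just an example.

```python
# Illustrative bias currents only: the orders of magnitude follow the
# categories above, the exact values are assumptions, not datasheet numbers.
BIAS_CURRENT_A = {
    "general purpose (bipolar)": 50e-9,   # nanoamp range (assumed)
    "FET input": 50e-12,                  # picoamp range (assumed)
    "CMOS": 50e-15,                       # femtoamp range (assumed)
}


def bias_error_volts(bias_current_a, source_resistance_ohm):
    """DC error voltage from bias current flowing through the source resistance (V = I * R)."""
    return bias_current_a * source_resistance_ohm


if __name__ == "__main__":
    r_source = 1e6  # hypothetical 1 megohm source
    for kind, i_bias in BIAS_CURRENT_A.items():
        error_mv = bias_error_volts(i_bias, r_source) * 1e3
        print(f"{kind}: {error_mv:.6f} mV of error across {r_source:.0e} ohm")
```

With a low source resistance the error shrinks in proportion, which is why the bias current of even a general purpose part often doesn't matter.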
Bipolar opamps like the 741 or the LM324 have different tradeoffs than FET opamps. For one thing, they were designed many years ago when FET IC technology was less advanced relative to bipolar IC technology. It's unfair to call the 741 junk; it was something wonderful in its time. Its close derivative, the LM324, is still in volume production today, so obviously many people think it is the right tradeoff for their requirements.
One significant advantage of the LM324 is its price. Often enough you just need an opamp without very stringent requirements. If the 1 MHz gain×bandwidth product, the bias current, and the few mV of offset are all good enough, then everything else is just expensive junk.
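A quick way to check whether a 1 MHz gain×bandwidth product is "good enough" is the usual first-order rule that closed-loop bandwidth is roughly the gain×bandwidth product divided by the closed-loop (noise) gain. The sketch below assumes a single-pole, voltage-feedback opamp; the function name and the example gains are made up for illustration.

```python
def closed_loop_bandwidth_hz(gbw_hz, closed_loop_gain):
    """First-order estimate: bandwidth ~= GBW / gain (single-pole opamp assumed)."""
    if closed_loop_gain < 1:
        raise ValueError("gain is expected to be >= 1 for this estimate")
    return gbw_hz / closed_loop_gain


if __name__ == "__main__":
    gbw = 1e6  # the 1 MHz figure quoted above
    for gain in (1, 10, 100):  # hypothetical closed-loop gains
        bw_khz = closed_loop_bandwidth_hz(gbw, gain) / 1e3
        print(f"gain {gain}: about {bw_khz:.0f} kHz of bandwidth")
```

So at a gain of 100 you are left with about 10 kHz, which is fine for slow sensor signals but not for full-bandwidth audio.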
In general, it's a little easier to get the offset voltage down to a few mV with bipolars for the same chip area. There are also advantages in current drive capability and supply voltage range. FETs of course have the really high input impedance. Nowadays these distinctions are more blurred. You can get FET input opamps with offset voltages well below a mV, but then compare their price to the LM324.
Early FET opamps, like the TL07x and TL08x, had other problems, such as an input common mode range that needs a lot of headroom from both supply rails. Nowadays, FET opamps are easier to make rail to rail for both input and output, but again compare the price of even the cheapest MCPxxxx with the old standby LM324. Also note the supply voltage range the LM324 can operate over. That's a tough trick for most of today's FET opamps.
Everything is a tradeoff.
MOSFETs are too noisy for many precision amplifier applications. If you have a low impedance source, for the lowest noise of any available monolithic amplifier, you need to go to a bipolar amplifier such as the LT1028 which has a white noise spectral density of 1.1nV/sqrt(Hz). (If that's not good enough, a discrete design can do better).
Contrast that with a typical MOSFET-input amplifier such as the MCP601, which is typically 29nV/sqrt(Hz), or about 700 times worse in terms of power.
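The "about 700 times" figure is just the square of the voltage noise density ratio, since noise power goes as voltage squared. The sketch below reproduces that arithmetic and also adds the thermal noise of the source resistance, which shows why the low noise part only pays off with a low impedance source. It deliberately ignores the amplifier's current noise (a simplifying assumption), and the 50 Ω and 100 kΩ source resistances and the 300 K temperature are hypothetical examples.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23


def noise_power_ratio(en_a, en_b):
    """Ratio of noise powers for two voltage noise densities (power ~ voltage squared)."""
    return (en_a / en_b) ** 2


def resistor_noise_density(r_ohm, temp_k=300.0):
    """Thermal (Johnson) noise density of a resistor in V/sqrt(Hz): sqrt(4 k T R)."""
    return math.sqrt(4.0 * BOLTZMANN_J_PER_K * temp_k * r_ohm)


def total_input_noise(en_amp, r_source_ohm, temp_k=300.0):
    """Root-sum-square of amplifier voltage noise and source thermal noise.

    Amplifier current noise is ignored here, which is a simplification.
    """
    return math.hypot(en_amp, resistor_noise_density(r_source_ohm, temp_k))


if __name__ == "__main__":
    en_lt1028 = 1.1e-9   # V/sqrt(Hz), figure quoted above
    en_mcp601 = 29e-9    # V/sqrt(Hz), figure quoted above
    print(f"power ratio: {noise_power_ratio(en_mcp601, en_lt1028):.0f}")
    for r_source in (50.0, 100e3):  # hypothetical source resistances
        for name, en in (("LT1028", en_lt1028), ("MCP601", en_mcp601)):
            total_nv = total_input_noise(en, r_source) * 1e9
            print(f"{name} with {r_source:.0f} ohm source: {total_nv:.1f} nV/sqrt(Hz)")
```

With the 100 kΩ source the resistor's own thermal noise swamps both amplifiers' voltage noise, so the difference between the two parts mostly disappears (and in practice the bipolar part's current noise, ignored here, would make it the worse choice).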
If you are doing audiophile audio processing, the best amplifier in the world is a Texas Instruments (née Burr-Brown) bipolar part. It has a lot of input bias current, but very little distortion.
MOSFET amplifiers are also seldom capable of working with higher supply voltages, such as +/-15V (another frequent requirement of precision instrumentation), and if they are, they tend to cost an arm and a leg. I think that's mostly because they have to be made on a special high-voltage CMOS process line and not mixed with digital stuff.
The 741 was designed in the late 1960s (it was introduced in 1968), so almost 50 years ago. It was somewhat of an improvement over even earlier op-amps (such as the uA709), but it's pretty long in the tooth. Dual versions such as the venerable JRC 4558 have been used in audio applications for decades. As Olin points out, the LM324 is similar (the output stage has significant differences, in part to make it "single supply"), but costs only a penny or two per amplifier in quantity.
Aside from the LM324, I don't think any other op-amp has achieved as wide use as the 741 (maybe some of the JFET amplifiers come close). The market is more balkanized, with many different choices for the designer, each with its own advantages and disadvantages. Vive la différence!