FPGAs vs Microcontrollers
Designing for an FPGA requires a Hardware Description Language (HDL). HDLs are absolutely nothing at all like C. Whereas a C program is a sequential series of instructions and must contort itself to achieve parallel execution, an HDL describes a concurrent circuit and must contort itself to achieve sequential execution. It is a very different world, and if you try to build a circuit in an FPGA while thinking like a software developer, it will hurt.
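To make the contrast concrete, here is a minimal Python sketch (not HDL, and not tied to any toolchain) of how a clocked circuit updates compared with sequential code. The names `sequential_swap`, `clock_edge`, `a` and `b` are made up for illustration.

```python
def sequential_swap(a, b):
    """C-style semantics: statements execute one after another."""
    a = b      # a's old value is lost here
    b = a      # b just gets back the value it already had
    return a, b


def clock_edge(regs):
    """Hardware-style semantics: every register samples the current
    state, then all of them update together on the clock edge."""
    return {
        "a": regs["b"],   # a's input is wired to b's output
        "b": regs["a"],   # b's input is wired to a's output
    }


if __name__ == "__main__":
    print(sequential_swap(1, 2))          # (2, 2)
    print(clock_edge({"a": 1, "b": 2}))   # {'a': 2, 'b': 1}
```

The sequential version destroys a value, while the clocked version swaps the two registers, because nothing on the right-hand side changes until the edge.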
An MCU is time-limited. In order to accomplish more work, you need more processor cycles. Clocks have very real limits to their frequencies, so it's easy to hit a computational wall. However, an FPGA is space-limited. In order to accomplish more work, you merely add more circuits. If your FPGA isn't big enough, you can buy a bigger one. It's very hard to build a circuit that can't fit in the largest FPGA, and even if you do there are app notes describing how to daisy chain FPGAs together.
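A rough back-of-the-envelope sketch of that wall, in Python. The function names are hypothetical and the clock and sample rates in the usage note are made-up numbers, not figures for any particular part.

```python
def cycles_per_sample(clock_hz, sample_rate_hz):
    """Clock cycles a single sequential processor has available
    to handle each incoming sample (rounded down)."""
    return clock_hz // sample_rate_hz


def parallel_units_needed(work_cycles, budget_cycles):
    """If the work per sample does not fit in the budget, this is how
    many copies of the circuit would have to run side by side
    (ceiling division). Assumes the work splits evenly, which real
    designs rarely manage."""
    return -(-work_cycles // budget_cycles)
```

For example, a 16 MHz processor handling 48 kHz samples has `cycles_per_sample(16_000_000, 48_000)`, i.e. 333 cycles, per sample. If the job needs 1000 cycles, the MCU is stuck, whereas `parallel_units_needed(1000, 333)` says four parallel circuits would cover it under that even-split assumption.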
FPGAs focus way more on parallel execution. Sometimes you have to worry about how long your MCU's ISR takes to service the interrupt, and whether you'll be able to achieve your hard-real-time limits. In an FPGA, however, there are lots of Finite State Machines (FSMs) running all the time. They are like "femto-controllers", like little clouds of control logic. They are all running simultaneously, so there's no worrying about missing an interrupt. You might have an FSM to interface to an ADC, another FSM to interface to a microcontroller's address/data bus, another FSM to stream data to a stereo codec, yet another FSM to buffer the dataflow from the ADC to the codec... You need to use a simulator to make sure that all the FSMs sing in harmony. If any control logic is off by even a single clock cycle (and such mistakes are easy to make) then you will get a cacophony of failure.
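As a toy illustration of what "running simultaneously" means, here is a Python sketch of two FSMs stepped in lockstep. It is nothing like a real HDL simulator; the state names and the functions `adc_next`, `codec_next` and `simulate` are invented for this example.

```python
def adc_next(adc, codec):
    """Next state of the (hypothetical) ADC interface FSM."""
    return {"IDLE": "CONVERT", "CONVERT": "DONE", "DONE": "IDLE"}[adc]


def codec_next(adc, codec):
    """Next state of the (hypothetical) codec streaming FSM.
    It leaves WAIT only on a cycle where the ADC FSM is in DONE."""
    if codec == "WAIT":
        return "SEND" if adc == "DONE" else "WAIT"
    return "WAIT"


def simulate(cycles):
    """Step both FSMs together: compute both next states from the
    current states, then commit them at once (one clock edge)."""
    adc, codec = "IDLE", "WAIT"
    trace = [(adc, codec)]
    for _ in range(cycles):
        # The right-hand side is fully evaluated before either name
        # is rebound, so both FSMs see the same "current" state.
        adc, codec = adc_next(adc, codec), codec_next(adc, codec)
        trace.append((adc, codec))
    return trace


if __name__ == "__main__":
    for step in simulate(4):
        print(step)
```

In this model the codec FSM enters SEND on the cycle right after the ADC FSM was in DONE. If `codec_next` sampled the ADC state one cycle late it would see IDLE and never send, which is the kind of off-by-one-cycle bug a simulator is there to catch.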
FPGAs are a PCB layout designer's wet dream. They are extremely configurable. You can have many different logic interfaces (LVTTL, LVCMOS, LVDS, etc.), with varying voltages and even drive strengths (so you don't need series-termination resistors). The pins are swappable; have you ever seen an MCU address bus where the pins were scattered around the chip? Your PCB designer probably has to drop a bunch of vias just to tie all the signals together correctly. With an FPGA, the PCB designer can run the signals into the chip in pretty much any order that is convenient, and then the design can be back-annotated to the FPGA toolchain.
FPGAs also have lots of nice, fancy toys. One of my favorites is the Digital Clock Manager in Xilinx chips. You feed it one clock signal, and it can derive four more from it using a wide variety of frequency multipliers and dividers, all with pristine 50% duty cycle and 100% in phase...and it can even account for the clock skew that arises from propagation delays external to the chip!
EDIT (reply to addendum):
You can place a "soft core" into an FPGA. You're literally wiring together a microcontroller circuit, or rather you're probably dropping someone else's circuit into your design, like Xilinx's PicoBlaze or MicroBlaze or Altera's Nios. But like the C->VHDL compilers, these cores tend to be a little bloated and slow compared to using an FSM and datapath, or an actual microcontroller. The development tools can also add significant complexity to the design process, which can be a bad thing when FPGAs are already extremely complex chips.
There are also some FPGAs that have "hard cores" embedded in them, like Xilinx's Virtex-4 series, which has a real, dedicated IBM PowerPC with FPGA fabric around it.
EDIT2 (reply to comment):
I think I see now... you're asking about connecting a discrete MCU to an FPGA, i.e. two separate chips. There are good reasons to do this: the FPGAs that have hard cores and the ones that are big enough to support decent soft cores are usually monsters with many hundreds of pins that end up requiring a BGA package, which easily increases the difficulty of designing a PCB by a factor of 10.
C is a lot easier, though, so MCUs definitely have their place working in tandem with an FPGA. Since it's easier to write C, you might write the "brains" or the central algorithm in the MCU, while the FPGA implements sub-algorithms that need to be accelerated. Try to put things that change into the C code, because it's easier to change, and leave the FPGA for more dedicated work that won't change often.
MCU design tools are also easier to use. It takes several minutes for the design tools to build the FPGA bit file, even for somewhat simple designs, whereas even complex MCU programs usually build in a few seconds. There's much, much less to go wrong with the MCU, so it's also easier to debug... I cannot overstate how complex FPGAs can be. You really need to get the datasheet for the one you have, and you should try to read every page of it. I know, it's a few hundred pages... do it anyway.
The best way to connect them is to use an MCU with an external address and data bus. Then you can simply memory map the FPGA circuits into the MCU, and add your own "registers" that each have their own address. Now the FPGA can add custom peripherals, like a 32-bit timer that latches all 4 bytes at once when the first byte is read, so the count can't roll over between 8-bit reads. You can also use it as glue logic to memory map more peripherals from other chips, like a separate ADC.
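The latching timer is easier to see in a model. Below is a small Python sketch of the idea, not real register-access code: the class `LatchingTimer32`, the byte order (offset 0 is the least significant byte and is the "first byte" that triggers the latch) and the helper `read_timer` are all assumptions made for illustration.

```python
class LatchingTimer32:
    """Toy model of a 32-bit free-running counter exposed to an 8-bit
    bus as four byte-wide registers at offsets 0..3 (offset 0 = least
    significant byte). The layout is an assumption for illustration."""

    def __init__(self, count=0):
        self.count = count & 0xFFFFFFFF
        self.shadow = 0

    def tick(self, n=1):
        """Advance the counter by n clock cycles, wrapping at 32 bits."""
        self.count = (self.count + n) & 0xFFFFFFFF

    def read_byte(self, offset):
        """Bus read. Reading offset 0 latches all 32 bits into the
        shadow register; every offset returns a byte of that copy."""
        if offset == 0:
            self.shadow = self.count
        return (self.shadow >> (8 * offset)) & 0xFF


def read_timer(timer, ticks_between_reads=1):
    """What the MCU does: four 8-bit reads, with the counter still
    running in between, reassembled into one 32-bit value."""
    value = 0
    for offset in range(4):
        value |= timer.read_byte(offset) << (8 * offset)
        timer.tick(ticks_between_reads)
    return value


if __name__ == "__main__":
    timer = LatchingTimer32(0x0000FFFF)
    print(hex(read_timer(timer)))   # 0xffff
```

Starting at 0x0000FFFF, the counter rolls over into the third byte while the MCU is still reading, but the MCU gets a consistent 0x0000FFFF because bytes 1 to 3 come from the latched copy. Without the latch, the same four reads would assemble 0x000100FF, a value the counter never held.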
Finally, some MCUs are designed for use with an "external master" like an FPGA. Cypress makes a few USB MCUs that have an 8051 inside, but the intent is for the USB data to be produced/consumed by e.g. an FPGA.
"examples in the real world ... combining FPGAs with microcontrollers?"
In principle, a sufficiently large FPGA alone can do anything that an FPGA plus a microcontroller can do, perhaps by implementing a soft CPU inside the FPGA. In practice, a given level of performance often has lower parts cost and requires less power when implemented with an FPGA plus a separate microcontroller than with FPGAs alone (or MCUs alone). Here are a few of the more famous devices with both FPGAs and microcontrollers onboard:
- The Elphel camera (see the Elphel Project Wiki) has a Xilinx Spartan-3E 1200K-gate FPGA and an ETRAX FS processor running GNU/Linux.
- The TS-7500 has a 5000-LUT Lattice FPGA and a 250 MHz Cavium ARM9 CPU that can run Linux.
- The Balloon board has a Xilinx Spartan FPGA and an ARM CPU.
- Several Teeny weeny Linux SBCs include both an FPGA and a CPU.
- The Armadeus Project wiki documents a few boards with both a Xilinx Spartan-3 FPGA and a 400 MHz ARM9 CPU.
- The Blackfin Handy Board includes both a Xilinx Spartan-3E FPGA and a 600 MHz Analog Devices Blackfin® ADSP-BF537 processor. (It doesn't have an MMU, so it can't run full Linux, but it can run uClinux.)
- The "Minimig" (mini Amiga) includes a Xilinx Spartan-3 FPGA, an M68000 CPU, and a small PIC MCU acting as disk controller.
Often FPGAs get used specifically to do tasks a microcontroller cannot do efficiently, such as highly parallel or low-latency operations, operating in multiple clock domains, or doing custom logic at hardware speeds. As such, the FPGA does the heavy lifting, and the MCU rarely needs to be central to the design; it may be moved to a management position, such as loading the configuration bitstream. An example of this is the PIC or ARM in the Minimig, which implements the storage interface.
A few products blur the lines, however. Some examples:
- Larger FPGAs tend to have hard CPUs built in (larger projects often need them anyhow), just as they have RAM and multiplier blocks
- Some microcontrollers aim at parallel operations (XMOS XS1, Atmel Xmega, GreenArray, Parallax Propeller)
- Some chips are designed as hybrids (Cypress PSoC, Atmel FPSLIC)
Coming from an imperative programming background, designing in hardware is quite an adjustment, but it is one you need to make to gain the advantages of FPGAs. You'll find the experience useful elsewhere too.