What happens when an FPGA is "programmed"?
Judging by your other question, you're a Xilinx guy. So I highly suggest getting the data sheet for your Xilinx chip and going to the Functional Description chapter. For the Spartan 3 chip that I use, it's 42 pages of fun reading. It details exactly what components are inside an FPGA - the IOBs, CLBs, slices, LUTs, Block RAM, Multipliers, Digital Clock Manager, Clock Network, Interconnect, and some very basic configuration information. You need to understand this information if you want to know what a "compiled HDL" looks like.
Once you're familiar with your FPGA's architecture, then you can understand this process. First, your HDL design is run through the synthesis engine, which turns your HDL into basically RTL. Then the Mapper processes the results from Synthesis, "mapping" them onto the available pieces of FPGA architecture. Then the Router does Place And Route (PAR), which figures out where those pieces go and how to connect them. Finally, the results from PAR are turned into a BIT file. Typically this BIT file is then transformed in some way so that it can be loaded into a Flash chip, so that the FPGA can be programmed automatically when it powers up.
This bit file describes the entire FPGA program. For instance, the CLBs in a Spartan 3 are composed of slices, which are composed of LUTs, which are just 16-address 1-bit SRAMs. So one thing the BIT file will contain is exactly what data goes into each address of the SRAM. Another thing the BIT file contains is how each input of the LUT is wired to the connection matrix. The BIT file will also contain the initial values that go inside the block RAM. It will describe what is connected to the set and reset pins of each flip flop in each slice. It will describe how the carry chain is connected. It will describe the logic interface for each IOB (LVTTL, LVCMOS, LVDS, etc). It will describe any integrated pull-up or pull-down resistors. Basically, everything.
For Xilinx, the FPGA's memory is cleared when configuration is initiated (i.e. PROG_B is asserted). Once memory is clear, INIT_B goes high to indicate that phase is complete. The BIT file is then loaded, either through JTAG or the Flash chip interface. Once the program is loaded, the Global Set/Reset (GSR) is pulsed, resetting all flip flops to their initial state. The DONE pin then goes high, to indicate configuration is complete. Exactly one clock cycle later, the Global Three-State signal (GTS) is released, allowing outputs to be driven. Exactly one clock cycle later, the Global Write Enable (GWE) is released, allowing the flip flops to begin changing state in response to their inputs. Note that even this final configuration process can be slightly reordered depending on flags that are set in the BIT file.
EDIT:
I should also add that the reason the FPGA program is not permanent is because the logic fabric is composed of volatile memory (e.g. SRAM). So when the FPGA loses power, the program is forgotten. That's why they need e.g. Flash chips as non-volatile storage for the FPGA program, so that it can be loaded whenever the device is powered on.
Compiling the HDL results in a bit pattern which indicates which connections inside the FPGA should be activated. The FPGA doesn't have to interpret the HDL anymore. The bit pattern is programmed into a serial loader Flash/EEPROM, and upon booting this pattern is shifted into the FPGA, making the necessary connections.
The result of the compilation is a bitstream (literally a stream of bits) which is loaded in after power up. This shifts through the FPGA being stored in some memory cells (latches). These cells are connected to various logic entities, multiplexers, look-up tables, RAM blocks, routing matrices and constitute what is called the "configuration". Once the bitstream is loaded, the FPGA begins to operate - the bits in the configuration-latches "tell" each little piece of FPGA how to operate.
EDIT April 24, 2012: The flip-flops I mentioned aren't for the look-up tables or the configuration of them. As @ajs410 said those are in RAM which is even fewer transistors. THe flip-flops are for the storeage of the data out of the LUT, if that storage is enabled.