Is it faster to have four times 2 GB or two times 4 GB of RAM with a dual-channel mainboard?
Four sticks put more strain on the memory controller and motherboard chipset. It would take slightly longer for the CPU to write and retrieve data with four sticks than with two.
For this reason, 2 x 4 GB would be faster than 4 x 2 GB.
EDIT - there is a much better technical explanation on The Hyphenated Site that supports my answer, although it discusses smaller sticks:
It's better to use two 2 GB modules -- not for any appreciable speed difference (although there may be a small advantage -- more in a bit) -- but for a more reliable memory subsystem.
Most desktop systems use unbuffered RAM modules -- this results in very large loads on the address and data buses when you have more than two modules installed, and can significantly degrade the signalling on these buses. The memory subsystem "sees" one load per memory chip -- so with two modules installed, that's up to 32 loads (with double-sided modules) ... and with four modules installed that's as many as 64 electrical loads on the bus. Some systems automatically adjust for this higher load by increasing the voltage a small amount; by reducing the clock frequency of the memory (thus slightly slowing it down); or by adding a cycle to the SPD's latency setting (again, slightly slowing it down). These adjustments help keep the memory subsystem reliable -- but mean that 4 x 1 GB modules would be slightly slower than 2 x 2 GB modules on these systems. But regardless of whether there are any timing differences, the memory will definitely be more reliable with only two modules.
Credit to garycase at The Hyphenated Site for this answer.
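For anyone who wants to check the load figures in the quote, here is a minimal sketch of the arithmetic. It assumes 8 chips per side of a module, which is my own assumption; the quote only gives the totals (32 and 64), and that chip count is simply what makes them work out. The function name is made up for illustration.

```python
# Assumption: 8 memory chips per side of a module. The quoted answer does not
# state this; it is the value consistent with its "32 loads" and "64 loads".
CHIPS_PER_SIDE = 8


def bus_loads(modules, sides=2, chips_per_side=CHIPS_PER_SIDE):
    """Electrical loads on the bus, counting one load per memory chip."""
    return modules * sides * chips_per_side


if __name__ == "__main__":
    for modules in (2, 4):
        print(f"{modules} double-sided modules: up to {bus_loads(modules)} loads")
```

Doubling the module count doubles the number of loads the unbuffered bus has to drive, which is the reason the quote gives for relaxed timings or lower clocks with four sticks.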
2x4 GB will be slightly faster in benchmarks, because with four sticks the motherboard has to relax timings slightly to handle two DIMMs per channel, but in real-world use you'll never notice a difference. It can also matter if you're overclocking significantly, where the second pair of DIMMs will limit how high you can push the memory clocks.
If costs are about the same, 2x4 GB will give you room to expand to 16 GB of memory in the future if you need it. If 2x4 GB costs significantly more and you don't anticipate needing more RAM, there's no reason not to save money and use 4x2 GB instead.
Not my answer but the one that I followed:
This question has come up before in several other forums, but there has never been a good clear answer. Overclocking, upgradability, and temperature aside, I would suspect that 4x2 would be slightly faster than 2x4 because there would be more interleaving. Interleaving means that the data is spread out across more memory chips. While some chips are waiting for their CAS timing cycle to complete, data access can occur from the other chips.
The thing is, it really depends on the northbridge (NB) implementation. The NB must be able to take advantage of interleaving across four DIMMs, and I don't know which NBs do and which don't. I suspect most of the Xeon NBs can interleave across four DIMMs; in fact, most of those NBs require DIMMs to be installed in groups of four. I'm not sure about the consumer NB lines such as P35, 965, and nForce 680i. The answer could be different for each chipset...
I know this does not directly answer your question. I am quite certain that 4x2 will never be slower than 2x4; it will either be the same, or faster if the NB supports four-way interleave.
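The interleaving argument above can be illustrated with a toy model. This is only a sketch under made-up assumptions, not how any particular northbridge works: reads are striped round-robin across `ways` independent units, each unit stays busy for `busy_cycles` after a read starts, and at most one new read can be issued per cycle. All names and numbers here are hypothetical.

```python
def total_cycles(accesses, ways, busy_cycles):
    """Cycles to finish `accesses` sequential reads striped round-robin over `ways` units.

    Toy model only: each unit is busy for `busy_cycles` after a read starts,
    and at most one new read can be issued per cycle.
    """
    unit_free_at = [0] * ways  # cycle at which each unit can accept a new read
    now = 0                    # earliest cycle the next read can be issued
    finish = 0
    for i in range(accesses):
        unit = i % ways
        start = max(now, unit_free_at[unit])  # wait if the unit is still busy
        unit_free_at[unit] = start + busy_cycles
        finish = start + busy_cycles
        now = start + 1
    return finish


if __name__ == "__main__":
    for ways in (1, 2, 4):
        print(f"{ways}-way interleave: {total_cycles(8, ways, 4)} cycles")
```

With 8 reads and a 4-cycle busy time, the model gives 32, 17, and 11 cycles for 1-, 2-, and 4-way interleaving. That matches the intuition that while some chips wait out their timing, others can be serving data. It says nothing about the extra electrical load or relaxed timings that four DIMMs may bring, and it only applies if the NB actually interleaves four ways.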