Why does Intel hide the internal RISC core in their processors?
The real answer is simple.
The major factor behind the move to RISC processors was to reduce complexity and gain speed. The downside of RISC is reduced instruction density: the same code expressed in a RISC-like format needs more instructions than the equivalent CISC code.
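As a rough sketch of what that means in practice (the instruction sequences in the comments below are typical hand-written examples, not actual compiler output, and the register choices follow the usual x86-64 and RISC-V calling conventions):

```c
/* Rough illustration of code density: the same C statement, viewed as
 * x86 (CISC) vs. a classic load/store RISC ISA.  The sequences in the
 * comments are representative, not exact compiler output. */
void bump(int *counter, int delta)
{
    /* x86: a single read-modify-write instruction with a memory operand:
     *     add dword ptr [rdi], esi
     *
     * Load/store RISC (RISC-V style): three fixed 4-byte instructions:
     *     lw  t0, 0(a0)
     *     add t0, t0, a1
     *     sw  t0, 0(a0)
     */
    *counter += delta;
}
```

Same work, but the RISC version takes more instructions and more bytes to encode it.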
This drawback doesn't mean much if your CPU runs at the same speed as the memory, or at least if the two run at reasonably similar speeds.
Today, though, there is a large gap between CPU and memory clock speeds: current CPUs are sometimes five times faster than main memory, or more.
This state of the technology favours denser code, which is exactly what CISC provides.
You can argue that caches could speed up RISC CPUs, but the same can be said about CISC CPUs.
You get a bigger speed improvement from CISC plus caches than from RISC plus caches, because a cache of the same size holds more of the higher-density code that CISC provides.
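A quick back-of-the-envelope calculation makes the point. The numbers are assumptions for illustration only: a 32 KiB L1 instruction cache, an average x86 instruction length of about 3.5 bytes, and a fixed 4-byte RISC encoding.

```c
#include <stdio.h>

/* Back-of-the-envelope: how many instructions fit in one I-cache?
 * The sizes and average lengths below are assumptions, not measurements. */
int main(void)
{
    const double icache_bytes   = 32 * 1024;  /* typical L1 I-cache size */
    const double x86_avg_len    = 3.5;        /* assumed average length  */
    const double risc_fixed_len = 4.0;        /* fixed-width 32-bit RISC */

    printf("x86  instructions per 32 KiB I-cache: ~%.0f\n",
           icache_bytes / x86_avg_len);
    printf("RISC instructions per 32 KiB I-cache: ~%.0f\n",
           icache_bytes / risc_fixed_len);

    /* And the RISC version of the same program also needs *more*
     * instructions, so the effective gap in cached work is larger
     * than the raw byte counts suggest. */
    return 0;
}
```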
Another side effect is that RISC is harder on compiler implementation; it's easier to optimize compilers for CISC CPUs.
Intel knows what they are doing.
This is so true that ARM has a higher code density mode called Thumb.
No, the x86 instruction set is certainly not deprecated. It is as popular as ever. The reason Intel uses a set of RISC-like micro-instructions internally is that they can be processed more efficiently.
So an x86 CPU works by having a pretty heavy-duty decoder in the frontend, which accepts x86 instructions and converts them to an optimized internal format that the backend can process.
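As a toy model of that split (Intel's real micro-op format is undocumented and changes between generations, so the types and the decomposition below are purely illustrative): a single memory-destination add might be broken into separate load, ALU and store micro-ops for the backend.

```c
#include <stdio.h>

/* Toy model of a frontend decoder: one CISC-style instruction with a
 * memory destination is split into load/ALU/store micro-ops.  The
 * micro-op format here is invented for illustration only. */
typedef enum { UOP_LOAD, UOP_ADD, UOP_STORE } uop_kind;

typedef struct {
    uop_kind kind;
    int      dst, src;   /* register numbers (hypothetical numbering) */
} uop;

/* "Decode" an x86-style `add [addr_reg], src_reg` into micro-ops. */
static int decode_add_mem_reg(int addr_reg, int src_reg, uop out[3])
{
    int tmp = 100;                               /* internal temp register */
    out[0] = (uop){ UOP_LOAD,  tmp, addr_reg };  /* tmp <- [addr_reg]      */
    out[1] = (uop){ UOP_ADD,   tmp, src_reg  };  /* tmp <- tmp + src       */
    out[2] = (uop){ UOP_STORE, addr_reg, tmp };  /* [addr_reg] <- tmp      */
    return 3;
}

int main(void)
{
    uop uops[3];
    int n = decode_add_mem_reg(/*rdi=*/7, /*esi=*/6, uops);
    for (int i = 0; i < n; i++)
        printf("uop %d: kind=%d dst=%d src=%d\n",
               i, uops[i].kind, uops[i].dst, uops[i].src);
    return 0;
}
```

Real decoders do this in hardware and may fuse or split operations differently, but the shape of the job is the same: x86 instructions in, internal micro-ops out.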
As for exposing this format to "external" programs, there are two points:
- it is not a stable format. Intel can change it between CPU models to best fit the specific architecture. This allows them to maximize efficiency, and this advantage would be lost if they had to settle on a fixed, stable instruction format for internal use as well as external use.
- there's just nothing to be gained by doing it. With today's huge, complex CPUs, the decoder is a relatively small part of the CPU. Having to decode x86 instructions makes it more complex, but the rest of the CPU is unaffected, so overall there's very little to be gained, especially because the x86 frontend would still have to be there in order to execute "legacy" code. So you wouldn't even save the transistors currently used for the x86 frontend.
This isn't quite a perfect arrangement, but the cost is fairly small, and it's a much better choice than designing the CPU to support two completely different instruction sets. (In that case, they'd probably end up inventing a third set of micro-ops for internal use, just because those can be tweaked freely to best fit the CPU's internal architecture.)