Question about cycle counting accuracy when emulating a CPU

  1. Most emulators/simulators deal just with CPU clock ticks

    That is fine for games etc. You set up some timer (or whatever) and run the CPU simulation until it has covered the duration of that timer period. Then it sleeps until the next timer interval occurs. This is very easy to implement. You can decrease the timing error with the approach you are asking about, but as said, for games this is usually unnecessary (a minimal loop of this kind is sketched at the end of this point).

    This approach has one significant drawback: your code runs only during a fraction of real time. If the timer interval (timing granularity) is big enough, this can be noticeable even in games. For example, if you hit a keyboard key while the emulation sleeps, it is not detected (keys sometimes don't work). You can remedy this by using a smaller timing granularity, but on some platforms that is very hard. In that case the timing error can be more "visible" in software-generated sound (at least to people who can hear it and are not deaf-ish to such things, like me).
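    Just to illustrate the idea (this is only a minimal sketch with made-up names and a stub instruction core, not code from any real emulator), the main loop of approach #1 boils down to something like this:

    #include <chrono>
    #include <thread>
    #include <cstdio>
    #include <cstdlib>

    const int cpu_hz   = 3500000;               // assumed emulated CPU clock [T/s] (Z80-like)
    const int timer_ms = 20;                    // timing granularity [ms]
    const int T_budget = cpu_hz/1000*timer_ms;  // T-states to emulate per timer period

    int emulate_one_instruction()               // stub: pretend opcodes take 4..23 T-states
    {
        static const int t[]={4,6,7,8,10,15,21,23};
        return t[rand()&7];
    }

    int main()
    {
        int T_carry=0;                                      // overshoot carried from the previous period
        for (int period=0;period<5;period++)
        {
            auto t0=std::chrono::steady_clock::now();
            int T=T_carry;
            while (T<T_budget) T+=emulate_one_instruction();// burst: emulate one timer period of CPU time
            T_carry=T-T_budget;                             // carry the overshoot so the error does not accumulate
            printf("period %i done, overshoot %i T\n",period,T_carry);
            std::this_thread::sleep_until(t0+std::chrono::milliseconds(timer_ms)); // idle away the rest of the period
        }
        return 0;
    }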

  2. If you need something more sophisticated

    For example, if you want to connect real HW to your emulation/simulation, then you need to emulate/simulate the buses. Also, things like a floating bus or system contention are very hard to add to approach #1 (it is doable, but with big pain).

    If you port the timings and emulation to machine cycles, things get much, much easier, and suddenly things like contention, HW interrupts, or floating buses almost solve themselves. I ported my ZX Spectrum Z80 emulator to this kind of timing and saw the light. Many things became obvious (like errors in Z80 opcode documentation, timings, etc.). Contention also got very simple from there (just a few lines of code instead of horrible decoding tables with almost one entry per instruction type). The HW emulation got pretty easy too: I added things like FDC controllers and AY chip emulation to the Z80 in this way (no hacks, it really runs on their original code ... even floppy formatting :)), so no more TAPE loading hacks that break on custom loaders like TURBO.

    To make this work, I created my Z80 emulation/simulation so that it uses something like microcode for each instruction. As I very often had to correct errors in the Z80 instruction set (there is no single 100% correct doc out there that I know of, even if some of them claim to be bug-free and complete), I came up with a way to deal with it without painfully reprogramming the emulator.

    Each instruction is represented by an entry in a table, with info about timing, operands, functionality... The whole instruction set is a table of all these entries for all instructions. I then formed a MySQL database for my instruction set, and similar tables for each instruction set I found, then painfully compared all of them, selecting/repairing what was wrong and what was correct. The result is exported to a single text file which is loaded at emulation startup. It sounds horrible, but in reality it simplifies things a lot and even speeds up the emulation, as the instruction decoding is now just accessing pointers (a rough sketch of such a table is shown below). An example of the instruction set data file can be found here: What's the proper implementation for hardware emulation
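    Just as a rough illustration of that table-driven approach (the field names and layout here are only a sketch, not the actual format of my data file), one entry per opcode with its per-machine-cycle timing could look like this:

    #include <cstdint>

    struct Z80 { /* registers, flags, bus/memory callbacks ... */ };

    struct InstrEntry
    {
        const char *mnemonic;           // e.g. "NOP"
        uint8_t     opcode;             // primary opcode byte
        uint8_t     M_cycles;           // number of machine cycles
        uint8_t     T_per_M[6];         // T-state length of each machine cycle
        void      (*exec)(Z80 &cpu);    // function implementing the behaviour
    };

    void exec_nop(Z80 &) {}             // placeholder behaviour

    InstrEntry iset_table[256] =        // the real table would be filled from the text file at startup;
    {                                   // only one hard-coded example entry here
        { "NOP", 0x00, 1, {4}, exec_nop },
    };

    int step(Z80 &cpu,uint8_t opcode)   // returns T-states consumed
    {
        InstrEntry &i=iset_table[opcode];   // decoding = a single array/pointer access
        int T=0;
        for (int m=0;m<i.M_cycles;m++)
            T+=i.T_per_M[m];                // contention/wait states can be injected here, per M-cycle
        i.exec(cpu);
        return T;
    }

    int main()
    {
        Z80 cpu;
        return step(cpu,0x00);              // NOP -> 4 T-states
    }

    With the timing stored per machine cycle, adding contention means hooking into that small loop instead of patching every single instruction handler.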

A few years back I also published a paper on this (sadly the institution that held that conference does not exist anymore, so the servers with those old papers are down for good; luckily I still have a copy). So here is an image from it that describes the problem:

CPU Scheduling

  • a) Full throttle has no synchronization, just raw speed
  • b) #1 has big gaps, causing HW synchronization problems
  • c) #2 needs to sleep a lot with very small granularity (which can be problematic and slow things down), but the instructions are executed very near their real time ...
  • The red line is the host CPU processing speed (obviously anything above it takes a bit more time, so it should be cut and inserted before the next instruction, but that would be hard to draw properly)
  • The magenta line is the emulated/simulated CPU processing speed
  • Alternating green/blue colors represent consecutive instructions
  • Both axes are time

[edit1] more precise image

The one above was hand-painted... This one is generated by a VCL/C++ program:

timing

generated by these parameters:

const int iset[]={4,6,7,8,10,15,21,23}; // possible timings [T]
const int n=128,m=sizeof(iset)/sizeof(iset[0]); // number of instructions to emulate, size of iset[]
const int Tps_host=25;  // max possible simulation speed [T/s]
const int Tps_want=10;  // wanted simulation speed [T/s]
const int T_timer=500;  // simulation timer period [T]

So the host can simulate at 250% of the wanted speed and the simulation granularity is 500 T. The instructions were generated pseudo-randomly...
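For completeness, here is a plain console C++ sketch (not the original VCL/C++ drawing program, just my rough reconstruction of its scheduling from the parameters above) that replays the same setup and prints, for each timer period, how much host time the burst takes and how long the emulator then sleeps:

#include <cstdio>
#include <cstdlib>

const int iset[]={4,6,7,8,10,15,21,23};         // possible timings [T]
const int n=128,m=sizeof(iset)/sizeof(iset[0]); // number of instructions to emulate, size of iset[]
const int Tps_host=25;  // max possible simulation speed [T/s]
const int Tps_want=10;  // wanted simulation speed [T/s]
const int T_timer=500;  // simulation timer period [T]

int main()
{
    srand(1234);                                                // fixed seed -> repeatable run
    for (int i=0,period=0;i<n;)
    {
        int T=0;
        while ((T<T_timer)&&(i<n)) { T+=iset[rand()%m]; i++; }  // burst of pseudo-random instructions
        double t_busy =double(T)/double(Tps_host);              // host time spent emulating the burst [s]
        double t_total=double(T)/double(Tps_want);              // real time the burst represents [s]
        printf("period %2i: %3i T emulated, busy %5.1f s, sleep %5.1f s\n",
               ++period,T,t_busy,t_total-t_busy);
    }
    return 0;
}

With these numbers, each ~500 T burst takes about 20 s of host time (500/25) and is followed by roughly 30 s of sleep (500/10 - 500/25), which matches the 250% headroom mentioned above.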


There was quite an interesting article on Ars Technica recently about console emulation; it also links to quite a few emulators that might make for good research:

Accuracy takes power: one man's 3GHz quest to build a perfect SNES emulator

The relevant bit is that the author mentions, and I am inclined to agree, that most games will appear to function correctly even with timing deviations of +/-20%. The issue you mention looks unlikely to ever introduce more than a fraction of a percent of timing error, which is probably imperceptible whilst playing the final game, so the emulator authors probably didn't consider it worth dealing with.