tepples wrote: I seem to remember Nintendulator emulates cycle by cycle like that.
NESICIDE wrote: I didn't come to that conclusion looking at the source for v0.975.
thefox wrote: I initially thought that too, but actually it does emulate cycle by cycle. Check the functions which handle different addressing modes (AM_xxx), every MemGet()/MemSet() increases cycle count.

I had the same thought and, indeed, that's how my emulator does it too. However, halfway through posting my "there's tons of emulators out there that do cycle-by-cycle emulation" reply, I re-read the original suggestion.
The *difference* between the way we do it and the suggestion is that the proposed LDA function takes the number of cycles to run as an argument, keeps track of the current cycle of the LDA instruction statically, and returns when the passed-in number of cycles has been exhausted or when the instruction completes [whichever occurs first]. So the LDA function may be called with a cycle count of 1, in which case it would execute only *one* memory-access cycle of the LDA and then return.
I do BRK/NMI/IRQ that way just so I can get the granularity necessary to pass the CPU interrupt tests. I thought about redoing my CPU core entirely to make every instruction take a cycles parameter and return when the cycles were depleted, not necessarily when the instruction was completed. But I wonder if an easier approach would be to just have a subordinate catch-up routine in the PPU that is called whenever the CPU notices that the number of cycles it's been asked to emulate has been depleted. That way the PPU can change state mid-CPU-instruction and the CPU can react as necessary. Of course there'd need to be some way to stop the thing, otherwise the CPU would just keep asking for more from the PPU forever.
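Here's a minimal sketch of how the catch-up idea might look. All the names (cpu_tick, ppu_catch_up, cpu_run, and so on) are hypothetical, the "instruction" is just a stand-in that burns a fixed number of cycles, and I'm assuming the NTSC ratio of 3 PPU dots per CPU cycle. The stopping rule I picked is that the CPU only borrows enough cycles to finish the instruction it is in the middle of, which is one possible answer to the "keeps asking forever" problem, not the only one.

```c
#include <assert.h>

/* Hypothetical emulator state; all names are illustrative assumptions. */
static long cpu_cycles;  /* CPU cycles executed so far           */
static long ppu_dots;    /* PPU dots executed so far             */
static int  budget;      /* CPU cycles left in the current slice */

static void ppu_run_dot(void) { ppu_dots++; }  /* stand-in for real PPU work */

/* Subordinate catch-up: run the PPU until it matches the CPU's clock
   (assuming 3 dots per CPU cycle, as on NTSC). */
static void ppu_catch_up(void)
{
    while (ppu_dots < cpu_cycles * 3)
        ppu_run_dot();
}

/* Called once per CPU memory-access cycle. Once the slice is used up,
   every further cycle first brings the PPU level with the CPU. */
static void cpu_tick(void)
{
    cpu_cycles++;
    budget--;
    if (budget <= 0)
        ppu_catch_up();
}

/* Stand-in for executing one instruction of `n` cycles. */
static void cpu_step(int n)
{
    for (int i = 0; i < n; i++)
        cpu_tick();
}

/* Run the CPU for `slice` cycles. It stops at the first instruction
   boundary at or past the end of the slice, so it cannot keep asking
   the PPU for more. Returns how many cycles it overshot by. */
static int cpu_run(int slice, int instr_len)
{
    assert(instr_len > 0);  /* otherwise the loop below never ends */
    budget = slice;
    while (budget > 0)
        cpu_step(instr_len);
    return -budget;
}
```

The returned overshoot would presumably be subtracted from the next slice so the long-run timing doesn't drift.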
I think there are two "cycle-based" ideas that conflict. One is the independent cycle-based component idea, where each core runs one cycle at a time exactly as its hardware equivalent would do. Then there's the cycle-synced components idea, where each component is cycle-based and also synced at its cycle granularity to the other components. For the CPU/APU this is easy since they're 1:1. For the CPU/PPU, whenever the CPU runs one cycle the PPU should have run 3 (NTSC) or 3.2 (PAL) cycles. But in both our cases the CPU could run ahead of the PPU, because it might be doing multiple reads/writes to satisfy the completion of an LDA. In my emulator the first two cycles of any instruction are broken apart so that the PPU/CPU are cycle-synced for those two cycles. But once the instruction begins operation, the CPU will jump ahead of the PPU depending on how many cycles remain to be done for the instruction.
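For the cycle-synced case, the fractional 3.2 ratio is easiest to handle by counting in master clocks instead of CPU or PPU cycles. A rough sketch, assuming the usual dividers (NTSC: CPU = master/12, PPU = master/4; PAL: CPU = master/16, PPU = master/5); the Clock struct and sync_after_cpu_cycle are names I made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical shared clock; field names are illustrative assumptions. */
typedef struct {
    int      cpu_div;   /* master clocks per CPU cycle (12 NTSC, 16 PAL) */
    int      ppu_div;   /* master clocks per PPU dot   (4 NTSC, 5 PAL)   */
    uint64_t master;    /* master clocks elapsed so far                  */
    uint64_t ppu_dots;  /* PPU dots run so far                           */
} Clock;

/* Call after every single CPU cycle. Advances the master clock by one
   CPU cycle and reports how many PPU dots are now due, so the caller
   can run exactly that many before the next CPU cycle. */
static int sync_after_cpu_cycle(Clock *c)
{
    int dots = 0;

    assert(c->cpu_div > 0 && c->ppu_div > 0);
    c->master += (uint64_t)c->cpu_div;
    /* A dot is due once the master clock has reached its end time. */
    while ((c->ppu_dots + 1) * (uint64_t)c->ppu_div <= c->master) {
        c->ppu_dots++;
        dots++;
    }
    return dots;
}
```

With the NTSC dividers this returns 3 every time. With the PAL dividers it returns 3, 3, 3, 3, 4 in a repeating pattern, which averages out to 3.2 without any floating point.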
Interesting thought... I might have to crack open my CPU core again. 8)