Hi everyone, I decided to try making an NES emulator over the last couple of days, and while it's mostly going fine, I've been caught up on a minor but perplexing point.
I've been using a few sites for documentation of the 6502 instruction set, and while https://www.masswerk.at/6502/6502_instruction_set.htm says that the absolute,X addressing mode takes up an extra cpu cycle if the sum $addr+X carries to the upper byte, I've found there are a number of instructions (opcodes 1E, 3E, 5E, 7E, 9D, DE and FE, along with some others in absolute,Y mode and undocumented instructions) which take a fixed number of cycles.
Furthermore, I tested this out using opcode DE on FCEUX, and it took 7 cycles whether the sum carried or not.
The only pattern I can see is that opcodes which would usually take fewer CPU cycles seem to gain the extra cycle on a carry, while opcodes that take more seem not to.
Does anyone know why this is? If I had a 6502 based system I'd try it out for myself.
Extra Cycle for overflow on Absolute,X instructions
Re: Extra Cycle for overflow on Absolute,X instructions
Corrected link: https://www.masswerk.at/6502/6502_instruction_set.html
Indexed addressing modes (dd,X, dd,Y, aaaa,X, aaaa,Y, and (dd),Y) incur 1 penalty cycle to add the index to the base address. This penalty is negated if all three of these conditions are true:
- The instruction is a read, not a write (sta, stx, sty) or a read-modify-write (asl, rol, lsr, ror, dec, inc).
- The addressing mode is absolute (aaaa,X or aaaa,Y) or indirect ((dd),Y), not zero page (dd,X or dd,Y).
- The sum does not carry into the high byte of the address.
Of the seven absolute,X opcodes you listed:
- Six of them (1E, 3E, 5E, 7E, DE, and FE) are read-modify-write. The indexing penalty is never negated on RMW.
- One of them (9D) is sta, which is a write. The indexing penalty is never negated on writes.
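For an emulator, the rule above could be sketched roughly like this in Python. The names `Access` and `index_penalty` are made up for illustration, and the base cycle counts are those of the unindexed absolute forms (4 for lda and sta, 6 for dec); treat it as a sketch of the rule as stated, not as a full timing table.

```python
from enum import Enum


class Access(Enum):
    # Hypothetical classification of what the instruction does to memory.
    READ = "read"
    WRITE = "write"
    RMW = "read-modify-write"


def index_penalty(access, zero_page, base, index):
    """Return 1 if the indexing penalty cycle is paid, 0 if it is negated.

    The penalty is negated only when all three conditions hold:
    the instruction is a read, the mode is not zero page indexed,
    and base low byte + index does not carry into the high byte.
    """
    carries = (base & 0xFF) + (index & 0xFF) > 0xFF
    negated = access is Access.READ and not zero_page and not carries
    return 0 if negated else 1


# Cycle counts of the unindexed absolute forms, used as the starting point.
LDA_ABS, STA_ABS, DEC_ABS = 4, 4, 6

lda_no_carry = LDA_ABS + index_penalty(Access.READ, False, 0x10F0, 0x05)
lda_carry = LDA_ABS + index_penalty(Access.READ, False, 0x10F0, 0x20)
dec_no_carry = DEC_ABS + index_penalty(Access.RMW, False, 0x10F0, 0x05)
dec_carry = DEC_ABS + index_penalty(Access.RMW, False, 0x10F0, 0x20)
sta_no_carry = STA_ABS + index_penalty(Access.WRITE, False, 0x10F0, 0x05)
```

With these inputs, lda aaaa,X comes out at 4 cycles without a carry and 5 with one, dec aaaa,X (opcode DE) at 7 either way, and sta aaaa,X (opcode 9D) at 5, which matches what was observed on FCEUX.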
Re: Extra Cycle for overflow on Absolute,X instructions
If an opcode needs to perform 16-bit math to calculate an address, then the CPU needs to use two cycles: one to add the index to the LSB, and then another to carry into the MSB.
If the opcode is only a read, then the CPU is allowed to skip the cycle where it carries into the MSB if and only if the LSB didn't overflow, because the calculated address is already correct.
If the opcode performs a write of any kind, then the CPU is not allowed to skip that cycle, and will always carry into the MSB even if the LSB didn't overflow.
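A minimal Python sketch of that two-step addition, in case it helps to see it spelled out. The function name `split_index_add` and its return shape are just assumptions for illustration; it models the arithmetic, not the actual circuitry.

```python
def split_index_add(base, index):
    """Add an 8-bit index to a 16-bit base in two steps.

    Returns (partial, final, carry):
      partial - address after only the LSB has been added (MSB not yet fixed)
      final   - address after the carry has been applied to the MSB
      carry   - True if the LSB addition overflowed
    """
    low_sum = (base & 0x00FF) + (index & 0xFF)
    carry = low_sum > 0xFF
    # Step 1: new LSB, original MSB.
    partial = (base & 0xFF00) | (low_sum & 0xFF)
    # Step 2: bump the MSB if the LSB overflowed (wrapping at 64 KiB).
    final = (partial + 0x100) & 0xFFFF if carry else partial
    return partial, final, carry


no_cross = split_index_add(0x10F0, 0x05)
cross = split_index_add(0x10F0, 0x20)
```

When `carry` is False, `partial` and `final` are the same address, which is why a read can stop after the first step. A write cannot take that chance, since writing to `partial` when it is wrong would corrupt an unrelated location.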
==== Why? ====
On the 6502, every cycle is a memory access of some kind, while it also performs some kind of internal calculation at the same time. Let's take LDA nnnn,X for example:
- Read PC to get an opcode.
- Read PC+1 (the LSB of the address) while decoding the opcode.
- Read PC+2 as MSB of address, while adding X to the value we read in the previous step.
- Read from address we have so far, while carrying into address MSB. (If LSB did not overflow, CPU is allowed to finish here. If LSB overflowed, then the wrong address is read here and the CPU needs to continue to the next step)
- Read from now-fully-calculated address.
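The steps above can be sketched as a function that lists which address is read on each cycle. This is only an illustration under some assumptions: `lda_abs_x_reads` is a made-up name, `memory` is a plain dict standing in for the bus, and unmapped addresses read as 0.

```python
def lda_abs_x_reads(memory, pc, x):
    """Return the addresses read, one per cycle, while executing LDA nnnn,X."""
    reads = []

    reads.append(pc)                          # cycle 1: opcode fetch
    lsb_addr = (pc + 1) & 0xFFFF
    reads.append(lsb_addr)                    # cycle 2: address LSB
    lsb = memory.get(lsb_addr, 0)

    msb_addr = (pc + 2) & 0xFFFF
    reads.append(msb_addr)                    # cycle 3: address MSB (LSB + X added meanwhile)
    msb = memory.get(msb_addr, 0)

    low_sum = lsb + (x & 0xFF)
    partial = (msb << 8) | (low_sum & 0xFF)
    reads.append(partial)                     # cycle 4: read with the MSB not yet carried

    if low_sum > 0xFF:
        # cycle 5: the previous read hit the wrong page, so read again
        reads.append((partial + 0x100) & 0xFFFF)
    return reads


program = {0x8000: 0xBD, 0x8001: 0xF0, 0x8002: 0x10}  # LDA $10F0,X
same_page = lda_abs_x_reads(program, 0x8000, 0x05)
crossed = lda_abs_x_reads(program, 0x8000, 0x20)
```

Here `same_page` has four entries ending at $10F5, while `crossed` has five: the fourth is the read from the wrong address $1010 and the fifth is the real one from $1110. An emulator that cares about bus-level accuracy may want to actually perform that fourth read rather than just count the cycle.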