Does 6502 instruction fetching count as a memory read?
I'm using the same function memory_read() to resolve addressing modes, but also to fetch the opcode at PC, and depending on the instruction length, fetching 1 or 2 more bytes starting at PC+1.
My question is whether these reads count as regular memory reads. For example, suppose that for some reason the PC points to one byte before a memory-mapped register: fetching the opcode will be fine, but fetching its operand, if any, will have the side effect of reading the register.
Or, although I don't know whether any game uses this technique, suppose that the PC at some point holds the address of a memory-mapped register itself: would the opcode fetch read the register?
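To make the scenario concrete, here is a minimal C sketch of the design described above, where one memory_read() serves opcode fetches, operand fetches, and data reads alike. The register address, its clear-on-read flag, and the fetch_instruction() helper are assumptions made up for illustration; only the name memory_read() comes from the post. As far as the bus is concerned a fetch is an ordinary read, so routing everything through one function means the side effect fires in both cases.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical read-sensitive register; the address and the clear-on-read
   behaviour are assumptions for this sketch, not a model of real hardware. */
#define REG_ADDR 0x4015u

uint8_t  ram[0x10000];
uint8_t  reg_flag  = 1;   /* cleared by any read of REG_ADDR */
unsigned reg_reads = 0;   /* number of times the register was read */

/* Single entry point for every CPU read: opcode, operand, or data. */
uint8_t memory_read(uint16_t addr)
{
    if (addr == REG_ADDR) {
        uint8_t value = (uint8_t)(reg_flag << 6);
        reg_flag = 0;     /* side effect of the read */
        reg_reads++;
        return value;
    }
    return ram[addr];
}

/* Fetch an opcode plus operand_len operand bytes, advancing *pc.
   The instruction length is passed in; decoding it is not modeled. */
uint8_t fetch_instruction(uint16_t *pc, unsigned operand_len, uint8_t operands[2])
{
    assert(operand_len <= 2);
    uint8_t opcode = memory_read((*pc)++);
    for (unsigned i = 0; i < operand_len; i++)
        operands[i] = memory_read((*pc)++);
    return opcode;
}
```

With PC one byte below REG_ADDR and a two-byte instruction, the operand fetch lands on the register and clears the flag, exactly as a data read would.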
LDA $1234,X will fetch both $1234 and $1234+X I think. Not 100% sure on this, but I think that's one of them that does a dummy read.
The read-modify-write instructions (INC, ASL, etc.) do dummy writes, and the MMC1 chip only sees the dummy write, not the final write. Bill & Ted's Excellent Adventure and a few other games rely on this.
Bill & Ted's Excellent Video Game Adventure does this:
    INC $FFFF ; where $FFFF = $FF
You might think it would read $FF, then write $00
But it actually has a dummy write, so it will read $FF, write $FF, then write $00 (this impacts an MMC1 register)
Ironsword does this:
    STA $4000,X ; where X = $15
which READS $4015 (acknowledging the frame IRQ) before writing to $4015. Failure to do the dummy read may be why this game was so often broken in emulators.
Read this doc for outlines on what reads/writes are performed when and where (there's charts at the bottom)
http://nesdev.com/6502_cpu.txt (edit2: specifically, read the "6510 Instruction Timing" section. Ctrl+F the file to get to it)
EDIT: doh, I'm too slow.
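The read, dummy write, final write sequence for INC can be sketched in C as below. This is only an illustration under stated assumptions: the flat mem array, the bus_log trace, and the function names are hypothetical, the opcode and address fetches are left out, and on a real cartridge the writes to $FFFF would go to the mapper rather than change memory.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_LOG 16

typedef struct { char kind; uint16_t addr; uint8_t value; } BusEvent;

uint8_t  mem[0x10000];      /* flat memory, an assumption of this sketch */
BusEvent bus_log[MAX_LOG];  /* trace of data accesses, in order */
unsigned log_len = 0;

void record(char kind, uint16_t addr, uint8_t value)
{
    assert(log_len < MAX_LOG);
    bus_log[log_len++] = (BusEvent){ kind, addr, value };
}

uint8_t bus_read(uint16_t addr)
{
    record('R', addr, mem[addr]);
    return mem[addr];
}

void bus_write(uint16_t addr, uint8_t value)
{
    record('W', addr, value);
    mem[addr] = value;
}

/* Data phase of INC abs (opcode and address fetches not modeled). */
void inc_absolute(uint16_t addr)
{
    uint8_t old = bus_read(addr);          /* read the original value */
    bus_write(addr, old);                  /* dummy write of the unmodified value */
    bus_write(addr, (uint8_t)(old + 1));   /* final write of the incremented value */
}
```

Calling inc_absolute(0xFFFF) with $FF stored there leaves three entries in bus_log: a read of $FF, a write of $FF, then a write of $00, which is the order a mapper watching the bus would see.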
Dwedit wrote: LDA $1234,X will fetch both $1234 and $1234+X I think.
tepples wrote: It actually fetches three bytes of the instruction, then $1200 + <($34+X) (with the bytes added without carry), then $1234+X if it differs.
Tepples, what was the <() notation? I think I knew it but I forgot. Is it a bitshift?
Disch: thanks for the link!
Petruza wrote: Tepples, what was the <() notation? I think I knew it but I forgot. Is it a bitshift?
Low byte. E.g. <$6789 is $89 whereas >$6789 is $67.
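For anyone writing the emulator in C, the two assembler operators map onto a mask and a shift. The helper names below are made up for this sketch; partial_indexed_addr() computes the $1200 + <($34+X) address from the earlier quote, i.e. the index added to the low byte only, with the carry thrown away.

```c
#include <assert.h>
#include <stdint.h>

/* C equivalents of the assembler's < (low byte) and > (high byte) operators. */
uint8_t lo_byte(uint16_t value) { return (uint8_t)(value & 0xFF); }
uint8_t hi_byte(uint16_t value) { return (uint8_t)(value >> 8); }

/* Address of the first data read for abs,X: the index is added to the
   low byte only, and any carry out of that addition is discarded. */
uint16_t partial_indexed_addr(uint16_t base, uint8_t index)
{
    uint8_t low = (uint8_t)(lo_byte(base) + index);
    return (uint16_t)((hi_byte(base) << 8) | low);
}

/* The example values from the post above. */
void check_byte_operators(void)
{
    assert(lo_byte(0x6789) == 0x89);
    assert(hi_byte(0x6789) == 0x67);
}
```

When the low-byte addition does not carry, the partial address is already the final one; when it does carry, the partial address is $100 too low.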
tepples wrote: It actually fetches three bytes of the instruction, then $1200 + <($34+X) (with the bytes added without carry), then $1234+X if it differs.
Oh, so this is what always happens with absolute indexed addressing? The indexing occurs first on the low byte only?
Damn, my cpu interpreter will take longer than I thought.
Petruza wrote: Oh, so this is what happens always on absolute indexed addressing? the indexing occurs first on the low byte only?
Yes. The 6502 has one 8-bit ALU and one 16-bit upcounter (for PC). To calculate a,x or a,y absolute indexed addressing in an instruction other than sta, stx, or sty, it uses the 8-bit ALU to first calculate the low byte while it fetches the high byte. If there's a carry out, it goes "oops", applies the carry using the ALU, and repeats the read at the correct address. Store instructions always have this "oops" cycle: the CPU first reads from the partially added address and then writes to the correct address. The same thing happens on (d),y indirect addressing.
As for (d,x) indirect addressing, I don't know how many cycles that takes. As a programmer, I've never had a chance to use it in a loop. The only time I've ever used a table of addresses on zero page was in a music engine, where x had the possibility of being 0, 4, 8, or 12.
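The "oops" cycle for absolute indexed addressing described above can be sketched roughly like this in C. All names here (bus_read, read_log, load_abs_indexed, store_abs_indexed) are hypothetical, memory is a flat array, and only the data phase is modeled, not the three instruction-byte fetches.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_READS 8

uint8_t  mem[0x10000];          /* flat memory, an assumption of this sketch */
uint16_t read_log[MAX_READS];   /* addresses read, in order */
unsigned read_count = 0;

uint8_t bus_read(uint16_t addr)
{
    assert(read_count < MAX_READS);
    read_log[read_count++] = addr;
    return mem[addr];
}

void bus_write(uint16_t addr, uint8_t value) { mem[addr] = value; }

/* Index added to the low byte only, as the 8-bit ALU produces it first. */
uint16_t partial_addr(uint16_t base, uint8_t index)
{
    return (uint16_t)((base & 0xFF00) | ((base + index) & 0x00FF));
}

/* Data phase of a load such as LDA abs,X. */
uint8_t load_abs_indexed(uint16_t base, uint8_t index)
{
    uint16_t partial = partial_addr(base, index);
    uint16_t final   = (uint16_t)(base + index);
    uint8_t value = bus_read(partial);
    if (partial != final)
        value = bus_read(final);   /* "oops": repeat the read at the fixed address */
    return value;
}

/* Data phase of a store such as STA abs,X: the extra read always happens. */
void store_abs_indexed(uint16_t base, uint8_t index, uint8_t value)
{
    (void)bus_read(partial_addr(base, index));   /* dummy read, value discarded */
    bus_write((uint16_t)(base + index), value);
}
```

For STA $4000,X with X = $15 there is no carry, so the dummy read hits $4015 itself before the write, which is the Ironsword case mentioned earlier in the thread.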
All ($xx,X) instructions take 6 cycles according to this: http://www.6502.org/tutorials/6502opcodes.html
However, this page contains at least one error: and $xx is 3 cycles, not 2.
The 6 cycles should go like this:
1) Read opcode
2) Read argument
3) Read $xx+X
4) Read $xx+X+1
5) Read [$xx+X]
So there is one more dummy read in the process too.
tepples wrote: As for (d,x) indirect addressing, I don't know how many cycles that takes. As a programmer, I've never had a chance to use it in a loop. The only time I've ever used a table of addresses on zero page was in a music engine, where x had the possibility of being 0, 4, 8, or 12.
It's funny, because I also used the lda ($xx,X) instruction only once, for my music engine. Sounds like it's a common way to do it.
Also, maybe I used this instruction with X = 00 in all cases when Y was busy with something else. I'm not sure exactly, but that doesn't really count.
Close on the reads. It is 6 cycles like you said, but you only listed 5 of them.
The doc I linked to before lays it out: http://nesdev.com/6502_cpu.txt
1) Read PC (opcode)
2) Read PC+1 (argument)
3) Read $xx (dummy read, gives time to add X to $xx)
4) Read $xx+X (low byte)
5) Read $xx+X+1 (high byte)
6) Read ($xx+X) (final data)
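The six reads listed above could be written out in an interpreter roughly as follows. This is a sketch rather than a reference implementation: bus_read, read_log, and lda_indexed_indirect are hypothetical names, memory is a flat array, and the pointer arithmetic wraps within zero page, which is how the 6502 is known to behave for this mode.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_READS 8

uint8_t  mem[0x10000];          /* flat memory, an assumption of this sketch */
uint16_t read_log[MAX_READS];   /* addresses read, one per cycle */
unsigned read_count = 0;

uint8_t bus_read(uint16_t addr)
{
    assert(read_count < MAX_READS);
    read_log[read_count++] = addr;
    return mem[addr];
}

/* LDA ($xx,X): one bus read per cycle, six in total. Returns the loaded value. */
uint8_t lda_indexed_indirect(uint16_t *pc, uint8_t x)
{
    (void)bus_read((*pc)++);                       /* 1: opcode */
    uint8_t zp  = bus_read((*pc)++);               /* 2: argument $xx */
    (void)bus_read(zp);                            /* 3: dummy read of $xx while X is added */
    uint8_t ptr = (uint8_t)(zp + x);               /* stays inside zero page */
    uint8_t lo  = bus_read(ptr);                   /* 4: low byte of the pointer */
    uint8_t hi  = bus_read((uint8_t)(ptr + 1));    /* 5: high byte of the pointer */
    return bus_read((uint16_t)((hi << 8) | lo));   /* 6: final data */
}
```

Counting the entries in read_log after one call gives a quick check that an interpreter performs all six accesses, including the dummy read in step 3.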
There are some dummy read test ROMs available.