DMC DMA causing double-$2002 reads?
Posted: Wed Jun 20, 2012 2:59 pm
by cpow
I was just dorking around more with Visual2A03 [my favorite topic lately], trying to completely understand the DMC DMA and its impact. I recognize that there's lots of talk about how a DMC DMA can cause an extra controller read. I happened to be trying to find out how many cycles RDY is held in different situations, and created the following program.
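The output of this surprised me a bit, and got me thinking...a DMC DMA could maybe cause skipped frames, since reading $2002 clears the VBlank flag and a repeated read would then see it already cleared.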
Here's the relevant portion of the log:
Code:
cycle ab db rw Fetch pc a x y s p c_rdy
34 4015 40 0 0015 10 c0 00 bd nv‑Bdizc 1
34 4015 10 0 0015 10 c0 00 bd nv‑Bdizc 1
35 0015 ad 1 LDA Abs 0015 10 c0 00 bd nv‑Bdizc 1
35 0015 ad 1 LDA Abs 0015 10 c0 00 bd nv‑Bdizc 1
36 0016 02 1 0016 10 c0 00 bd nv‑Bdizc 1
36 0016 02 1 0016 10 c0 00 bd nv‑Bdizc 1
37 0017 20 1 0017 10 c0 00 bd nv‑Bdizc 1
37 0017 20 1 0017 10 c0 00 bd nv‑Bdizc 1
38 2002 00 1 0018 10 c0 00 bd nv‑Bdizc 1
38 2002 00 1 0018 10 c0 00 bd nv‑Bdizc 0
39 2002 00 1 0018 00 c0 00 bd nv‑BdiZc 0
39 2002 00 1 0018 00 c0 00 bd nv‑BdiZc 0
40 c000 e0 1 0018 00 c0 00 bd nv‑BdiZc 0
40 c000 e0 1 0018 00 c0 00 bd nv‑BdiZc 0
41 2002 00 1 0018 e0 c0 00 bd Nv‑Bdizc 1
41 2002 00 1 0018 e0 c0 00 bd Nv‑Bdizc 1
42 0018 10 1 BPL 0018 00 c0 00 bd nv‑BdiZc 1
42 0018 10 1 BPL 0018 00 c0 00 bd nv‑BdiZc 1
At cycle 38, the read of $2002 goes on the bus and RDY is pulled low [c_rdy goes to 0]. In cycle 40 the DMC DMA read [of $C000] happens. Then in cycle 41 the read of $2002 goes back on the bus.
The pattern above is identical regardless of the address being read, obviously...so it makes me think that the above pattern is representative of the $4016/$4017 extra reads.
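For anyone who wants to check a log like this mechanically, here is a minimal Python sketch. It assumes the column layout of the log above (cycle, ab, db, rw first and c_rdy last on each half-cycle row); the function names are made up for illustration and are not part of Visual2A03. The flags column is copied with a plain ASCII hyphen.

```python
# Half-cycle rows copied from the LDA $2002 log above (cycles 37-42).
TRACE = """\
37 0017 20 1 0017 10 c0 00 bd nv-Bdizc 1
37 0017 20 1 0017 10 c0 00 bd nv-Bdizc 1
38 2002 00 1 0018 10 c0 00 bd nv-Bdizc 1
38 2002 00 1 0018 10 c0 00 bd nv-Bdizc 0
39 2002 00 1 0018 00 c0 00 bd nv-BdiZc 0
39 2002 00 1 0018 00 c0 00 bd nv-BdiZc 0
40 c000 e0 1 0018 00 c0 00 bd nv-BdiZc 0
40 c000 e0 1 0018 00 c0 00 bd nv-BdiZc 0
41 2002 00 1 0018 e0 c0 00 bd Nv-Bdizc 1
41 2002 00 1 0018 e0 c0 00 bd Nv-Bdizc 1
42 0018 10 1 BPL 0018 00 c0 00 bd nv-BdiZc 1
42 0018 10 1 BPL 0018 00 c0 00 bd nv-BdiZc 1
"""


def parse(trace):
    """Return one (cycle, address, is_read, rdy_high) tuple per half-cycle row."""
    rows = []
    for line in trace.splitlines():
        fields = line.split()
        if not fields:
            continue
        # cycle, ab, rw are columns 0, 1, 3; c_rdy is assumed to be the last column.
        rows.append((int(fields[0]), int(fields[1], 16),
                     fields[3] == "1", fields[-1] == "1"))
    return rows


def rdy_low_cycles(rows):
    """Cycles in which c_rdy is 0 during at least one half-cycle."""
    return sorted({cycle for cycle, _, _, rdy_high in rows if not rdy_high})


def repeated_reads(rows):
    """Map each address seen on more than one read cycle to those cycles."""
    by_cycle = {cycle: addr for cycle, addr, is_read, _ in rows if is_read}
    seen = {}
    for cycle, addr in sorted(by_cycle.items()):
        seen.setdefault(addr, []).append(cycle)
    return {addr: cycles for addr, cycles in seen.items() if len(cycles) > 1}
```

On the rows above, `rdy_low_cycles(parse(TRACE))` gives cycles 38, 39 and 40, and `repeated_reads(parse(TRACE))` reports $2002 on the bus in cycles 38, 39 and 41. Whether the PPU or a controller treats each of those bus cycles as a separate read is not something this sketch can tell you.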
Posted: Wed Jun 20, 2012 3:10 pm
by Disch
AFAIK the DMC read can affect every register read, including $2002, $2007, $4015, etc.
With $2002 it doesn't really matter, because reading that to check for VBlank is already unreliable, even without the DMC as a factor.
Posted: Wed Jun 20, 2012 8:24 pm
by Drag
Out of curiosity, does the DMC affect writes to 2007, or just reads? I think I still don't quite understand what exactly is going on.
Posted: Wed Jun 20, 2012 8:35 pm
by Dwedit
I'm guessing no, otherwise you'd see glitches all over the place in any game that uses DMC.
Also that DMC takes one cycle less when it falls on a write.
Posted: Wed Jun 20, 2012 9:26 pm
by Disch
Writes are unaffected.
I swear blargg went over all this in a post. Let me see if I can find it.
EDIT:
Here:
http://nesdev.com/bbs/viewtopic.php?p=33564#33564
Posted: Wed Jun 20, 2012 9:33 pm
by cpow
Dwedit wrote:I'm guessing no, otherwise you'd see glitches all over the place in any game that uses DMC.
Also that DMC takes one cycle less when it falls on a write.
I've been doing a lot of looking at writes...and never bothered to try to explain what I'm seeing. Here goes.
Here's a program that does everything exactly the same as the previous program in this thread, except the LDA $2002 is now a STA $2002. [I know...I know...that's an invalid NES thing to do, but this isn't a NES, it's just a 2A03.]
Relevant portion of the log:
Code:
34 4015 40 0 0015 10 c0 00 bd nv‑Bdizc 1
34 4015 10 0 0015 10 c0 00 bd nv‑Bdizc 1
35 0015 8d 1 STA Abs 0015 10 c0 00 bd nv‑Bdizc 1
35 0015 8d 1 STA Abs 0015 10 c0 00 bd nv‑Bdizc 1
36 0016 02 1 0016 10 c0 00 bd nv‑Bdizc 1
36 0016 02 1 0016 10 c0 00 bd nv‑Bdizc 1
37 0017 20 1 0017 10 c0 00 bd nv‑Bdizc 1
37 0017 20 1 0017 10 c0 00 bd nv‑Bdizc 1
38 2002 20 0 0018 10 c0 00 bd nv‑Bdizc 1
38 2002 10 0 0018 10 c0 00 bd nv‑Bdizc 1
39 0018 10 1 BPL 0018 10 c0 00 bd nv‑Bdizc 1
39 0018 10 1 BPL 0018 10 c0 00 bd nv‑Bdizc 0
40 0018 10 1 BPL 0019 10 c0 00 bd nv‑Bdizc 0
40 0018 10 1 BPL 0019 10 c0 00 bd nv‑Bdizc 0
41 0018 10 1 BPL 0019 10 c0 00 bd nv‑Bdizc 0
41 0018 10 1 BPL 0019 10 c0 00 bd nv‑Bdizc 0
42 c000 e0 1 CPX # 0019 10 c0 00 bd nv‑Bdizc 0
42 c000 e0 1 CPX # 0019 10 c0 00 bd nv‑Bdizc 0
43 0018 10 1 BPL 0019 10 c0 00 bd nv‑Bdizc 1
43 0018 10 1 BPL 0019 10 c0 00 bd nv‑Bdizc 1
44 0019 fe 1 0019 10 c0 00 bd nv‑Bdizc 1
44 0019 fe 1 0019 10 c0 00 bd nv‑Bdizc 1
Not only does the DMC DMA not interrupt the write, it is *stalled* beyond the write. In the previous example using LDA $2002, the DMC DMA occupies cycles 38 through 40. In the STA $2002 case above, RDY is not pulled low in cycle 38 as it was in the LDA $2002 case, because cycle 38 is a write. It is pulled low in cycle 39, which is a read.
Further, this DMA takes four cycles in total [counting from the assertion of RDY in 39 to the completion of the DMA in 42]. This is contrary to the stated "DMC DMA takes one cycle less on a write": in the LDA $2002 example the DMA starts in cycle 38 and completes in cycle 40 -- three cycles. DMC DMA appears to take one cycle less on a read.
I've seen similar results when trying to cause a DMC DMA during an interrupt sequence. If the DMA would otherwise occur during one of the stack-write cycles, it is delayed until those write cycles complete, then RDY is pulled low during the vector fetch. More to come...
Posted: Wed Jun 20, 2012 9:53 pm
by cpow
cpow wrote:More to come...
This program shows a DMC DMA occurring during the vector fetch phase of an IRQ.
Relevant log parts:
Code:
37 01bd 10 0 0015 10 c0 00 bd nv‑bdizc 1 0
37 01bd 00 0 0015 10 c0 00 bd nv‑bdizc 1 0
38 01bc 15 0 0015 10 c0 00 bd nv‑bdizc 1 0
38 01bc 15 0 0015 10 c0 00 bd nv‑bdizc 1 0
39 01bb 15 0 0015 10 c0 00 bd nv‑bdizc 1 0
39 01bb 20 0 0015 10 c0 00 bd nv‑bdizc 1 0
40 fffe 40 1 0015 10 c0 00 ba nv‑bdizc 1 0
40 fffe 40 1 0015 10 c0 00 ba nv‑bdizc 0 0
41 fffe 40 1 0015 10 c0 00 ba nv‑bdizc 0 0
41 fffe 40 1 0015 10 c0 00 ba nv‑bdizc 0 0
42 c000 e0 1 0015 10 c0 00 ba nv‑bdizc 0 0
42 c000 e0 1 0015 10 c0 00 ba nv‑bdizc 0 0
43 fffe 40 1 0015 10 c0 00 ba nv‑bdizc 1 0
43 fffe 40 1 0015 10 c0 00 ba nv‑Bdizc 1 0
44 ffff 00 1 0015 10 c0 00 ba nv‑BdIzc 1 0
44 ffff 00 1 0015 10 c0 00 ba nv‑BdIzc 1 0
Here the RDY assertion is delayed again. Following the LDA $2002 example I would expect RDY to assert in cycle 38 [nothing about the DMC setup has changed, DMC DMA occurs between cycles 38-40 if 38 is a read cycle]. Instead, it's as if the 2A03 has additional logic that does its own "RDY processing" -- meaning it won't even bother the CPU during a write cycle because it knows the CPU won't halt, so it waits until the next read cycle, then asserts RDY and starts the 3-cycle DMA.
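That hold-off idea can be written down as a tiny Python sketch. This is only my reading of the logs, not a description of the actual 2A03 circuit: `rdy_assert_cycle` and the `*_RW` tables are hypothetical names, and the request cycle of 38 is the cycle where RDY asserts in the LDA $2002 log when nothing is in the way.

```python
def rdy_assert_cycle(request_cycle, rw_by_cycle):
    """First cycle at or after request_cycle on which the CPU is reading.

    rw_by_cycle maps cycle -> 1 for a CPU read, 0 for a CPU write, matching
    the rw column of the logs. Cycles missing from the map are treated as
    reads (an assumption made to keep the tables short).
    """
    cycle = request_cycle
    while rw_by_cycle.get(cycle, 1) == 0:
        cycle += 1
    return cycle


# rw values read off the three logs in this thread.
LDA_RW = {38: 1}                        # LDA $2002: cycle 38 is a read
STA_RW = {38: 0, 39: 1}                 # STA $2002: cycle 38 is a write
IRQ_RW = {37: 0, 38: 0, 39: 0, 40: 1}   # IRQ: three stack writes, then vector fetch

REQUEST = 38  # assumed: the cycle the DMA 'controller' first wants the bus

lda_rdy = rdy_assert_cycle(REQUEST, LDA_RW)
sta_rdy = rdy_assert_cycle(REQUEST, STA_RW)
irq_rdy = rdy_assert_cycle(REQUEST, IRQ_RW)
```

With those tables the sketch lands on cycles 38, 39 and 40, which is where c_rdy goes to 0 in the LDA, STA and IRQ logs respectively.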
Posted: Wed Jun 20, 2012 10:06 pm
by cpow
I think the conclusion to draw is that the actual DMC DMA -- the memory read -- will always occur coincident with where a sprite DMA memory read cycle would occur if sprite DMA were also occurring. [Someone mentioned this possibility in another thread that I'm not finding...] That has been shown in the Visual2A03 program traces where I interrupted sprite DMA with one or two DMC DMAs: the DMC memory read occurs on the memory read cycle of the read/write sprite DMA beats. So whether a DMC DMA takes 3 or 4 cycles, when it is not also during sprite DMA, is most likely because of this synchronization, and not at all because of the read/write state of the CPU's external bus. That state, it seems, drives another hold-off circuit that prevents even the assertion of RDY.
Posted: Thu Jun 21, 2012 7:45 am
by cpow
If anyone else finds this interesting...it appears that DMC DMA always takes 3 cycles if the RDY occurs coincident with a CPU memory read [cases 1 and 2 below]. If the RDY would occur coincident with a CPU memory write, it is held off until the next CPU read cycle. If it's held off for one CPU write cycle, that results in a 4-cycle DMC DMA [case 3 below]. If it's held off for two CPU write cycles, that results in a 3-cycle DMC DMA [case 4 below]. In every case the DMA stays synchronized with the 2-cycle DMA read/write beat.
1. DMC started at cycle 36. 3-cycle DMC DMA (cycles 40-42) occurs. This is the "control" because there's nothing going on to hinder when the DMC DMA could occur -- no writes, no sprite DMA.
Code:
36 4015 40 0 0016 10 c0 00 bd nv‑Bdizc 1
36 4015 10 0 0016 10 c0 00 bd nv‑Bdizc 1
37 0016 ad 1 LDA Abs 0016 10 c0 00 bd nv‑Bdizc 1
37 0016 ad 1 LDA Abs 0016 10 c0 00 bd nv‑Bdizc 1
38 0017 02 1 0017 10 c0 00 bd nv‑Bdizc 1
38 0017 02 1 0017 10 c0 00 bd nv‑Bdizc 1
39 0018 20 1 0018 10 c0 00 bd nv‑Bdizc 1
39 0018 20 1 0018 10 c0 00 bd nv‑Bdizc 1
40 2002 00 1 0019 10 c0 00 bd nv‑Bdizc 1
40 2002 00 1 0019 10 c0 00 bd nv‑Bdizc 0
41 2002 00 1 0019 00 c0 00 bd nv‑BdiZc 0
41 2002 00 1 0019 00 c0 00 bd nv‑BdiZc 0
42 c000 e0 1 0019 00 c0 00 bd nv‑BdiZc 0
42 c000 e0 1 0019 00 c0 00 bd nv‑BdiZc 0
2. DMC started at cycle 37. 3-cycle DMC DMA (cycles 40-42) occurs. This is similar to the "control" case above except the DMC is started one cycle later. I believe this shows the 2-beat synchronization of the sprite/DMC DMA engines, with reads occurring on even cycles, writes on odd cycles [sprite DMA only].
Code:
37 4015 40 0 0017 10 c0 00 bd nv‑Bdizc 1
37 4015 10 0 0017 10 c0 00 bd nv‑Bdizc 1
38 0017 ad 1 LDA Abs 0017 10 c0 00 bd nv‑Bdizc 1
38 0017 ad 1 LDA Abs 0017 10 c0 00 bd nv‑Bdizc 1
39 0018 02 1 0018 10 c0 00 bd nv‑Bdizc 1
39 0018 02 1 0018 10 c0 00 bd nv‑Bdizc 1
40 0019 20 1 0019 10 c0 00 bd nv‑Bdizc 1
40 0019 20 1 0019 10 c0 00 bd nv‑Bdizc 0
41 0019 20 1 001a 10 c0 00 bd nv‑Bdizc 0
41 0019 20 1 001a 10 c0 00 bd nv‑Bdizc 0
42 c000 e0 1 001a 10 c0 00 bd nv‑Bdizc 0
42 c000 e0 1 001a 10 c0 00 bd nv‑Bdizc 0
3. DMC started at cycle 36. 4-cycle DMC DMA (cycles 41-44) immediately following a memory write (cycle 40) by the CPU. I believe this is a write that "gets in the way of" the DMA/RDY because it occurs in cycle 40, the cycle where RDY is asserted in the read cases above.
Code:
36 4015 40 0 0016 10 c0 00 bd nv‑Bdizc 1
36 4015 10 0 0016 10 c0 00 bd nv‑Bdizc 1
37 0016 95 1 STA zp,X 0016 10 c0 00 bd nv‑Bdizc 1
37 0016 95 1 STA zp,X 0016 10 c0 00 bd nv‑Bdizc 1
38 0017 81 1 0017 10 c0 00 bd nv‑Bdizc 1
38 0017 81 1 0017 10 c0 00 bd nv‑Bdizc 1
39 0081 00 1 0018 10 c0 00 bd nv‑Bdizc 1
39 0081 00 1 0018 10 c0 00 bd nv‑Bdizc 1
40 0041 00 0 0018 10 c0 00 bd nv‑Bdizc 1
40 0041 10 0 0018 10 c0 00 bd nv‑Bdizc 1
41 0018 10 1 BPL 0018 10 c0 00 bd nv‑Bdizc 1
41 0018 10 1 BPL 0018 10 c0 00 bd nv‑Bdizc 0
42 0018 10 1 BPL 0019 10 c0 00 bd nv‑Bdizc 0
42 0018 10 1 BPL 0019 10 c0 00 bd nv‑Bdizc 0
43 0018 10 1 BPL 0019 10 c0 00 bd nv‑Bdizc 0
43 0018 10 1 BPL 0019 10 c0 00 bd nv‑Bdizc 0
44 c000 e0 1 CPX # 0019 10 c0 00 bd nv‑Bdizc 0
44 c000 e0 1 CPX # 0019 10 c0 00 bd nv‑Bdizc 0
4. DMC started at cycle 36. 3-cycle DMC DMA (cycles 42-44) immediately following a dual memory write (cycles 40 and 41) by the CPU. These two writes delay RDY assertion but do not result in a 4-cycle DMA.
Code:
36 4015 40 0 0016 10 c0 00 bd nv‑Bdizc 1
36 4015 10 0 0016 10 c0 00 bd nv‑Bdizc 1
37 0016 c6 1 DEC zp 0016 10 c0 00 bd nv‑Bdizc 1
37 0016 c6 1 DEC zp 0016 10 c0 00 bd nv‑Bdizc 1
38 0017 80 1 0017 10 c0 00 bd nv‑Bdizc 1
38 0017 80 1 0017 10 c0 00 bd nv‑Bdizc 1
39 0080 00 1 0018 10 c0 00 bd nv‑Bdizc 1
39 0080 00 1 0018 10 c0 00 bd nv‑Bdizc 1
40 0080 00 0 0018 10 c0 00 bd nv‑Bdizc 1
40 0080 00 0 0018 10 c0 00 bd nv‑Bdizc 1
41 0080 55 0 0018 10 c0 00 bd Nv‑Bdizc 1
41 0080 ff 0 0018 10 c0 00 bd Nv‑Bdizc 1
42 0018 10 1 BPL 0018 10 c0 00 bd Nv‑Bdizc 1
42 0018 10 1 BPL 0018 10 c0 00 bd Nv‑Bdizc 0
43 0018 10 1 BPL 0019 10 c0 00 bd Nv‑Bdizc 0
43 0018 10 1 BPL 0019 10 c0 00 bd Nv‑Bdizc 0
44 c000 e0 1 CPX # 0019 10 c0 00 bd Nv‑Bdizc 0
44 c000 e0 1 CPX # 0019 10 c0 00 bd Nv‑Bdizc 0
Posted: Thu Jun 21, 2012 8:22 am
by Nessie
cpow wrote:If anyone else finds this interesting...
Just want to say I find this interesting but can't post any meaningful response because I'm having trouble following what's actually going on

Have to do some testing myself with blargg's framework before I'll be able to wrap my head around this.
Posted: Thu Jun 21, 2012 10:33 am
by cpow
Nessie wrote:cpow wrote:If anyone else finds this interesting...
Just want to say I find this interesting but can't post any meaningful response because I'm having trouble following what's actually going on

Have to do some testing myself with blargg's framework before I'll be able to wrap my head around this.
Sorry, I'll try to be more clear. I am just playing with Visual2A03 to see if I can completely characterize the actual behavior of the DMA 'controllers' within it. I realize there's little-to-no practical use for any of this, since most emulators are "accurate enough". I just find it interesting that we now have the tool(s) to check earlier assumptions. People might get offended by this as if I'm trying to prove them wrong--not the case!
First, to clarify what I mean when I say "DMC DMA": I mean the cycles where the 6502 is held off the bus by the DMA 'controller' within the 2A03 asserting RDY. This is distinct from the period where the DMA 'controller' itself holds off asserting RDY to the 6502 because it sees the 6502 is writing. I surmise that if the 6502 were to initiate a 15-cycle write [impossible], this 'pre RDY assertion' phase would last from when the DMA 'controller' wants the bus to the point where the 6502 goes back to reading [up to 15 cycles]. I've shown several Visual2A03 logs where, all else being held constant, an intervening 6502 write cycle is the only possible explanation for the delayed RDY.
As for how many cycles the "DMC DMA" takes: that, too, is variable, but not in the way we've been thinking. There definitely appears to be a DMA drumbeat that is shared between the DMA 'controllers' [not my idea...first posited here, I believe, in a reply from ReaperSMS].
The DMA drumbeat I'm observing is opposite of what ReaperSMS posited, though:
0 - read
1 - write
2 - read
3 - write
4 - read
...
RDY assertion is delayed if the CPU is writing. Period. If the DMA drumbeat is on a read cycle when the CPU is done writing, the DMC DMA will take three cycles. If the DMA drumbeat is on a write cycle when the CPU is done writing, the DMC DMA will take four cycles.
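Put as a Python sketch, with the assumption that even cycle numbers are the read beats (true for the cycle numbering in the logs in this thread, not something I can promise in general). `dmc_dma_cycles` is a hypothetical name, and its input is the cycle where RDY actually goes low, i.e. after any write hold-off.

```python
def dmc_dma_cycles(rdy_cycle):
    """Cycles with RDY low, ending on the DMA memory read.

    Assumes even-numbered cycles are read beats of the DMA drumbeat,
    as in the Visual2A03 logs in this thread.
    """
    on_read_beat = rdy_cycle % 2 == 0
    length = 3 if on_read_beat else 4
    return list(range(rdy_cycle, rdy_cycle + length))


# RDY-assert cycle -> (first, last) RDY-low cycle, read off the logs above.
OBSERVED = {
    38: (38, 40),  # LDA $2002
    39: (39, 42),  # STA $2002, one write in the way
    40: (40, 42),  # cases 1 and 2, and the IRQ vector fetch
    41: (41, 44),  # case 3, one write in the way
    42: (42, 44),  # case 4, two writes in the way
}

MISMATCHES = [
    rdy for rdy, (first, last) in OBSERVED.items()
    if (dmc_dma_cycles(rdy)[0], dmc_dma_cycles(rdy)[-1]) != (first, last)
]
```

`MISMATCHES` comes out empty for the logs posted so far, and in every case the last cycle [the DMA memory read] falls on an even cycle.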
Maybe I'm just a victim of misinterpretation -- definitely happens often enough to me. My interpretation of the DMC DMA 'idea' has always been that RDY is driven to the 6502 *then* a wait for up to three writes occurs. Since I've seen that the number of cycles where RDY is asserted varies from 3 to 4, with the last cycle being the memory read for the DMA, it cannot be the case that RDY assertion is holding off for "up to three CPU writes". That's done by the 'pre RDY assertion' phase.