
DMC DMA causing double-$2002 reads?

Posted: Wed Jun 20, 2012 2:59 pm
by cpow
I was just dorking around more with Visual2A03 [my favorite topic lately], trying to completely understand the DMC DMA and its impact. I recognize that there's lots of talk about how a DMC DMA can cause an extra controller read. I happened to be just randomly trying to find out how many cycles RDY is held in different situations, and created the following program.

The output of this surprised me a bit, and got me thinking...DMC DMA could maybe cause skipped frames.

Here's the relevant portion of the log:

Code:

cycle	ab	db	rw	Fetch	pc	a	x	y	s	p	c_rdy
34	4015	40	0		0015	10	c0	00	bd	nv‑Bdizc	1
34	4015	10	0		0015	10	c0	00	bd	nv‑Bdizc	1
35	0015	ad	1	LDA Abs	0015	10	c0	00	bd	nv‑Bdizc	1
35	0015	ad	1	LDA Abs	0015	10	c0	00	bd	nv‑Bdizc	1
36	0016	02	1		0016	10	c0	00	bd	nv‑Bdizc	1
36	0016	02	1		0016	10	c0	00	bd	nv‑Bdizc	1
37	0017	20	1		0017	10	c0	00	bd	nv‑Bdizc	1
37	0017	20	1		0017	10	c0	00	bd	nv‑Bdizc	1
38	2002	00	1		0018	10	c0	00	bd	nv‑Bdizc	1
38	2002	00	1		0018	10	c0	00	bd	nv‑Bdizc	0
39	2002	00	1		0018	00	c0	00	bd	nv‑BdiZc	0
39	2002	00	1		0018	00	c0	00	bd	nv‑BdiZc	0
40	c000	e0	1		0018	00	c0	00	bd	nv‑BdiZc	0
40	c000	e0	1		0018	00	c0	00	bd	nv‑BdiZc	0
41	2002	00	1		0018	e0	c0	00	bd	Nv‑Bdizc	1
41	2002	00	1		0018	e0	c0	00	bd	Nv‑Bdizc	1
42	0018	10	1	BPL 	0018	00	c0	00	bd	nv‑BdiZc	1
42	0018	10	1	BPL 	0018	00	c0	00	bd	nv‑BdiZc	1

At cycle 38, the read of $2002 has gone onto the bus and RDY is asserted (c_rdy goes to 0). Then in cycle 40 the DMC DMA fetch happens. Then in cycle 41, the read of $2002 goes back on the bus.

The pattern above is identical regardless of the address being read, obviously...so it makes me think that the above pattern is representative of the $4016/$4017 extra reads.
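
To make the "double read" easier to see, here is a minimal Python sketch that groups the cycles reading a given address into back-to-back runs. The tuples are copied by hand from the second (end-of-cycle) row of each cycle in the log above; the helper names `read_runs` and `halted_cycles` are made up for illustration, and treating back-to-back cycles as one run is only an assumption about how a register would see them.

```python
# (cycle, address bus, rw, c_rdy) taken from the end-of-cycle rows of the
# LDA $2002 log above. rw=1 is a read; c_rdy=0 means the CPU is being held.
TRACE = [
    (37, 0x0017, 1, 1),
    (38, 0x2002, 1, 0),
    (39, 0x2002, 1, 0),
    (40, 0xC000, 1, 0),
    (41, 0x2002, 1, 1),
    (42, 0x0018, 1, 1),
]


def read_runs(trace, address):
    """Group the cycles that read `address` into runs of consecutive cycles."""
    runs = []
    for cycle, ab, rw, _rdy in trace:
        if ab != address or rw != 1:
            continue
        if runs and runs[-1][-1] == cycle - 1:
            runs[-1].append(cycle)   # extends the current back-to-back run
        else:
            runs.append([cycle])     # a new, separate read of the address
    return runs


def halted_cycles(trace):
    """Cycles that end with c_rdy low."""
    return [cycle for cycle, _ab, _rw, rdy in trace if rdy == 0]


print(read_runs(TRACE, 0x2002))  # [[38, 39], [41]]
print(halted_cycles(TRACE))      # [38, 39, 40]
```

Two separate runs for $2002 is the extra read; swapping in $4016 or $4017 for the address would give the same shape.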

Posted: Wed Jun 20, 2012 3:10 pm
by Disch
AFAIK the DMC read can affect every register read, including $2002, $2007, $4015, etc.

With $2002 it doesn't really matter, because reading that to check for VBlank is already unreliable, even without the DMC as a factor.

Posted: Wed Jun 20, 2012 8:24 pm
by Drag
Out of curiosity, does the DMC affect writes to 2007, or just reads? I think I still don't quite understand what exactly is going on.

Posted: Wed Jun 20, 2012 8:35 pm
by Dwedit
I'm guessing no, otherwise you'd see glitches all over the place in any game that uses DMC.
Also that DMC takes one cycle less when it falls on a write.

Posted: Wed Jun 20, 2012 9:26 pm
by Disch
Writes are unaffected.

I swear blargg went over all this in a post. Let me see if I can find it.

EDIT:

Here: http://nesdev.com/bbs/viewtopic.php?p=33564#33564

Posted: Wed Jun 20, 2012 9:33 pm
by cpow
Dwedit wrote:I'm guessing no, otherwise you'd see glitches all over the place in any game that uses DMC.
Also that DMC takes one cycle less when it falls on a write.
I've been doing a lot of looking at writes...and never bothered to try to explain what I'm seeing. Here goes.

Here's a program that does everything exactly the same as the previous program in this thread except the LDA $2002 is now a STA $2002. [I know...I know...that's an invalid NES thing to do but this isn't a NES it's just a 2A03.]

Relevant portion of the log:

Code:

34	4015	40	0		0015	10	c0	00	bd	nv‑Bdizc	1
34	4015	10	0		0015	10	c0	00	bd	nv‑Bdizc	1
35	0015	8d	1	STA Abs	0015	10	c0	00	bd	nv‑Bdizc	1
35	0015	8d	1	STA Abs	0015	10	c0	00	bd	nv‑Bdizc	1
36	0016	02	1		0016	10	c0	00	bd	nv‑Bdizc	1
36	0016	02	1		0016	10	c0	00	bd	nv‑Bdizc	1
37	0017	20	1		0017	10	c0	00	bd	nv‑Bdizc	1
37	0017	20	1		0017	10	c0	00	bd	nv‑Bdizc	1
38	2002	20	0		0018	10	c0	00	bd	nv‑Bdizc	1
38	2002	10	0		0018	10	c0	00	bd	nv‑Bdizc	1
39	0018	10	1	BPL 	0018	10	c0	00	bd	nv‑Bdizc	1
39	0018	10	1	BPL 	0018	10	c0	00	bd	nv‑Bdizc	0
40	0018	10	1	BPL 	0019	10	c0	00	bd	nv‑Bdizc	0
40	0018	10	1	BPL 	0019	10	c0	00	bd	nv‑Bdizc	0
41	0018	10	1	BPL 	0019	10	c0	00	bd	nv‑Bdizc	0
41	0018	10	1	BPL 	0019	10	c0	00	bd	nv‑Bdizc	0
42	c000	e0	1	CPX #	0019	10	c0	00	bd	nv‑Bdizc	0
42	c000	e0	1	CPX #	0019	10	c0	00	bd	nv‑Bdizc	0
43	0018	10	1	BPL 	0019	10	c0	00	bd	nv‑Bdizc	1
43	0018	10	1	BPL 	0019	10	c0	00	bd	nv‑Bdizc	1
44	0019	fe	1		0019	10	c0	00	bd	nv‑Bdizc	1
44	0019	fe	1		0019	10	c0	00	bd	nv‑Bdizc	1

Not only does the DMC DMA not interrupt the write, it is *stalled* until after the write. In the previous example using LDA $2002, the DMC DMA occupies cycles 38 through 40. In the STA $2002 case above, RDY is not pulled low in cycle 38 as it was in the LDA $2002 case. Cycle 38 is a write. RDY is pulled low in cycle 39, which is a read.

Further, this DMA takes four cycles in total [counting from the assertion of RDY in 39 to the completion of the DMA in 42]. This, too, is contrary to the stated "DMC DMA takes one cycle less on a write". In the LDA $2002 example the DMA starts in cycle 38 and completes in cycle 40 -- three cycles. DMC DMA appears to take one cycle less on a read.

I've seen similar results when trying to cause a DMC DMA during an interrupt sequence. If the DMA would otherwise occur during one of the stack-write cycles, it is delayed until those write cycles complete, then RDY is pulled low during the vector fetch. More to come...

Posted: Wed Jun 20, 2012 9:53 pm
by cpow
cpow wrote:More to come...
This program shows a DMC DMA occurring during the vector fetch phase of an IRQ.

Relevant log parts:

Code:

37	01bd	10	0		0015	10	c0	00	bd	nv‑bdizc	1	0
37	01bd	00	0		0015	10	c0	00	bd	nv‑bdizc	1	0
38	01bc	15	0		0015	10	c0	00	bd	nv‑bdizc	1	0
38	01bc	15	0		0015	10	c0	00	bd	nv‑bdizc	1	0
39	01bb	15	0		0015	10	c0	00	bd	nv‑bdizc	1	0
39	01bb	20	0		0015	10	c0	00	bd	nv‑bdizc	1	0
40	fffe	40	1		0015	10	c0	00	ba	nv‑bdizc	1	0
40	fffe	40	1		0015	10	c0	00	ba	nv‑bdizc	0	0
41	fffe	40	1		0015	10	c0	00	ba	nv‑bdizc	0	0
41	fffe	40	1		0015	10	c0	00	ba	nv‑bdizc	0	0
42	c000	e0	1		0015	10	c0	00	ba	nv‑bdizc	0	0
42	c000	e0	1		0015	10	c0	00	ba	nv‑bdizc	0	0
43	fffe	40	1		0015	10	c0	00	ba	nv‑bdizc	1	0
43	fffe	40	1		0015	10	c0	00	ba	nv‑Bdizc	1	0
44	ffff	00	1		0015	10	c0	00	ba	nv‑BdIzc	1	0
44	ffff	00	1		0015	10	c0	00	ba	nv‑BdIzc	1	0
Here the RDY assertion is delayed again. Following the LDA $2002 example I would expect RDY to assert in cycle 38 [nothing about the DMC setup has changed, DMC DMA occurs between cycles 38-40 if 38 is a read cycle]. Instead, it's as if the 2A03 has additional logic that does its own "RDY processing" -- meaning it won't even bother the CPU during a write cycle because it knows the CPU won't halt, so it waits until the next read cycle, then asserts RDY and starts the 3-cycle DMA.
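
The hold-off can be written down as a tiny Python sketch. This is only a model of what the three logs so far show, not something read out of the netlist: `rdy_assert_cycle` is a made-up name, and the request cycle of 38 is an assumption based on the DMC setup being identical in all three programs.

```python
def rdy_assert_cycle(request_cycle, write_cycles):
    """First cycle at or after request_cycle on which the CPU is not writing.

    request_cycle: cycle where the DMA logic first wants the bus (assumed).
    write_cycles:  set of cycle numbers on which the CPU drives a write.
    """
    cycle = request_cycle
    while cycle in write_cycles:
        cycle += 1  # RDY assertion is held off for every CPU write cycle
    return cycle


print(rdy_assert_cycle(38, set()))         # LDA $2002 log: 38
print(rdy_assert_cycle(38, {38}))          # STA $2002 log: 39
print(rdy_assert_cycle(38, {37, 38, 39}))  # IRQ stack pushes: 40
```

The write cycles passed in are the rw=0 cycles from each log.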

Posted: Wed Jun 20, 2012 10:06 pm
by cpow
I think the conclusion to draw is that the actual DMC DMA -- the memory read -- will always occur coincident with where a sprite DMA memory read cycle would occur if sprite DMA were also occurring. [Someone mentioned this possibility in another thread that I'm not finding...] That has been shown in the Visual2A03 program traces where I interrupted sprite DMA with one or two DMC DMAs. The memory read cycle of the read/write sprite DMA beats is where the DMC memory read occurs. So whether a DMC DMA takes 3 or 4 cycles when it is not also during sprite DMA is most likely down to this synchronization, not at all to the read/write state of the CPU's external bus. That state, it seems, drives another hold-off circuit that prevents even the assertion of RDY.

Posted: Thu Jun 21, 2012 7:45 am
by cpow
If anyone else finds this interesting...it appears that DMC DMA always takes 3 cycles if the RDY occurs coincident with a CPU memory read [cases 1 and 2 below]. If the RDY would occur coincident with a CPU memory write, it is held off until the next CPU read cycle. If it's held off for one CPU write cycle, that results in a 4-cycle DMC DMA [case 3 below]. If it's held off for two CPU write cycles, that results in a 3-cycle DMC DMA [case 4 below]. Either way, synchronization with the 2-cycle DMA read/write beat is maintained.

1. DMC started at cycle 36. 3-cycle DMC DMA (cycles 40-42) occurs. This is the "control" because there's nothing going on to hinder when the DMC DMA could occur -- no writes, no sprite DMA.

Code:

36	4015	40	0		0016	10	c0	00	bd	nv‑Bdizc	1
36	4015	10	0		0016	10	c0	00	bd	nv‑Bdizc	1
37	0016	ad	1	LDA Abs	0016	10	c0	00	bd	nv‑Bdizc	1
37	0016	ad	1	LDA Abs	0016	10	c0	00	bd	nv‑Bdizc	1
38	0017	02	1		0017	10	c0	00	bd	nv‑Bdizc	1
38	0017	02	1		0017	10	c0	00	bd	nv‑Bdizc	1
39	0018	20	1		0018	10	c0	00	bd	nv‑Bdizc	1
39	0018	20	1		0018	10	c0	00	bd	nv‑Bdizc	1
40	2002	00	1		0019	10	c0	00	bd	nv‑Bdizc	1
40	2002	00	1		0019	10	c0	00	bd	nv‑Bdizc	0
41	2002	00	1		0019	00	c0	00	bd	nv‑BdiZc	0
41	2002	00	1		0019	00	c0	00	bd	nv‑BdiZc	0
42	c000	e0	1		0019	00	c0	00	bd	nv‑BdiZc	0
42	c000	e0	1		0019	00	c0	00	bd	nv‑BdiZc	0

2. DMC started at cycle 37. 3-cycle DMC DMA (cycles 40-42) occurs. This is similar to the "control" case above except the DMC is started one cycle later. I believe this shows the 2-beat synchronization of the sprite/DMC DMA engines, with reads occurring on even cycles, writes on odd cycles [sprite DMA only].

Code:

37	4015	40	0		0017	10	c0	00	bd	nv‑Bdizc	1
37	4015	10	0		0017	10	c0	00	bd	nv‑Bdizc	1
38	0017	ad	1	LDA Abs	0017	10	c0	00	bd	nv‑Bdizc	1
38	0017	ad	1	LDA Abs	0017	10	c0	00	bd	nv‑Bdizc	1
39	0018	02	1		0018	10	c0	00	bd	nv‑Bdizc	1
39	0018	02	1		0018	10	c0	00	bd	nv‑Bdizc	1
40	0019	20	1		0019	10	c0	00	bd	nv‑Bdizc	1
40	0019	20	1		0019	10	c0	00	bd	nv‑Bdizc	0
41	0019	20	1		001a	10	c0	00	bd	nv‑Bdizc	0
41	0019	20	1		001a	10	c0	00	bd	nv‑Bdizc	0
42	c000	e0	1		001a	10	c0	00	bd	nv‑Bdizc	0
42	c000	e0	1		001a	10	c0	00	bd	nv‑Bdizc	0

3. DMC started at cycle 36. 4-cycle DMC DMA (cycles 41-44) immediately following a memory write (cycle 40) by the CPU. I believe this is a write that "gets in the way of" the DMA/RDY because it occurs in the cycle (40) where RDY is asserted in cases 1 and 2 above.

Code:

36	4015	40	0		0016	10	c0	00	bd	nv‑Bdizc	1
36	4015	10	0		0016	10	c0	00	bd	nv‑Bdizc	1
37	0016	95	1	STA zp,X	0016	10	c0	00	bd	nv‑Bdizc	1
37	0016	95	1	STA zp,X	0016	10	c0	00	bd	nv‑Bdizc	1
38	0017	81	1		0017	10	c0	00	bd	nv‑Bdizc	1
38	0017	81	1		0017	10	c0	00	bd	nv‑Bdizc	1
39	0081	00	1		0018	10	c0	00	bd	nv‑Bdizc	1
39	0081	00	1		0018	10	c0	00	bd	nv‑Bdizc	1
40	0041	00	0		0018	10	c0	00	bd	nv‑Bdizc	1
40	0041	10	0		0018	10	c0	00	bd	nv‑Bdizc	1
41	0018	10	1	BPL 	0018	10	c0	00	bd	nv‑Bdizc	1
41	0018	10	1	BPL 	0018	10	c0	00	bd	nv‑Bdizc	0
42	0018	10	1	BPL 	0019	10	c0	00	bd	nv‑Bdizc	0
42	0018	10	1	BPL 	0019	10	c0	00	bd	nv‑Bdizc	0
43	0018	10	1	BPL 	0019	10	c0	00	bd	nv‑Bdizc	0
43	0018	10	1	BPL 	0019	10	c0	00	bd	nv‑Bdizc	0
44	c000	e0	1	CPX #	0019	10	c0	00	bd	nv‑Bdizc	0
44	c000	e0	1	CPX #	0019	10	c0	00	bd	nv‑Bdizc	0

4. DMC started at cycle 36. 3-cycle DMC DMA (cycles 42-44) immediately following a dual memory write (cycles 40 and 41) by the CPU. These two writes delay RDY assertion but do not result in a 4-cycle DMA.

Code:

36	4015	40	0		0016	10	c0	00	bd	nv‑Bdizc	1
36	4015	10	0		0016	10	c0	00	bd	nv‑Bdizc	1
37	0016	c6	1	DEC zp	0016	10	c0	00	bd	nv‑Bdizc	1
37	0016	c6	1	DEC zp	0016	10	c0	00	bd	nv‑Bdizc	1
38	0017	80	1		0017	10	c0	00	bd	nv‑Bdizc	1
38	0017	80	1		0017	10	c0	00	bd	nv‑Bdizc	1
39	0080	00	1		0018	10	c0	00	bd	nv‑Bdizc	1
39	0080	00	1		0018	10	c0	00	bd	nv‑Bdizc	1
40	0080	00	0		0018	10	c0	00	bd	nv‑Bdizc	1
40	0080	00	0		0018	10	c0	00	bd	nv‑Bdizc	1
41	0080	55	0		0018	10	c0	00	bd	Nv‑Bdizc	1
41	0080	ff	0		0018	10	c0	00	bd	Nv‑Bdizc	1
42	0018	10	1	BPL 	0018	10	c0	00	bd	Nv‑Bdizc	1
42	0018	10	1	BPL 	0018	10	c0	00	bd	Nv‑Bdizc	0
43	0018	10	1	BPL 	0019	10	c0	00	bd	Nv‑Bdizc	0
43	0018	10	1	BPL 	0019	10	c0	00	bd	Nv‑Bdizc	0
44	c000	e0	1	CPX #	0019	10	c0	00	bd	Nv‑Bdizc	0
44	c000	e0	1	CPX #	0019	10	c0	00	bd	Nv‑Bdizc	0
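
The four cases can be summed up in a short Python sketch. It assumes the DMA fetch always lands on the first even cycle that is at least two cycles after RDY goes low; "even" here refers only to the cycle numbering in these Visual2A03 logs, and `dmc_dma_span` is a made-up name. It reproduces the logs above, nothing more.

```python
def dmc_dma_span(rdy_cycle):
    """Return (first_halted_cycle, fetch_cycle) for one DMC DMA.

    Assumption: the fetch lands on the first even cycle that is at least
    two cycles after the cycle in which RDY goes low.
    """
    fetch = rdy_cycle + 2
    if fetch % 2:
        fetch += 1  # wait one more cycle to land on the read beat
    return rdy_cycle, fetch


# Cycle in which c_rdy goes low in each of the four cases above.
CASES = {"case 1": 40, "case 2": 40, "case 3": 41, "case 4": 42}

for name, rdy in CASES.items():
    start, fetch = dmc_dma_span(rdy)
    print(name, start, fetch, fetch - start + 1)
# case 1 40 42 3
# case 2 40 42 3
# case 3 41 44 4
# case 4 42 44 3
```

The last number on each line is the DMA length in cycles, counting from RDY going low through the fetch.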

Posted: Thu Jun 21, 2012 8:22 am
by Nessie
cpow wrote:If anyone else finds this interesting...
Just want to say I find this interesting but can't post any meaningful response because I'm having trouble following what's actually going on :)
Have to do some testing myself with blargg's framework before I'll be able to wrap my head around this.

Posted: Thu Jun 21, 2012 10:33 am
by cpow
Nessie wrote:
cpow wrote:If anyone else finds this interesting...
Just want to say I find this interesting but can't post any meaningful response because I'm having trouble following what's actually going on :)
Have to do some testing myself with blargg's framework before I'll be able to wrap my head around this.
Sorry, I'll try to be more clear. I am just playing with Visual2A03 to see if I can completely characterize the actual behavior of the DMA 'controllers' within it. I realize there's little-to-no practical use for any of this, since most emulators are "accurate enough". I just find it interesting that we now have the tool(s) to check earlier assumptions. People might get offended by this as if I'm trying to prove them wrong--not the case!

First, to clarify what I mean when I say "DMC DMA". I mean the cycles where the 6502 is held off the bus by the DMA 'controller' within the 2A03 asserting RDY. This is distinct from the period where the DMA 'controller' itself holds off asserting RDY to the 6502 because it sees the 6502 is writing. I surmise that if the 6502 were to initiate a 15-cycle write [impossible], this 'pre RDY assertion' phase would last from when the DMA 'controller' wants the bus to the point where the 6502 goes back to reading [up to 15 cycles]. I've shown several Visual2A03 logs where, all else being held constant, an intervening 6502 write cycle is the only possible explanation for the delayed RDY.

As for how many cycles the "DMC DMA" takes: that, too, is variable, but not in the way we've been thinking. There definitely appears to be a DMA drumbeat that is shared between the DMA 'controllers' [not my idea...first posited here, I believe, in a reply from ReaperSMS].

The DMA drumbeat I'm observing is opposite of what ReaperSMS posited, though:

0 - read
1 - write
2 - read
3 - write
4 - read
...

RDY assertion is delayed if the CPU is writing. Period. If the DMA drumbeat is on a read cycle when the CPU is done writing, the DMC DMA will take three cycles. If the DMA drumbeat is on a write cycle when the CPU is done writing, the DMC DMA will take four cycles.

Maybe I'm just a victim of misinterpretation -- definitely happens often enough to me. My interpretation of the DMC DMA 'idea' has always been that RDY is driven to the 6502 *then* a wait for up to three writes occurs. Since I've seen that the number of cycles where RDY is asserted varies from 3 to 4, with the last cycle being the memory read for the DMA, it cannot be the case that the wait for "up to three CPU writes" happens while RDY is asserted. That wait is done by the 'pre RDY assertion' phase.