DPCM + $2007 reads

Post by tokumaru »

IIRC, $2007 reads are susceptible to DPCM conflicts just like $4016 and $4017, but I don't remember ever reading anything about getting around this particular variation of the DPCM bug. What would you guys say is the best way to avoid glitched $2007 reads? Simply read small portions of data at a time, retrieving each portion multiple times until the same data is read twice in a row?

Post by Fiskbit »

I think it depends on how much data you need to read. I'm a proponent of the synced read approach where you use OAM DMA to guarantee an alignment with the DMA cadence so you can time your reads to land on safe (get) cycles. This approach is great for short functions. However, it has significant caveats if your synced code is longer than the shortest DMC DMA period used by your software (at the fastest rate, 432 cycles on NTSC and 400 on PAL). Specifically, long functions have to:

- align consecutive write cycles so the first following cycle is a put cycle. So, odd-cycle-length writes (single writes and interrupt stack pushes) need to land on get cycles and even-cycle-length writes (in RMW instructions) need to land on put cycles.

- handle the case where DMC DMA lands on the last or 3rd-to-last cycle of OAM DMA, causing the DMC DMA to take an odd number of cycles. This inverts which cycles are safe and unsafe, but because it's caused by a very specific DMC DMA timing, it's only a problem once every DMC DMA period. Because the DMA halted 4 cycles before the synced code, the first occurrence is at the DMC DMA period minus 4 cycles after the synced code begins (so on the 428th cycle, for example). If you make sure the access on that cycle isn't to a register with read side effects, then you're good until the DMA happens again one period later, and if this cycle is an odd-length write, then it resyncs the following code so that this is no longer a problem.
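
The cycle budget described above can be written down as assemble-time constants, so a hand-counted total gets checked whenever the code changes. This is only a sketch in ca65 syntax: all constant names are made up for illustration, `kSyncedRegionCycles` is an example value you would count yourself, and 54 and 50 are the fastest DMC bit periods (rate $F) on NTSC and PAL, which give the 432 and 400 figures when multiplied by 8 bits per sample byte.

```asm
; Fastest DMC rate ($F): CPU cycles per output bit.
kDmcFastestBitPeriodNtsc = 54
kDmcFastestBitPeriodPal  = 50

; The DMC fetches one sample byte (one DMA) per 8 output bits.
kDmcMinDmaPeriodNtsc = kDmcFastestBitPeriodNtsc * 8    ; 432
kDmcMinDmaPeriodPal  = kDmcFastestBitPeriodPal * 8     ; 400

; A desyncing DMA halts 4 cycles before the synced code begins, so the
; first affected cycle comes one period minus 4 cycles into that code.
kLongFunctionThresholdNtsc = kDmcMinDmaPeriodNtsc - 4  ; 428
kLongFunctionThresholdPal  = kDmcMinDmaPeriodPal - 4   ; 396

; Example value only: cycles from the first synced cycle to the last
; read with side effects, counted by hand for your own routine.
kSyncedRegionCycles = 375

; Fail the build if the routine has grown into a "long function".
.assert kSyncedRegionCycles < kLongFunctionThresholdPal, error, "synced region too long: long-function caveats apply"
```

If your software never uses the fastest DMC rate, the thresholds would be correspondingly larger; the values here assume the worst case.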

Note that synced reads don't have universal emulator support. Many emulators don't support these extra reads at all, so those are fine. For those that do, DMA timing often has problems that may cause synced reads to fail. However, when using synced reads to work around joypad bit deletion, Nestopia is the only emulator I'm aware of that gets the timing wrong enough to break synced reads.

Post by Dwedit »

Doesn't doing a sprite DMA transfer force a particular CPU alignment? Does that mean it would be possible to pick the number of CPU cycles after a DMA to not get a bad alignment, and avoid the corruption?

Post by tokumaru »

While the idea of using OAM DMA is an interesting one, it looks like there are too many rules to obey and details to keep track of, and if you don't fully understand all of it (I certainly don't), it's easy to screw up, especially since emulation of this issue doesn't seem particularly accurate.

I'm almost exclusively a software guy, meaning I'm in the logic and algorithms business, so I'd personally feel much safer using a solution that detects the problem and corrects it and doesn't require so much setup and precise timing, as opposed to a solution that cleverly abuses obscure hardware quirks to avoid the problem in the first place, but is much more intricate to put together and sensitive to edits (if you ever need to change even a minute detail in the code, you'll have to count all the cycles again, validate the whole thing).

Speed is not an issue for me, so I don't mind reading data over and over. My specific situation is that I want to pause the gameplay, display in-game menus and then resume the gameplay. Since the menus will overwrite gameplay graphics in the name tables, I want to back up the original name table data so it can be restored later. Normally I'd just recreate the original name tables from scratch based on the level map (like when the level first starts), but there are some special effects that can modify the name tables on the fly, and those changes would be lost. I do have 1KB of RAM that I can temporarily use, so storage space isn't a problem either.

Post by gravelstudios »

Would it be unacceptable to just disable the DMC channel for the time it takes to copy the data? That seems like the simplest solution to me.
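
A minimal sketch of that first option, in ca65 syntax. It assumes the sound engine will not trigger a new sample while the copy is running, and that it sets bit 4 of $4015 again the next time it starts a sample; `APU_STATUS` is just a name chosen here for $4015.

```asm
APU_STATUS := $4015

  ; Clearing bit 4 of $4015 sets the DMC's remaining byte count to 0,
  ; so no further sample fetches (and therefore no DMC DMA) are started.
  ; The other four bits are written as 1 to leave pulse 1, pulse 2,
  ; triangle, and noise enabled.
  LDA #%00001111
  STA APU_STATUS

  ; ... read the data through $2007 here ...

  ; Nothing is re-enabled at this point: a sample that was cut off is
  ; not resumed, and the next sample plays when the sound engine (assumed)
  ; writes $4015 with bit 4 set.
```

The cost is that any sample playing at that moment is cut short.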

If you really don't want to disable the DMC channel, then what I'd do is read all the data through $2007 that you want to preserve, then read it a second time and compare each byte to see if it matches. If you find two bytes that don't match, read from that address until you get two in a row that do match. It might take a little time, but if it's during a menu transition, that shouldn't be a very big deal.

Post by lidnariq »

DPCM glitches cause byte deletions, not incorrect bytes, so it's not going to be a matter of going back and fixing the single errors - it's going to be a matter of restarting where you detected the error.

Post by gravelstudios »

lidnariq wrote: Sun Mar 12, 2023 11:12 am DPCM glitches cause byte deletions, not incorrect bytes, so it's not going to be a matter of going back and fixing the single errors - it's going to be a matter of restarting where you detected the error.
Ah, I see. Thanks for the clarification. Other than being a lot slower, would there be an issue with reading each byte in a loop until you get two identical reads in a row and then moving to the next byte? To do that, you couldn't rely on the PPU's internal address increment, you'd have to reset the address yourself each time you read $2007. But if you absolutely had to get the data from VRAM without disabling the DMC channel, would it work? I'm tempted to try it and see...

Post by Dwedit »

First question is what are you reading? Video RAM? Some piece of binary data stored in CHR-ROM?

Usually you shouldn't need to read data out of video ram. You could keep a complete copy of the attribute table in RAM, and won't need to read it out.

For binary data stored in CHR-ROM, you could just disable DMC and read it out.

Post by tepples »

tokumaru wrote: "I want to pause the gameplay, display in-game menus and then resume the gameplay. Since the menus will overwrite gameplay graphics in the name tables, I want to back up the original name table data so it can be restored later."

A similar situation appears in Shiru's LAN Master at the end of a round. The game saves the portion of the nametable over which a success message is about to be displayed, displays a success message, and then redraws the saved portion. However, if this happens while a drum hit is being played, the saved portion is sometimes corrupt.

Post by lidnariq »

gravelstudios wrote: Sun Mar 12, 2023 12:26 pm reading each byte in a loop until you get two identical reads in a row and then moving to the next byte? To do that, you couldn't rely on the PPU's internal address increment, you'd have to reset the address yourself each time you read $2007. But if you absolutely had to get the data from VRAM without disabling the DMC channel, would it work? I'm tempted to try it and see...
Yeah, I think it should work. Even if the two reads have the same contents, a false negative should still be valid data. As long as the read loop is fast enough only one of them could have been due to a DPCM deletion, anyway.
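
The byte-at-a-time idea might look roughly like this in ca65 syntax. It is a sketch under several assumptions, not tested code from this thread: `vram_addr` and `prev_read` are hypothetical zero-page variables, the address is below $3F00 (so reads go through the PPU's read buffer), the routine runs during vblank or with rendering off, and nothing else touches the PPU registers in the meantime.

```asm
PPU_STATUS  := $2002
PPU_ADDRESS := $2006
PPU_DATA    := $2007

.zeropage
vram_addr: .res 2   ; hypothetical: VRAM address to read (low byte first)
prev_read: .res 1   ; hypothetical: result of the previous attempt

.code
; Returns in A the byte at vram_addr, re-reading until two
; consecutive attempts agree.
ReadVramByteChecked:
  JSR ReadVramByteOnce
 :
  STA prev_read
  JSR ReadVramByteOnce
  CMP prev_read
  BNE :-              ; mismatch: keep the newer value and try again
  RTS

; One attempt: set the address from scratch, then read.
ReadVramByteOnce:
  BIT PPU_STATUS      ; reset the $2006 write latch
  LDA vram_addr+1
  STA PPU_ADDRESS
  LDA vram_addr+0
  STA PPU_ADDRESS
  LDA PPU_DATA        ; dummy read: loads the read buffer
  LDA PPU_DATA        ; returns the buffered byte
  RTS
```

Each attempt takes only a few dozen cycles, so two consecutive attempts fit well inside even the shortest DMC DMA period, which is the condition for at most one of them being hit. The caller is responsible for incrementing `vram_addr` between bytes.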

Post by Fiskbit »

Reading each byte 2 or 3 times to get the correct value, with PPU address writes in between to get back to it and extra reads to refill the read buffer, is so much slower than the synced read approach that you could do batches of OAM DMA followed by 32 synced reads and come out ahead without having to worry about any of the difficult caveats of long functions. You'd be able to fit two of these in each vblank and would only have to concern yourself with making sure the reads land on get cycles (the same parity as the first cycle after OAM DMA). I've written some code below that should be able to do this.

Without synced reads, you might be able to find clever ways to make this work faster, but you'll end up with 3-byte holes in your buffer that you have to deal with, which sounds very messy or slow to me. And I don't know if you can assume the holes are 3 bytes; I wouldn't be surprised if there are emulators that emulate this glitch and only do 1 extra read. Official hardware should always be 3, though. Overall, I'm skeptical repeated reads can be done quickly here.

Code:

; ca65 syntax. AXS is an unofficial opcode, so this needs .setcpu "6502X".
; target_address, oam_buffer, and buffer are assumed to be defined elsewhere.
; buffer must not be in zero page, so that STA buffer+n,X takes 5 cycles.
PPU_ADDRESS := $2006
PPU_DATA := $2007
OAM_DMA := $4014
PPU_DATA_CROSSPAGE := $1FFF
kPpuDataCrosspageIndex = PPU_DATA - PPU_DATA_CROSSPAGE

ReadPpuRow:
  ; Start at the beginning of the buffer (assumed; done before the sync, so it doesn't affect timing).
  LDX #$00

  ; Load our index for cross-page reads to help with sync timing.
  ; $1FFF,Y crosses a page, so the read takes 5 cycles and its dummy read lands in the RAM mirror at $1F07.
  LDY #kPpuDataCrosspageIndex

  ; Write the address.
  LDA target_address+1
  STA PPU_ADDRESS
  LDA target_address+0
  STA PPU_ADDRESS

  ; Sync with OAM DMA.
  LDA #>oam_buffer
  STA OAM_DMA
  ; Code is now synced; for functions shorter than the shortest DMC DMA period, the first cycle here can be assumed to be a get.
  ; This code has a synced region (from first cycle to last $2007 read cycle) of 375 cycles,
  ; which is less than the 428 (432 minus 4 from a desyncing DMA) threshold for long functions.

  ; Fill the read buffer.
  LDA PPU_DATA_CROSSPAGE,Y  ; get put get put GET

 :
  ; Do 4 reads. The $2007 read cycle must land on a get.
  LDA PPU_DATA              ; put get put GET
  STA buffer+0,X            ; put get put get put
  LDA PPU_DATA_CROSSPAGE,Y  ; get put get put GET
  STA buffer+1,X            ; put get put get put
  LDA PPU_DATA_CROSSPAGE,Y  ; get put get put GET
  STA buffer+2,X            ; put get put get put
  LDA PPU_DATA_CROSSPAGE,Y  ; get put get put GET
  STA buffer+3,X            ; put get put get put

  ; Add 4 to X. AXS computes X = (A AND X) - immediate, so subtract -4 ($FC).
  LDA #$FF  ; get put
  AXS #$FC  ; get put

  ; Only read 32 bytes. This branch MUST NOT cross a page boundary.
  CPX #$20  ; get put
  BCC :-    ; get put get

  ; ...
This function should take about 926 cycles to safely read 32 bytes, leaving enough time in vblank to run it (along with its required setup) twice and still handle other minor vblank tasks. I would skip reading joypads when doing this, because there isn't time left in vblank to sync with OAM DMA for reading joypads, and the traditional repeated-read workaround for joypads causes bus conflicts that corrupt the sample value that is loaded to the DMC unit. Synced reads for joypads are better on hardware.

Post by tokumaru »

gravelstudios wrote: Sun Mar 12, 2023 12:26 pm Other than being a lot slower, would there be an issue with reading each byte in a loop until you get two identical reads in a row and then moving to the next byte?
I hadn't thought of going as small as 1 byte at a time, but I guess it doesn't get any simpler than this. Thanks for the idea.
Fiskbit wrote: Sun Mar 12, 2023 4:34 pm Reading each byte 2 or 3 times to get the correct value, with PPU address writes in between to get back to it and extra reads to refill the read buffer, is so much slower than the synced read approach
I know. I'm not worried about performance in this particular case, though. I did the math and even if I have to read every byte 4 times (which will absolutely not happen), it'd take less than 4 frames to read all 1024 bytes (if I'm not doing anything else). That's more than acceptable for me.
Fiskbit wrote: I've written some code below that should be able to do this.
Thanks. I will take this into consideration for future projects where performance is a bigger concern, but this time I'll probably stick with the simple/slow solution.

Post by TakuikaNinja »

Oh, so that's why EarthBound Zero (MOTHER) disables the DPCM drums while textboxes are active. It never really clicked until now.

Post by Fiskbit »

I see, I was thinking that your pause menu was overlaying the screen such that you'd want to keep rendering on while reading. If you can turn rendering off, the performance difference is not so impactful.

Edit: That said, I still recommend synced reads for joypads to avoid the bus conflict and resulting corrupted sample byte that occurs whenever DMA collides with a joypad read and because synced reads are faster, though I recognize it does reduce emulator compatibility a little bit.
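
For reference, a synced joypad read for one controller could be sketched as follows in ca65 syntax. This follows the widely used OAM-DMA-synced pattern rather than any code posted in this thread; `buttons` and the $0200 location of `oam_buffer` are assumptions for illustration, the routine is assumed to replace the usual OAM DMA in the NMI handler, and the cycle comments assume the first cycle after OAM DMA is a get.

```asm
OAM_DMA := $4014
JOYPAD1 := $4016

oam_buffer := $0200   ; hypothetical page-aligned OAM shadow buffer

.zeropage
buttons: .res 1       ; hypothetical; must be in zero page for the timing below

.code
ReadJoypadSynced:
  LDA #>oam_buffer
  STA OAM_DMA         ; ------ OAM DMA ------
  LDX #$01            ; get put
  STX buttons         ; get put get       (bit 0 acts as an end marker)
  STX JOYPAD1         ; put get put get   (strobe high)
  DEX                 ; put get
  STX JOYPAD1         ; put get put get   (strobe low)
 :
  LDA JOYPAD1         ; put get put GET   (the read lands on a get cycle)
  AND #%00000011      ; put get
  CMP #$01            ; put get           (carry = button bit)
  ROL buttons         ; put get put get put
  BCC :-              ; get put get       (must not cross a page boundary)
  RTS
```

The loop body takes 16 cycles when the branch is taken, an even number, so every $4016 read keeps the same parity. After 8 iterations the marker bit shifts out into carry and the loop ends with the button states in `buttons`.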

Post by Drag »

TakuikaNinja wrote: Sun Mar 12, 2023 8:01 pm Oh, so that's why EarthBound Zero (MOTHER) disables the DPCM drums while textboxes are active. It never really clicked until now.
Snake's Revenge does it too, and indeed, it makes sense now. :P