question about DMA registers

nocash
Posts: 1405
Joined: Fri Feb 24, 2012 12:09 pm

Re: question about DMA registers

Post by nocash »

I am having two problems with HDMAs started mid-frame. Both are probably related to the "do_transfer" flag (as it is called in the anomie docs).

The first problem is that Super Ghouls & Ghosts isn't working in the current no$sns version. The anomie doc isn't too clear about if/when/how the do_transfer flags are initialized for HDMAs that are started mid-frame.
I got the game working by initializing the "do_transfer" flags of all disabled channels to zero in line 0.
That way the game does work, but I am not sure if it's a correct reproduction of what the hardware does(?)

I've also been writing a test program that sets up two HDMA channels, both with the following values:
420Ch=00h ;stop all HDMA channels
43x0h=02h ;transfer two bytes to [bbus+0], and [bbus+0]
43x1h=88h ;dummy bbus destination address (unused port 2188h)
43x4h=abus.src.bank
43x8h=abus.src.offs.lo
43x9h=abus.src.offs.hi
43xAh=01h ;remain count
abus.src is pointing to "02h,77h,55h,00h": repeat/pause 2 scanlines (02h), transfer one data unit (77h,55h), and after the pause, finish the transfer (00h).
Then I start the first of the two channels in line 128 and watch the src/remain values in 43x8h..43xAh, which behave as expected (src increases, and remain decreases from 02h down to 00h).
A few scanlines later, I start the second HDMA channel, which should do the same thing, but doesn't. Instead, it decreases the remain count in 43xAh from 55h downwards... Is that a known effect?

The second channel does apparently start with "do_transfer=1" (for whatever reason), causing it to transfer 02h,77h as data, and then fetch 55h as repeat count for next scanlines.
This happens only when starting the HDMA channels one after another; it doesn't happen when I do start BOTH channels in scanline 128 (then both do decrease remain from 02h downto 00h).
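
For reference, both outcomes can be reproduced with a small simulation. This is only a sketch, assuming the per-scanline sequence as it is usually summarized from the anomie docs: transfer if do_transfer is set, decrement the line counter, copy the repeat bit into do_transfer, and reload the counter from the table once its low 7 bits reach zero. The names hdma_line and run and the dict-based channel state are made up for illustration, and real hardware may differ (which is exactly the open question).

```python
def hdma_line(ch, mem, written):
    """Run one scanline of HDMA for one direct-mode channel (mode 02h)."""
    if ch["done"]:
        return
    if ch["do_transfer"]:
        for _ in range(2):                        # mode 02h: two bytes per unit
            written.append(mem[ch["addr"]])
            ch["addr"] += 1
    ch["count"] = (ch["count"] - 1) & 0xFF        # 43xAh, the "remain" value
    ch["do_transfer"] = bool(ch["count"] & 0x80)  # repeat flag is bit 7
    if ch["count"] & 0x7F == 0:
        ch["count"] = mem[ch["addr"]]             # fetch next line counter
        ch["addr"] += 1
        if ch["count"] == 0:
            ch["done"] = True                     # 00h ends the table
        ch["do_transfer"] = True


def run(do_transfer, lines=3):
    """Start a channel the way the test program does and log 43xAh per line."""
    mem = bytes([0x02, 0x77, 0x55, 0x00]) + bytes(16)  # table plus zero padding
    ch = {"addr": 0, "count": 0x01, "do_transfer": do_transfer, "done": False}
    written, counts = [], []
    for _ in range(lines):
        hdma_line(ch, mem, written)
        counts.append(ch["count"])
    return counts, written
```

Under these assumptions, run(False) gives remain values 02h, 01h, 00h and transfers 77h,55h, like the first channel. run(True, lines=1) transfers 02h,77h as data and then loads 55h as the line counter, like the second channel.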
Near
Founder of higan project
Posts: 1553
Joined: Mon Mar 27, 2006 5:23 pm

Re: question about DMA registers

Post by Near »

What anomie calls do_transfer is cleared at the start of every frame, regardless of whether HDMA is enabled or not. Similarly, the completed flag is cleared.

After that, -only if HDMA is enabled-, you then stop any active DMA that has HDMA enabled, copy the source address to the HDMA address, and then reload the line counter (and thus indirect address if required.) Since I'm mentioning it, keep in mind that any HDMA during DMA will kill that DMA channel mid-progress. That'll fix a few bugs for you (I think in Bugs Bunny and some football game.)
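
The frame-start steps described above could be sketched roughly like this. It is a hedged illustration, not the bsnes code: the channel dicts, the field names and the flat 64K read_byte helper are all hypothetical, and bank handling is ignored. What happens to do_transfer after the reload is not spelled out above, so the sketch does not touch it again.

```python
def read_byte(mem, addr):
    # Hypothetical flat 64K view of the A-bus; the bank byte is ignored here.
    return mem[addr & 0xFFFF]


def hdma_frame_init(channels, hdma_enable, mem):
    """Apply the frame-start steps to a list of 8 channel dicts."""
    # Always done, even if no HDMA channel is enabled.
    for ch in channels:
        ch["do_transfer"] = False
        ch["completed"] = False
    # The rest only happens if HDMA is enabled at all.
    if hdma_enable == 0:
        return
    for i, ch in enumerate(channels):
        if not hdma_enable & (1 << i):
            continue
        ch["dma_active"] = False          # a running DMA on this channel is stopped
        ch["hdma_addr"] = ch["src_addr"]  # copy source address to HDMA address
        ch["line_counter"] = read_byte(mem, ch["hdma_addr"])
        ch["hdma_addr"] = (ch["hdma_addr"] + 1) & 0xFFFF
        if ch["indirect"]:                # reload the indirect address if required
            lo = read_byte(mem, ch["hdma_addr"])
            hi = read_byte(mem, ch["hdma_addr"] + 1)
            ch["indirect_addr"] = lo | (hi << 8)
            ch["hdma_addr"] = (ch["hdma_addr"] + 2) & 0xFFFF
```

Clearing the flags before the enable check is the part that matters for the Super Ghouls & Ghosts case: a channel that is disabled at frame start still begins the frame with do_transfer=0.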

Technically this process happens at the first cycle edge at or above V=0,H=12+8-dma_counter() for CPU revision 1, and V=0,H=12+dma_counter() for CPU revision 2. dma_counter() = total clocks executed since power on & 7; (or &6; if you want.)
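
As a quick worked form of that formula (a sketch only; the function names are hypothetical and total_clocks stands for the master clocks executed since power on):

```python
def dma_counter(total_clocks):
    # Low three bits of the clocks executed since power on.
    return total_clocks & 7


def hdma_init_h(cpu_revision, total_clocks):
    """H position on line V=0 at or after which the frame-start setup runs."""
    if cpu_revision == 1:
        return 12 + 8 - dma_counter(total_clocks)
    if cpu_revision == 2:
        return 12 + dma_counter(total_clocks)
    raise ValueError("only CPU revisions 1 and 2 are described")
```

For example, with 13 clocks elapsed dma_counter is 5, so this gives H=15 for revision 1 and H=17 for revision 2.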

Or if you want, here's the implementation, I tried to follow anomie's names where possible, but I emulate some stuff he never knew about as well:
http://gitorious.org/bsnes/bsnes/blobs/ ... ma/dma.cpp
http://gitorious.org/bsnes/bsnes/blobs/ ... timing.cpp

Not to derail, but did you ever get a chance to look at the ST018? I could never figure out what that one port was that looked like a timer (it was writing 21.47MHz/3 to it.)
nocash
Posts: 1405
Joined: Fri Feb 24, 2012 12:09 pm

Re: question about DMA registers

Post by nocash »

> What anomie calls do_transfer is cleared at the start of every frame,
> regardless of whether HDMA is enabled or not. Similarly, the completed flag is cleared.
Okay, then the bugfix for Ghouls & Ghosts wasn't so wrong.

> keep in mind that any HDMA during DMA will kill that DMA channel mid-progress.
> That'll fix a few bugs for you (I think in Bugs Bunny and some football game.)
Good to know, thanks! I've been a bit lazy there and hoped no game would rely on such things.

I've just first-time tested bsnes (on a winxp computer via remote desktop, hoping that it'd be able to run bsnes... it worked... but it took me an hour to figure out how to load a rom-image into it (in short: the browse file gui turned out to be misleading... it didn't do anything except occasionally crashing... but loading roms via commandline worked... at least if they are padded to min 32kbytes, else nothing happens)). And I thought no$sns was a bit old-fashioned, and the more modern emus would work by click-and-play :-)

Anyways - I got the test program running, and bsnes didn't reproduce the HDMA glitch either (ie. it didn't fetch the 55h data byte as reload/counter value). Looks as if there is at least one secret still hiding in the console.

> Not to derail, but did you ever get a chance to look at the ST018?
Don't know when I get around to extract my ARM assembler/disassembler/emulator from no$gba for use in no$sns (whenever that happens, I'll have a look at the st018 bios and ioports & let you know if I find out anything new).
Near
Founder of higan project
Posts: 1553
Joined: Mon Mar 27, 2006 5:23 pm

Re: question about DMA registers

Post by Near »

> it took me an hour to figure out how to load a rom-image into it

Don't worry, the next release is going to feature an external DLL to load files, and it'll accept files in the format you're used to.
I presume that will make koitsu very happy as well.

> at least if they are padded to min 32kbytes, else nothing happens

Not having enough ROM data to describe a reset vector isn't very fun :P

> And I thought no$sns was a bit old-fashioned, and the more modern emus would work by click-and-play :-)

I do things the more correct way, rather than the easiest way. I'm not in it for the popularity.
But yeah, the external plugin works nicely. Lets me put all the legacy code somewhere outside the emulator, which is something I can tolerate.

> Looks as if there is at least one secret still hiding in the console.

I'm skeptical, but do let me know when you figure it out, I'll confirm your findings on the five or so decks I have here.
For what it's worth, there's a lot more than one secret remaining.

CPU:
DMA/HDMA crash on CPU revision 1
If you start a DIV during a MUL, or vice versa, the results are psychotic
Initial values in WRAM are off the wall crazy; they vary per system and per reset ... approximating it to some degree would be good
On the very first frame only, the VIRQ value is off by one scanline
What is the exact timing involved in auto joypad polling? I only have an approximation because it is hard to test with manual input.
Need to wire up an Arduino to send deterministic data to it or something.

SMP:
The TEST register has two mystery bits that make the timers go crazy.

DSP:
When you mute a channel, it fades out and does not silence immediately

PPU:
We barely understand this at all. Almost no data on cycle timings here.
What if you use the MUL functionality during a Mode7 screen rendering?
What if you toggle a BGMODE in the middle of a scanline? (god help us if it actually gets acknowledged ...)

I wrote some fun tests that wrote text on the screen by changing the registers mid-scanline. But otherwise, nobody's interested in emulating or coding for the PPU like this ... I currently have the only dot-based PPU renderer, and it's as slow as you think it is.

SuperFX:
What happens if the secondary pixel cache fills up and the SFX CPU is accessing RAM? Does it interleave writes, or does one stall the other? If the CPU can stall the secondary pixel cache, you could have a RAM-only program permanently cause it to stay full.

SA-1:
Some registers can only be accessed from the SA1, some from the CPU.
The stuff no games use is not understood: H/VIRQs (how does it time the H/V position?), clock-IRQs, etc.
How in god's name do we emulate the SA-1 memory conflict controller that stalls its CPU when the SNES CPU is reading from ROM/RAM? That's going to require monstrous overhead. We can't afford to step both the CPU and SA-1 one clock tick at a time.

DSP-n/ST01n NEC DSP:
How do OV1 and S1 flags work? Documentation sucks.

Cx4 / Hitachi DSP:
How does the program RAM work? It caches pages, but what's the overhead on that? Are all opcodes one cycle each?
Currently my emulation causes Rockman to die in the intro, unless you set a different frequency rate :(
$70-77:0000-7fff always returns 0 when read. Looks like it has the option to have RAM pass through it, but MMX2/3 didn't use that.
What's with all the regs? $7f52 has to be 1 for MMX3 to read past 1MB; has to be 0 for MMX2 to read past 1MB.

ST018:
What's with the timer value thing?
Cydrak wrote a cool exploit to crush the stack and execute uploaded code out of RAM.

SPC7110:
What does $4808 and $58:0000-ffff do?
What in the hell was the intended usage of the interleave/skip functions on decompression? What are the timing restrictions there?
How do we emulate bad input data causing the decompressor to 'crash' and spit out junk?

SPC7110-RTC:
There's some interesting delays for certain actions, if you start reading too soon you get crazy stuff happening that's not in the datasheet.

SDD1:
Can DMAs happen on any channel? If not, which bits control activating decompression DMA?
How the hell does the SDD1 know when a DMA is taking place?
What's with the crazy read data when you set bank selections bigger than the ROM size?
How do we emulate this chip crashing on bad input data (like a string of 0x00s for a long time)?

SRTC:
There's a test register we know nothing about.
I haven't mapped out how the BCD works on invalid values. Took me a fucking week to do that for the SPC7110-RTC.

> Don't know when I get around to extract my ARM assembler/disassembler/emulator from no$gba for use in no$sns

Yours may be a bit trickier because I bet you share ARMv4/v5 for your NDS emulation; but I use the same CPU core for the GBA and ST018 (ARMv3), and it works fine. I need to make a separate instruction table to omit the v4-only stuff though.
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: question about DMA registers

Post by tepples »

byuu wrote: Not having enough ROM data to describe a reset vector isn't very fun
It doesn't take 32K just to describe a reset vector, at least on the Super NES's predecessor. A 16K ROM on the NROM-128 board is mirrored into both $8000-$BFFF and $C000-$FFFF. One game is so small it's mapped into $8000-$9FFF, $A000-$BFFF, $C000-$DFFF, and $E000-$FFFF, and the most common file format for NES games just accepts that as an overdump. It's made official as of NES 2.0: "double it up and call it a day" says kevtris.
byuu wrote: But yeah, the external plugin works nicely. Lets me put all the legacy code somewhere outside the emulator, which is something I can tolerate.
Is it flexible enough to allow multiple games in one file, such as for a plug-in that can load ROMs out of a .zip or .7z?
byuu wrote: I wrote some fun tests that wrote text on the screen by changing the registers mid-scanline. But otherwise, nobody's interested in emulating or coding for the PPU like this ... I currently have the only dot-based PPU renderer, and it's as slow as you think it is.
I wonder how hard it'd be to port something like blargg's flowing palette demo to the Super NES.
psycopathicteen
Posts: 3140
Joined: Wed May 19, 2010 6:12 pm

Re: question about DMA registers

Post by psycopathicteen »

> PPU:
> We barely understand this at all. Almost no data on cycle timings here.
> What if you use the MUL functionality during a Mode7 screen rendering?
> What if you toggle a BGMODE in the middle of a scanline? (god help us if it actually gets acknowledged ...)
I've been wondering lately what happens when you change the Mode7 parameters midscreen. The best case scenario would be a split-screen Mode-7 effect, which can be useful for a giant front facing angel boss with rotating wings.

For changing BGMODE mid scanline, the best case scenario would be a split-screen effect, but with a garbage tile in between. If that is the case, then maybe you can disable the BG layer right before the mode switch, switch to mode-7, enable the BG layer again, let the Mode-7 layer rotate an object, disable the BG layer, switch modes, and enable it again to continue the background layer, and then use sprites to patch up the rectangular hole in the background layer. This way you can have a BG layer and a Mode-7 layer without creating the entire background out of sprites.
Bregalad
Posts: 8056
Joined: Fri Nov 12, 2004 2:49 pm
Location: Divonne-les-bains, France

Re: question about DMA registers

Post by Bregalad »

This would be incredibly cool, but I think the probability such a thing can be done in hardware is like 0.001%

I think it would remain in the current mode until the next scanline.

This should really be tested though.
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: question about DMA registers

Post by tepples »

Bregalad wrote:I think it would remain in the current mode until the next scanline.
Wouldn't that violate the hardware engineering rule of thumb to use the minimum number of flip-flops, such as those holding the current mode?
Bregalad
Posts: 8056
Joined: Fri Nov 12, 2004 2:49 pm
Location: Divonne-les-bains, France

Re: question about DMA registers

Post by Bregalad »

Well, I don't know. In some cases one or two flip-flops could save several dozen gates.
The NES PPU is very "combinational" in its design; the C64 chip, however, is exactly the opposite. It is very "sequential", as you can fool it into thinking it has done something when it hasn't, or vice versa (you can force it to re-fetch the color table etc., while the NES fetches colours every scanline).

Anyways we're not going to re-invent the SNES PPU and this should be tested.
nocash
Posts: 1405
Joined: Fri Feb 24, 2012 12:09 pm

Re: question about DMA registers

Post by nocash »

> DMA/HDMA crash on CPU revision 1
I don't think that my SNES is having that revision... or is revision1 the first revision (version2) after the original version (version1)? My bigger concern is that we don't seem to know which CPU (and PPU) versions do exist at all. That tiny detail should be reverse-engineered before talking about smarter differences between revisions. I know that my SNES is having this chipset (and ID values in right column):

 Board:     (C) 1992 Nintendo, SNSP-CPU-01        ;BOARD
 U1  100pin Nintendo, S-CPU A, 5A22-02, 2FF 7S    ;CPU  ;ID=2 in 4210h
 U2  100pin Nintendo, S-PPU1, 5C77-01, 2EU 64     ;PPU1 ;ID=1 in 213Eh
 U3  100pin Nintendo, S-PPU2 B, 5C78-03, 2EV 7G   ;PPU2 ;ID=3 in 213Fh
There seem to be different chip versions, and even a cost-down version with CPU+PPUs in one chip. The thing I'd like to see would be chip names & ID values typed-up from such consoles (or without typing-up: photos of the mainboard, bundled with screenshots of the ID values).
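
For anyone collecting these values: the version number sits in the low four bits of each of the three registers, and the upper bits carry unrelated status flags, so they need to be masked off. A minimal sketch of the decoding (the function name and the raw example values are only illustrative):

```python
def chip_ids(reg_4210, reg_213e, reg_213f):
    """Return (CPU, PPU1, PPU2) version numbers from raw register reads."""
    # Only bits 0-3 hold the version; bits 4-7 are status flags or open bus.
    return (reg_4210 & 0x0F, reg_213e & 0x0F, reg_213f & 0x0F)


def format_ids(ids):
    """Format as CPU/PPU1/PPU2, e.g. 2/1/3."""
    return "/".join(str(v) for v in ids)
```

Raw reads of 82h, 01h and 33h, for instance, would decode to 2/1/3, matching the chipset listed above.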

> PPU: What if you use the MUL functionality during a Mode7 screen rendering?
MUL writes are just treated as rotation/scaling parameters, and MUL reads work as so: http://nocash.emubase.de/fullsnes.htm#s ... ionscaling (in the "M7A/M7B Port Notes" section).
The funny thing is that the PPU is actually doing 680 insane-fast multiplications per scanline (normally one would need 8 multiplications for the first pixel, and then add horizontal offsets for the following pixels).
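
To illustrate the difference between the two approaches, here is a simplified sketch for one output coordinate of one scanline. It assumes plain integers and a generic form u = a*x + b*y + u0; the real Mode 7 formula also involves the centre and scroll registers and fixed-point formats, which are left out, and the function names are hypothetical.

```python
def row_per_pixel(a, b, u0, y, width):
    # Multiply again for every pixel, as the PPU appears to do.
    return [a * x + b * y + u0 for x in range(width)]


def row_incremental(a, b, u0, y, width):
    # Multiply only for the first pixel, then add the horizontal step.
    out, u = [], b * y + u0
    for _ in range(width):
        out.append(u)
        u += a
    return out
```

Both give identical results as long as the parameters stay constant during the line. They would only differ if a parameter is changed mid-scanline: the per-pixel version picks up the new value immediately, while the incremental one would only change its step.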
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: question about DMA registers

Post by tepples »

nocash wrote:My bigger concern is that we don't seem to know which CPU (and PPU) versions do exist at all. That tiny detail should be reverse-engineered before talking about smarter differences between revisions.
In The Lion King, if you push B Button, A Button, R Button, R Button, Y Button (BARRY), it'll tell you what the IDs are.
nocash wrote: The funny thing is that the PPU is actually doing 680 insane-fast multiplications per scanline (normally one would need 8 multiplications for the first pixel, and then add horizontal offsets for the following pixels).
Patent workaround perhaps?
Near
Founder of higan project
Posts: 1553
Joined: Mon Mar 27, 2006 5:23 pm

Re: question about DMA registers

Post by Near »

> It doesn't take 32K just to describe a reset vector, at least on the Super NES's predecessor.

Sure, and technically if you use a board manifest with my emulator you can do it with less than 32K as well. You can even map RAM there instead.

But the heuristics are a cheap hack to get commercial software working, and all commercial software either has 32K or 64K banks, so nobody scans for headers at 16K, etc.

My guess is that nocash is assuming a reset vector of $8000 if a ROM is <32K, which isn't how the hardware would work, but I guess is nice if you don't want to pad the test ROM for some reason.

> Is it flexible enough to allow multiple games in one file, such as for a plug-in that can load ROMs out of a .zip or .7z?

Yes, I used to do that too with the old bsnes/Qt version of this concept (snesloader)
It was fun eating up 600MB of RAM and ten seconds to load Super Mario World from the GoodMerge set (997 hacks in one archive. Not exaggerating that number, it's an exact value.) [and that speed/RAM usage was with the official 7zip library code, as used in fex.] {by the way, fex is fucking fantastic if you've never tried it.}
A lot of people loved that feature, too.

> I wonder how hard it'd be to port something like blargg's flowing palette demo to the Super NES.

Due to the DRAM refresh in the middle of each rendering scanline, it would have a sharp bar of solid colors for ~10-15 pixels. But aside from that, it would only be nominally harder (have to add in variable memory access speed and penalty cycles.) Good news is that it should work in bsnes/accuracy, too.

I used the display brightness register to write text with my demo ROMs. It was nowhere near as visually pleasing as blargg's example.

> I think it would remain in the current mode until the next scanline.

Likewise. Or worse, it will be like turning the display on and off. It'll just fuck the graphics up royally for several pixels, and then recover.

> U1 100pin Nintendo, S-CPU A, 5A22-02, 2FF 7S ;CPU ;ID=2 in 4210h

That's a revision 2. Your CPU is immune to DMA/HDMA crash.

> My bigger concern is that we don't seem to know which CPU (and PPU) versions do exist at all.

CPU has revision 1 & 2.
PPU1 has revision 1.
PPU2 has revision 1, 2 & 3. Revision 2 is hauntingly rare.

Known models:
CPU/PPU1/PPU2
1/1/1 (uncommon)
2/1/1 (rarest)
2/1/2 (really rare)
2/1/3 (common as dirt)

Once Nintendo moved to the one-chip design, they stopped updating the revisions, but still changed things.
The SNES Jr, for instance, has different SMP timer behavior (no glitching), and the PPU mid scanline effects still work, but seem to not work as well? Like, you lower the brightness from max to full black, yet you see onscreen a light gray color. WRAM/APURAM initialization patterns are totally different each time, too.

It's my personal opinion that the SNES Jr is an official clone (redesign) of the original system.

I've not found any differences between the PPU2 revisions. All the bugs I know of (X=256 priority issue, half-height on OAM size 6 interlace mode, EXTBG BG2 using the wrong scroll offset in one direction, etc) still exist.

> The funny thing is that the PPU is actually doing 680 insane-fast multiplications per scanline (normally one would need 8 multiplications for the first pixel, and then add horizontal offsets for the following pixels).

Wow, so they do reload and remultiply for every pixel? In that case, you could change the values mid-scanline.

> Patent workaround perhaps?

'604: "A method for adding to a number after having multiplied it."

Sadly, I could see the US patent office granting that.
nocash
Posts: 1405
Joined: Fri Feb 24, 2012 12:09 pm

Re: question about DMA registers

Post by nocash »

> My guess is that nocash is assuming a reset vector of $8000 if a ROM is <32K
No, at FFFC, as usual. ROMs of 1K, 2K, 4K, 8K, or 16K size are mirrored within the 32K area. I thought that'd be obvious. Now I am wondering if there has ever been something like a 1K-compo for the SNES.
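
A minimal sketch of that mirroring, assuming a power-of-two ROM size no larger than 32K mapped at 8000h-FFFFh (this is not the actual no$sns code, and the function names are made up):

```python
def read_mirrored(rom, addr):
    """Read a byte from a small ROM mirrored throughout the 32K area."""
    return rom[(addr & 0x7FFF) % len(rom)]


def reset_vector(rom):
    # The 16-bit little-endian reset vector lives at FFFCh/FFFDh.
    return read_mirrored(rom, 0xFFFC) | (read_mirrored(rom, 0xFFFD) << 8)
```

For a 1K ROM this means the vector is taken from the last four bytes of the image (offset 3FCh/3FDh), so no padding is needed.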

> the PPU mid scanline effects still work, but seem to not work as well?
> Like, you lower the brightness from max to full black, yet you see onscreen a light gray color.
You mean [2100h]=00h acts neither as black nor as dark gray, and instead produces light gray?
Or, that there are a few gray pixels displayed at time of writing any value to 2100h?

> Known models:
> CPU/PPU1/PPU2
> 1/1/1 (uncommon)
> 2/1/1 (rarest)
> 2/1/2 (really rare)
> 2/1/3 (common as dirt)
Okay, and the newer stuff including cost-down single-chip version returns 2/1/3, too? Then I'll try a guess:

CPU Versions
CPU.ID=1 100pin Nintendo, S-CPU, 5A22-01 (CPU) http://www.chipdb.org/img-nintendo-s-cpu-snes-5274.htm
CPU.ID=2 100pin Nintendo, S-CPU A, 5A22-02 (CPU) as found in my own SNES
CPU.ID=2 100pin Nintendo, S-CPU B, 5A22-02 (CPU) http://www.snescentral.com/article.php?id=1017
CPU.ID=2 160pin Nintendo, S-CPUN A, RF5A122 (CPU, PPU1, PPU2, S-CLK)

PPU1 Versions
PPU1.ID=1 100pin Nintendo, S-PPU1, 5C77-01 (PPU1) as found in my own SNES
PPU1.ID=1 160pin Nintendo, S-CPUN A, RF5A122 (CPU, PPU1, PPU2, S-CLK)

PPU2 Versions
PPU2.ID=1 100pin Nintendo, S-PPU2?, 5C78-01? (rarely mentioned in internet)
PPU2.ID=2 100pin Nintendo, S-PPU2 A?, 5C78-02?? (never mentioned in internet)
PPU2.ID=3 100pin Nintendo, S-PPU2 B, 5C78-03 (PPU2) as found in my own SNES
PPU2.ID=3 100pin Nintendo, S-PPU2 C, 5C78-03 (PPU2) http://www.snescentral.com/article.php?id=1017
PPU2.ID=3 160pin Nintendo, S-CPUN A, RF5A122 (CPU, PPU1, PPU2, S-CLK)

Could that be correct?
The "S-CPUN A" chip name suggests that there might have also been a "S-CPUN" (without "A")?