stretching the Super FX

calima
Posts: 1745
Joined: Tue Oct 06, 2015 10:16 am

Re: stretching the Super FX

Post by calima »

4 GB of RAM on cart, that gets filled on boot from an SD card/etc., geez that would bring PS1 loading times to SNES. At 100 MB/s that'd be 40 secs.
93143
Posts: 1717
Joined: Fri Jul 04, 2014 9:31 pm

Re: stretching the Super FX

Post by 93143 »

lidnariq wrote: Tue Apr 19, 2022 10:30 pm
93143 wrote: Tue Apr 19, 2022 8:16 pm So there's no way to get 4 GB of instantly readable memory for less than a few hundred dollars?
DRAM, probably. Not asynchronous, but latency should be low enough for this purpose.
Let me rephrase that. There's no way to get 4 GB of instantly readable non-volatile memory for less than a few hundred dollars?

Then again, this might actually be part of the answer; see below...
Programming time for modern not-SST flash is slow. At least, compared to how big they are. Spansion/Cypress/Infineon S29GLxxT flash is worst case (and not atypically) 32 bytes in 512 microseconds, or almost a day to program 4GB.
Yeah, but what causes the worst case? I'd imagine an application where the whole chip gets reprogrammed in one shot on rare occasions should see performance closer to nominal. And with the biggest available chip being 1/16 of the desired capacity, the source data could easily be interleaved to allow all 16 chips to be programmed at once. Based on the specs of the Cypress S70GL02GS, I'd expect about 12 minutes total for erasing and programming the whole thing, not far off the time it'd take to read the data off the DVD.

Still not ideal, especially if the "typical" programming times can't be relied on as you suggest. And the NOR flash price issue persists...
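For reference, the "almost a day" figure can be reproduced with a rough sketch like the one below. It uses only the worst-case number quoted above (32 bytes per 512 microseconds), assumes "4 GB" means 4 GiB, and ignores erase time entirely. The function name and the `parallel_chips` parameter are made up for illustration.

```python
GIB = 2**30

def program_time_seconds(total_bytes, bytes_per_op=32, op_time_s=512e-6,
                         parallel_chips=1):
    """Worst-case programming time when every chip writes bytes_per_op
    bytes per op_time_s and the data is split evenly across chips."""
    ops = total_bytes / bytes_per_op
    return ops * op_time_s / parallel_chips

# One chip at a time versus a hypothetical 16-way interleave.
single = program_time_seconds(4 * GIB)
interleaved = program_time_seconds(4 * GIB, parallel_chips=16)

print(f"one chip at a time:   {single / 3600:.1f} h")
print(f"16 chips interleaved: {interleaved / 60:.1f} min")
```

Even at the worst-case rate, a 16-way interleave comes out at a bit over an hour rather than most of a day. Getting down to the 12-minute estimate depends on typical rather than worst-case datasheet figures.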
calima wrote: Tue Apr 19, 2022 11:47 pm 4 GB of RAM on cart, that gets filled on boot from an SD card/etc., geez that would bring PS1 loading times to SNES. At 100 MB/s that'd be 40 secs.
Better than using an optical disc, especially if you don't need to load the whole thing before starting (although in a game with a save feature you might).

And if you're selling a physical cartridge, it's probably reasonable to supply the data medium yourself rather than requiring the user to buy an SD card and download the data, so the storage memory would be a known quantity. In fact you could simply include a package of eight 256Mx16 20 ns NAND flash chips in the cartridge, and get 800 MB/s for a load time of about five seconds (entirely in parallel with the SNES displaying the publisher and developer logos and loading the title screen).
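The load-time numbers in this exchange are simple to check. The sketch below assumes decimal units (4 GB = 4 × 10^9 bytes) and that every 20 ns cycle delivers one full 16-bit word from each chip, with no page-open latency or ECC overhead; real NAND would do somewhat worse. The helper names are illustrative only.

```python
GB = 10**9

def bandwidth_bytes_per_s(chips, bus_bits, cycle_s):
    """Aggregate bandwidth of several chips read in parallel,
    one bus-width word per chip per cycle."""
    return chips * (bus_bits // 8) / cycle_s

def load_time_s(total_bytes, bytes_per_s):
    """Time to stream total_bytes at a constant rate."""
    return total_bytes / bytes_per_s

sd_time = load_time_s(4 * GB, 100e6)  # SD card at 100 MB/s
nand_bw = bandwidth_bytes_per_s(chips=8, bus_bits=16, cycle_s=20e-9)
nand_time = load_time_s(4 * GB, nand_bw)

print(f"SD card:    {sd_time:.0f} s")
print(f"8x16 NAND:  {nand_bw / 1e6:.0f} MB/s, {nand_time:.1f} s")
```

Under these assumptions the eight-chip array gives the 800 MB/s and roughly five-second figures mentioned above, against 40 seconds for the SD card.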

It's not incredibly cheap, but it's in the ballpark. 4 GB of 1600 MHz DRAM is about $15 on Mouser (EDIT: no it's not; that's 4 Gbit. Nuts), and a 512 MB NAND flash chip is about $5.

The audio side of the MSU1 would also need storage, but it doesn't have to have sub-microsecond latency. More NAND flash?
Last edited by 93143 on Wed Apr 20, 2022 2:36 pm, edited 2 times in total.
creaothceann
Posts: 611
Joined: Mon Jan 23, 2006 7:47 am
Location: Germany

Re: stretching the Super FX

Post by creaothceann »

Maybe there's enough space on the board to include an M.2 slot - that DRAM could already be filled and playing a startup sound when the publisher screen fades in. And an HDMI port for those who want sharp pixels. And an RJ-45 port for the netplay. And USB ports, because original controllers are getting old.

It could become the Next Unit of Computing for SNES development. :wink:
My current setup:
Super Famicom ("2/1/3" SNS-CPU-GPM-02) → SCART → OSSC → StarTech USB3HDCAP → AmaRecTV 3.10
93143
Posts: 1717
Joined: Fri Jul 04, 2014 9:31 pm

Re: stretching the Super FX

Post by 93143 »

Not sure what you're getting at here.

I just figured that since the MSU1 is arguably a legacy add-on, but has no definitive hardware implementation, any implementation that matches the specification is fair game. And the use of a single SD card for both data and audio (the way the FXPak does it) really does make random access to moderate-sized slices of data more difficult than it needs to be.

Maybe the proper response really is to just code around the latency, and I'd have to do that anyway if I wanted the game to work on an FXPak at all. Still, I think it would be nice to be able to push the MSU1 a bit harder and have it run well on a real dedicated cartridge, even if the result is that it lags more on an FXPak.

...

This isn't actually an enabling feature for the U-ROM idea, so I guess it's a little beside the point. The two ideas simply play well together, since a fast MSU1 makes U-ROM easier to load, and U-ROM's higher capacity and more efficient GSU access vs. GPRAM makes a fast MSU1 more worthwhile.
Last edited by 93143 on Wed Apr 20, 2022 2:20 pm, edited 1 time in total.
lidnariq
Posts: 11432
Joined: Sun Apr 13, 2008 11:12 am

Re: stretching the Super FX

Post by lidnariq »

93143 wrote: Wed Apr 20, 2022 1:12 am Let me rephrase that. There's no way to get 4 GB of instantly readable non-volatile memory for less than a few hundred dollars?
Right. As long as you're talking about SNES memory access cycle latency.

If your timing is permitted to be a few orders of magnitude slower, but still faster than typical managed NAND, then unmanaged NAND can achieve latencies in the 30-300µs range (depending on specific part).
4 GB of 1600 MHz DRAM is about $15 on Mouser
Do you mean 4 gibit? (The closest I could find was two 16 gibit chips for roughly $20 each.) Working from commodity parts (DIMMs instead of bare chips) will be cheaper.
and a 512 MB NAND flash chip is about $5.
Note that unmanaged NAND is not guaranteed to be completely viable when purchased. You'll either have to pay a premium for "known good" grade NAND, or else add some primitive filesystem and redirection and what's programmed in will vary from PCB to PCB.
The audio side of the MSU1 would also need storage, but it doesn't have to have sub-microsecond latency. More NAND flash?
Commodity NAND would be fine here - latency less than one vsync won't be perceptible.
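To put those latency ranges in context, here is a small sketch comparing them against a single SNES memory access and against one frame. It assumes the NTSC master clock of about 21.477 MHz, an 8-master-cycle (slow) access, and a frame period approximated as 1/60 s; the constant and function names are made up for this example.

```python
MASTER_CLOCK_HZ = 21_477_272  # NTSC SNES master clock (assumed here)
FRAME_S = 1 / 60              # approximate NTSC frame period

def access_time_s(master_cycles):
    """Duration of one CPU memory access of the given length."""
    return master_cycles / MASTER_CLOCK_HZ

def accesses_per_latency(latency_s, master_cycles=8):
    """How many back-to-back accesses fit inside one storage latency."""
    return latency_s / access_time_s(master_cycles)

for latency in (30e-6, 300e-6):  # the unmanaged NAND range quoted above
    print(f"{latency * 1e6:.0f} us latency: "
          f"{accesses_per_latency(latency):.0f} slow accesses, "
          f"{latency / FRAME_S:.2%} of a frame")
```

With these assumptions, 30 to 300 microseconds works out to roughly 80 to 800 slow access cycles, which is why it can't stand in for directly mapped memory, yet it is under 2% of a frame, which is why it is fine for audio streaming.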
93143 wrote: Wed Apr 20, 2022 2:06 pm and the use of a single SD card for both data and audio (the way the FXPak does it) really does make random access to moderate-sized slices of data more difficult than it needs to be.
When I grumped about it, Near replied that he explicitly intended there be two disjoint chunks of memory for the two halves and none of this interleaving access to a fast SD card.
93143
Posts: 1717
Joined: Fri Jul 04, 2014 9:31 pm

Re: stretching the Super FX

Post by 93143 »

lidnariq wrote: Wed Apr 20, 2022 2:09 pm
4 GB of 1600 MHz DRAM is about $15 on Mouser
Do you mean 4 gibit?
...yes. I noticed later on that that site sometimes misclassifies parts that way (where GB actually means Gb), but I didn't go back and check. Now that I have, it seems the cheapest 16 Gbit DRAMs are indeed a little over $20 in large quantities.

Anyway, it's still somewhat in the ballpark. Much better than NOR flash. And if there's a cheaper way to get 4 GB of DRAM, that's great.
Note that unmanaged NAND is not guaranteed to be completely viable when purchased. You'll either have to pay a premium for "known good" grade NAND, or else add some primitive filesystem and redirection and what's programmed in will vary from PCB to PCB.
Hmm.

SD card makers keep their prices reasonable somehow, right? Technically this component is common between the DRAM idea and the FXPak implementation; they both need enough NAND flash to hold the MSU1 data. It's just that the FXPak (understandably) shifts the cost to the user rather than including it in the product...

Then again, I've noticed that NAND flash seems to have a price bathtub that we're working in the shallow end of...
93143 wrote: Wed Apr 20, 2022 2:06 pm and the use of a single SD card for both data and audio (the way the FXPak does it) really does make random access to moderate-sized slices of data more difficult than it needs to be.
When I grumped about it, Near replied that he explicitly intended there be two disjoint chunks of memory for the two halves and none of this interleaving access to a fast SD card.
That's kinda what I thought. I guess this could be taken to mean that the FXPak implementation isn't strictly canonical, and that there's still room for an approach like the one we've been discussing to be more 'authentic'. The emphasis on low latency seems to disregard the word "streaming" in the name, but I figure it still qualifies if you can't run branched code out of it...

The initial loading isn't a problem for the standard (I think) because the Data Busy bit can simply remain set until the loading is finished. Alternately, the memory controller might be smart enough to let the SNES access the flash repository directly for uncached data, resulting in an FXPak-like situation until the DRAM is loaded. I don't know if this would add too much cost to the memory controller, or potentially too much uneven stress on the flash repository (which, if built-in, would be expected to last for decades)...

...

I suppose it would be more in keeping with the original idea to have a passthrough cartridge with some DRAM and one or two SD card slots. But I like the idea of just shoving everything into a standard-size Game Pak and having it be completely transparent to the user, the way every other add-on chip was handled. Plug and play with no extra fiddling around, and only the logo on the label tells you why you're hearing Red Book audio and seeing brute-forced sprite animation that would shame a Neo Geo.
93143
Posts: 1717
Joined: Fri Jul 04, 2014 9:31 pm

Re: stretching the Super FX

Post by 93143 »

Here's a crazy idea.

The officially documented CPU ROM option already allows up to 6 MB mapped to banks $80-$FF in HiLoROM fashion in a Super FX cartridge. I just took a look at the S-DD1 memory map, and it looks like it fits nicely in $C0-$FF. Since the S-DD1 appears to have four banking bits, this should allow 16 MB behind the chip plus 2 MB in $80-$BF, for a total of 18 MB of CPU ROM (plus fast decompression if desired), on top of (hopefully) 4 MB of ROM and/or U-ROM for the use of the Super FX.
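The 18 MB total can be sketched as below. This takes the post's reading of the S-DD1 at face value (four banking bits), and additionally assumes that each bank select chooses a 1 MiB slot and that $80-$BF contributes only the 32 KiB upper half of each bank. None of this is verified against hardware, and the function names are illustrative.

```python
KIB = 1024
MIB = 1024 * KIB

def banked_rom_bytes(banking_bits, slot_bytes=MIB):
    """ROM reachable through bank registers that each select one of
    2**banking_bits slots of slot_bytes (assumed 1 MiB here)."""
    return (1 << banking_bits) * slot_bytes

def upper_half_bytes(first_bank, last_bank, bytes_per_bank=32 * KIB):
    """ROM visible in the $8000-$FFFF half of a range of banks."""
    return (last_bank - first_bank + 1) * bytes_per_bank

behind_sdd1 = banked_rom_bytes(4)              # via $C0-$FF
direct_rom = upper_half_bytes(0x80, 0xBF)      # $80-$BF upper halves
total = behind_sdd1 + direct_rom

print(f"behind S-DD1: {behind_sdd1 // MIB} MiB")
print(f"$80-$BF:      {direct_rom // MIB} MiB")
print(f"total:        {total // MIB} MiB")
```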

Now the big question is current draw. With a GSU2 (or GSU2-SP1, which might need less power), an S-DD1, 128 KB of SRAM, 128 KB of F-RAM, 3 MB of PSRAM, and 19 MB of mask ROM, on top of the base console's power requirements, I wonder if this configuration actually works... At least with the MSU1, the option exists to do it as a passthrough device with its own power supply, so it's not strictly necessary to add 4 GB of DRAM and 8 GB of NAND flash to that list...
93143
Posts: 1717
Joined: Fri Jul 04, 2014 9:31 pm

Re: stretching the Super FX

Post by 93143 »

Regarding pin 21, it seems to me that it should be fairly easy to test, given a Super FX devcart and the appropriate equipment. All you'd need is a program that sets bit 5 of ROMBR and repeatedly writes to R14, and a way to monitor the voltage on the pin in question. (Or perhaps it'd be better to scan through the banks, so the pin activity has context.) Am I oversimplifying it?

Is anyone in a position to actually test this? I could easily put together a program for this purpose.
lidnariq
Posts: 11432
Joined: Sun Apr 13, 2008 11:12 am

Re: stretching the Super FX

Post by lidnariq »

I don't have a flashcart with a SFX*2* donor, but I can test on a SFX1 and ... maybe a Mario Chip? I can't check at the moment.
93143
Posts: 1717
Joined: Fri Jul 04, 2014 9:31 pm

Re: stretching the Super FX

Post by 93143 »

Well, fullsnes labels pin 31 on the GSU1/MARIO chip as "?", and there are multiple GSU2 pins with the same label, but it's only pin 21 on the GSU2 that anyone seems to think could be "/CE for 2nd ROM chip".

Any idea what the rationale is for noting that on the GSU2's pin 21 specifically, and not on the other unknown pins?

...

Is there any chance multiple unknown pins on the GSU2 are extra ROM address or chip enable pins? The docs show ROMBR going all the way up to A23...
calima
Posts: 1745
Joined: Tue Oct 06, 2015 10:16 am

Re: stretching the Super FX

Post by calima »

Did you check if the SFX source happened to be in one of the gigaleaks? Forbidden knowledge etc etc, but would reveal things like this.
93143
Posts: 1717
Joined: Fri Jul 04, 2014 9:31 pm

Re: stretching the Super FX

Post by 93143 »

I didn't. It might be too late or too dangerous now.

I've seen a couple of lists of what was included, but they may be incomplete. They do mention Verilog for the N64, but nothing for the Super FX.
calima
Posts: 1745
Joined: Tue Oct 06, 2015 10:16 am

Re: stretching the Super FX

Post by calima »

93143
Posts: 1717
Joined: Fri Jul 04, 2014 9:31 pm

Re: stretching the Super FX

Post by 93143 »

I don't think a patent will be all that useful in finding a definitive answer to this question. The data in patents tends to be slightly incorrect on purpose, if I recall correctly, providing just enough descriptive detail to protect the actual idea while being confusing enough to prevent theft of the idea before the patent is granted. And an idea like this that can be implemented in a lot of different ways may be described one way in the patent and then actually implemented in a different way - this might not even be obfuscation, but simply the fact that the design process was still ongoing while the patent application was being drawn up.

...

Figure 1 seems to describe a chip with a 20-bit ROM address bus, consistent with pre-GSU2 models of the Super FX. This seems to clash with their claim in the description that "the program ROM typically is four megabytes", although that might refer to SNES games in general.

The pinout description in the text doesn't seem to match any known iteration of the Super FX, and suggests an 18-pin RAM address bus and a 25-pin ROM address bus. I'm not enough of a hardware guy to translate that into bits (I suspect some additional lines are considered part of the address bus here), but if the GSU2 is thought to have a 17-bit RAM bus, this suggests to me that it was intended to have a 24-bit ROM bus, which seems excessive until you realize just how many unidentified pins there are in nocash's GSU2 pinout (plus the SNES-side input address bus on the GSU2 does go up to A23)...

Then again, later in the text it states that "The ROM A bus is a 20-bit ROM address bus.", which is clearly pre-GSU2 (in accordance with Fig. 1) and thus irrelevant to the question of what the GSU2's maximum ROM size is, but does leave me wondering what the rest of P2-P26 are supposed to be for...
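As a quick reference for the bus widths being compared here, this sketch converts an address bus width into addressable capacity, assuming a byte-wide data bus and that every address line is a plain address bit (no chip enables counted as part of the bus). The function name is hypothetical.

```python
KIB = 1024
MIB = 1024 * KIB

def addressable_bytes(address_bits, data_bits=8):
    """Capacity reachable with address_bits address lines and a
    data bus of data_bits (8 assumed for the Super FX buses)."""
    return (1 << address_bits) * data_bits // 8

ram_17 = addressable_bytes(17)  # the RAM bus width the GSU2 is thought to have
rom_20 = addressable_bytes(20)  # the bus described in Fig. 1
rom_24 = addressable_bytes(24)  # the speculated full-width ROM bus

print(f"17-bit RAM bus: {ram_17 // KIB} KiB")
print(f"20-bit ROM bus: {rom_20 // MIB} MiB")
print(f"24-bit ROM bus: {rom_24 // MIB} MiB")
```

So a 17-bit RAM bus corresponds to 128 KiB, a 20-bit ROM bus to 1 MiB, and a 24-bit ROM bus to 16 MiB, under the assumptions above.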

...

It repeatedly mentions how much longer ROM access takes than RAM access, and states that the Mario chip is designed to take advantage of this, even though their actual examples assume ROM and RAM both take 6 cycles to access (the real chip should be 3 for both or 5 for both depending on speed setting, unless I'm greatly misled).

It also states that the external clock line is intended to allow the Mario chip to operate at higher speeds than 21 MHz, and suggests that the core and memory controller could be clocked separately, with the buses on the SNES master clock and the core on its own oscillator. (This is almost the opposite of my own "clock trick", namely using a double-speed oscillator and setting the chip to slow mode, which was intended to speed up the memory buses but leave the core at 21 MHz...)

Gotta love patent legalese:
The Super NES includes within its control deck 20, a 16-bit host CPU which may, for example, be a 65816 compatible microprocessor.
This patent was filed in 1994.
M2m
Posts: 24
Joined: Mon Feb 15, 2021 12:53 pm

Re: stretching the Super FX

Post by M2m »

So basically Randy Linden says that the Doom cartridge had a max of 2 MB of ROM, but doesn't really confirm that it's the max of the FX chip.

https://www.youtube.com/watch?v=BIauSQ_ ... u4V4AaABAg