stretching the Super FX
Re: stretching the Super FX
4 GB of RAM on cart that gets filled on boot from an SD card etc. Geez, that would bring PS1 loading times to the SNES. At 100 MB/s that'd be 40 seconds.
Re: stretching the Super FX
Let me rephrase that. There's no way to get 4 GB of instantly readable non-volatile memory for less than a few hundred dollars?
Then again, this might actually be part of the answer; see below...
Quote: Programming time for modern not-SST flash is slow. At least, compared to how big they are. Spansion/Cypress/Infineon S29GLxxT flash is worst case (and not atypically) 32 bytes in 512 microseconds, or almost a day to program 4GB.
Yeah, but what causes the worst case? I'd imagine an application where the whole chip gets reprogrammed in one shot on rare occasions should see performance closer to nominal. And with the biggest available chip being 1/16 of the desired capacity, the source data could easily be interleaved to allow all 16 chips to be programmed at once. Based on the specs of the Cypress S70GL02GS, I'd expect about 12 minutes total for erasing and programming the whole thing, not far off the time it'd take to read the data off the DVD.
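For a rough sense of scale, the worst-case figure quoted above can be worked through directly. This is only a sketch: the constant and function names are made up for illustration, "4 GB" is taken as 4 GiB, the 16-way split assumes all chips program in perfect lockstep, and erase time is ignored.

```python
# Worst-case NOR programming time from the figures quoted in the thread:
# 32 bytes per 512 microseconds. All names here are illustrative only.
TOTAL_BYTES = 4 * 2**30          # "4 GB" taken as 4 GiB (assumption)
WRITE_BUFFER_BYTES = 32          # bytes programmed per operation
WORST_CASE_US = 512              # microseconds per operation, worst case


def program_time_s(total_bytes: int, parallel_chips: int = 1) -> float:
    """Seconds to program total_bytes split evenly across parallel_chips."""
    bytes_per_chip = total_bytes // parallel_chips
    operations = bytes_per_chip // WRITE_BUFFER_BYTES
    return operations * WORST_CASE_US / 1_000_000


single_chip_hours = program_time_s(TOTAL_BYTES) / 3600
sixteen_chip_minutes = program_time_s(TOTAL_BYTES, parallel_chips=16) / 60

print(f"one chip at a time: {single_chip_hours:.1f} h")
print(f"16 chips interleaved: {sixteen_chip_minutes:.1f} min")
```

Under these assumptions the worst case comes to roughly 19 hours serially and a bit over 70 minutes with 16 chips in parallel, so the 12-minute estimate only holds if the typical timings can be relied on.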
Still not ideal, especially if the "typical" programming times can't be relied on as you suggest. And the NOR flash price issue persists...
Better than using an optical disc, especially if you don't need to load the whole thing before starting (although in a game with a save feature you might).
And if you're selling a physical cartridge, it's probably reasonable to supply the data medium yourself rather than requiring the user to buy an SD card and download the data, so the storage memory would be a known quantity. In fact you could simply include a package of eight 256Mx16 20 ns NAND flash chips in the cartridge, and get 800 MB/s for a load time of about five seconds (entirely in parallel with the SNES displaying the publisher and developer logos and loading the title screen).
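The load-time figures in this thread can be checked with a few lines of arithmetic. This is a sketch, not a datasheet calculation: the names are hypothetical, it assumes each x16 chip delivers one 16-bit word per 20 ns cycle with no command or page-read overhead, and it compares against a flat 100 MB/s source for the SD card case.

```python
# Load-time arithmetic for the NAND array described above.
# Names are illustrative; overheads are ignored (assumption).
CHIPS = 8
WORDS_PER_CHIP = 256 * 2**20     # "256M" words per chip
BUS_BITS = 16                    # x16 organisation
CYCLE_NS = 20                    # one word per 20 ns cycle (assumption)

bytes_per_word = BUS_BITS // 8
capacity_bytes = CHIPS * WORDS_PER_CHIP * bytes_per_word

# Integer arithmetic: words per second is 1e9 ns divided by the cycle time.
chip_bytes_per_s = bytes_per_word * (1_000_000_000 // CYCLE_NS)
array_bytes_per_s = CHIPS * chip_bytes_per_s


def load_time_s(total_bytes: int, bytes_per_s: int) -> float:
    """Seconds to stream total_bytes at a sustained rate of bytes_per_s."""
    return total_bytes / bytes_per_s


nand_load_s = load_time_s(capacity_bytes, array_bytes_per_s)
sd_load_s = load_time_s(capacity_bytes, 100_000_000)

print(f"capacity: {capacity_bytes // 2**30} GiB")
print(f"array bandwidth: {array_bytes_per_s // 1_000_000} MB/s")
print(f"NAND array load: {nand_load_s:.1f} s, 100 MB/s source: {sd_load_s:.1f} s")
```

With binary gigabytes the results land slightly above the round numbers used in the posts (about 5.4 s and 43 s), which doesn't change the conclusion.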
It's not incredibly cheap, but it's in the ballpark. 4 GB of 1600 MHz DRAM is about $15 on Mouser (EDIT: no it's not; that's 4 Gbit. Nuts), and a 512 MB NAND flash chip is about $5.
The audio side of the MSU1 would also need storage, but it doesn't have to have sub-microsecond latency. More NAND flash?
Last edited by 93143 on Wed Apr 20, 2022 2:36 pm, edited 2 times in total.
Re: stretching the Super FX
Maybe there's enough space on the board to include an M.2 slot - that DRAM could already be filled and playing a startup sound when the publisher screen fades in. And an HDMI port for those who want sharp pixels. And an RJ-45 port for the netplay. And USB ports, because original controllers are getting old.
It could become the Next Unit of Computing for SNES development.
My current setup:
Super Famicom ("2/1/3" SNS-CPU-GPM-02) → SCART → OSSC → StarTech USB3HDCAP → AmaRecTV 3.10
Re: stretching the Super FX
Not sure what you're getting at here.
I just figured that since the MSU1 is arguably a legacy add-on, but has no definitive hardware implementation, any implementation that matches the specification is fair game. And the use of a single SD card for both data and audio (the way the FXPak does it) really does make random access to moderate-sized slices of data more difficult than it needs to be.
Maybe the proper response really is to just code around the latency, and I'd have to do that anyway if I wanted the game to work on an FXPak at all. Still, I think it would be nice to be able to push the MSU1 a bit harder and have it run well on a real dedicated cartridge, even if the result is that it lags more on an FXPak.
...
This isn't actually an enabling feature for the U-ROM idea, so I guess it's a little beside the point. The two ideas simply play well together, since a fast MSU1 makes U-ROM easier to load, and U-ROM's higher capacity and more efficient GSU access vs. GPRAM makes a fast MSU1 more worthwhile.
Last edited by 93143 on Wed Apr 20, 2022 2:20 pm, edited 1 time in total.
Re: stretching the Super FX
Right. As long as you're talking about SNES memory access cycle latency.
If your timing is permitted to be a few orders of magnitude slower, but still faster than typical managed NAND, then unmanaged NAND can achieve latencies in the 30-300µs range (depending on specific part).
Quote: 4 GB of 1600 MHz DRAM is about $15 on Mouser
Do you mean 4 Gibit? The closest I could find was two 16 Gibit chips for roughly $20 each. Working from commodity parts (DIMMs instead of bare chips) will be cheaper.
Quote: and a 512 MB NAND flash chip is about $5.
Note that unmanaged NAND is not guaranteed to be completely viable when purchased. You'll either have to pay a premium for "known good" grade NAND, or else add some primitive filesystem and redirection, in which case what's programmed in will vary from PCB to PCB.
Quote: The audio side of the MSU1 would also need storage, but it doesn't have to have sub-microsecond latency. More NAND flash?
Commodity NAND would be fine here - latency less than one vsync won't be perceptible.
When I grumped about it, Near replied that he explicitly intended there be two disjoint chunks of memory for the two halves and none of this interleaving access to a fast SD card.
Re: stretching the Super FX
...yes. I noticed later on that that site sometimes misclassifies parts that way (where GB actually means Gb), but I didn't go back and check. Now that I have, it seems the cheapest 16 Gbit DRAMs are indeed a little over $20 in large quantities.
Anyway, it's still somewhat in the ballpark. Much better than NOR flash. And if there's a cheaper way to get 4 GB of DRAM, that's great.
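The bit/byte mix-up is easy to sanity-check. The sketch below uses hypothetical names, treats Gbit and GB as binary units, and takes $20 per 16 Gbit chip as a floor based on the "a little over $20" figure above, so the total is a lower bound rather than a quote.

```python
# How many 16 Gbit DRAM chips make up 4 GB, and what a mislabeled
# "4 GB" (really 4 Gbit) part actually holds. Names are illustrative.
GIBI = 2**30

target_bytes = 4 * GIBI
target_bits = target_bytes * 8

chip_bits = 16 * GIBI                      # one 16 Gbit part
chips_needed = (target_bits + chip_bits - 1) // chip_bits   # ceiling division

PRICE_FLOOR_USD = 20                       # assumption: "a little over $20" each
min_dram_cost_usd = chips_needed * PRICE_FLOOR_USD

mislabeled_bytes = 4 * GIBI // 8           # a 4 Gbit part expressed in bytes

print(f"{chips_needed} chips, at least ${min_dram_cost_usd}")
print(f"4 Gbit = {mislabeled_bytes // 2**20} MiB")
```

So the part first priced at $15 would have covered only one eighth of the target capacity.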
Quote: Note that unmanaged NAND is not guaranteed to be completely viable when purchased. You'll either have to pay a premium for "known good" grade NAND, or else add some primitive filesystem and redirection and what's programmed in will vary from PCB to PCB.
Hmm.
SD card makers keep their prices reasonable somehow, right? Technically this component is common between the DRAM idea and the FXPak implementation; they both need enough NAND flash to hold the MSU1 data. It's just that the FXPak (understandably) shifts the cost to the user rather than including it in the product...
Then again, I've noticed that NAND flash seems to have a price bathtub that we're working in the shallow end of...
Quote: When I grumped about it, Near replied that he explicitly intended there be two disjoint chunks of memory for the two halves and none of this interleaving access to a fast SD card.
That's kinda what I thought. I guess this could be taken to mean that the FXPak implementation isn't strictly canonical, and that there's still room for an approach like the one we've been discussing to be more 'authentic'. The emphasis on low latency seems to disregard the word "streaming" in the name, but I figure it still qualifies if you can't run branched code out of it...
The initial loading isn't a problem for the standard (I think) because the Data Busy bit can simply remain set until the loading is finished. Alternately, the memory controller might be smart enough to let the SNES access the flash repository directly for uncached data, resulting in an FXPak-like situation until the DRAM is loaded. I don't know if this would add too much cost to the memory controller, or potentially too much uneven stress on the flash repository (which, if built-in, would be expected to last for decades)...
...
I suppose it would be more in keeping with the original idea to have a passthrough cartridge with some DRAM and one or two SD card slots. But I like the idea of just shoving everything into a standard-size Game Pak and having it be completely transparent to the user, the way every other add-on chip was handled. Plug and play with no extra fiddling around, and only the logo on the label tells you why you're hearing Red Book audio and seeing brute-forced sprite animation that would shame a Neo Geo.
Re: stretching the Super FX
Here's a crazy idea.
The officially documented CPU ROM option already allows up to 6 MB mapped to banks $80-$FF in HiLoROM fashion in a Super FX cartridge. I just took a look at the S-DD1 memory map, and it looks like it fits nicely in $C0-$FF. Since the S-DD1 appears to have four banking bits, this should allow 16 MB behind the chip plus 2 MB in $80-$BF, for a total of 18 MB of CPU ROM (plus fast decompression if desired), on top of (hopefully) 4 MB of ROM and/or U-ROM for the use of the Super FX.
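The 18 MB total can be laid out as a short calculation. This is a sketch under stated assumptions, with made-up names: ROM in banks $80-$BF is assumed visible only in the upper 32 KB of each bank, and the S-DD1 is assumed to split $C0-$FF into four 1 MB slots, each with a four-bit page select.

```python
# CPU-visible ROM budget for the proposed Super FX + S-DD1 layout.
# Names and the slot/page model are assumptions for illustration.
KIB = 1024
MIB = 1024 * KIB

# Banks $80-$BF: ROM assumed mapped only at $8000-$FFFF of each bank.
lo_banks = 0xBF - 0x80 + 1                 # 64 banks
lo_region_bytes = lo_banks * 32 * KIB      # 2 MiB

# Banks $C0-$FF: 64 full 64 KiB banks form the window for the S-DD1.
hi_banks = 0xFF - 0xC0 + 1                 # 64 banks
window_bytes = hi_banks * 64 * KIB         # 4 MiB visible at any one time

# Assumed banking: four bits per slot select one of 16 pages of 1 MiB.
BANK_SELECT_BITS = 4
PAGE_BYTES = 1 * MIB
behind_sdd1_bytes = (1 << BANK_SELECT_BITS) * PAGE_BYTES   # 16 MiB

cpu_rom_bytes = behind_sdd1_bytes + lo_region_bytes

print(f"$80-$BF: {lo_region_bytes // MIB} MiB")
print(f"behind S-DD1: {behind_sdd1_bytes // MIB} MiB "
      f"({window_bytes // MIB} MiB visible at once)")
print(f"total CPU ROM: {cpu_rom_bytes // MIB} MiB")
```

Only 4 MB of the 16 MB behind the chip would be addressable at any moment, so the rest depends on software rewriting the page selects.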
Now the big question is current draw. With a GSU2 (or GSU2-SP1, which might need less power), an S-DD1, 128 KB of SRAM, 128 KB of F-RAM, 3 MB of PSRAM, and 19 MB of mask ROM, on top of the base console's power requirements, I wonder if this configuration actually works... At least with the MSU1, the option exists to do it as a passthrough device with its own power supply, so it's not strictly necessary to add 4 GB of DRAM and 8 GB of NAND flash to that list...
Re: stretching the Super FX
Regarding pin 21, it seems to me that it should be fairly easy to test, given a Super FX devcart and the appropriate equipment. All you'd need is a program that sets bit 5 of ROMBR and repeatedly writes to R14, and a way to monitor the voltage on the pin in question. (Or perhaps it'd be better to scan through the banks, so the pin activity has context.) Am I oversimplifying it?
Is anyone in a position to actually test this? I could easily put together a program for this purpose.
Re: stretching the Super FX
I don't have a flashcart with a SFX*2* donor, but I can test on a SFX1 and ... maybe a Mario Chip? I can't check at the moment.
Re: stretching the Super FX
Well, fullsnes labels pin 31 on the GSU1/MARIO chip as "?", and there are multiple GSU2 pins with the same label, but it's only pin 21 on the GSU2 that anyone seems to think could be "/CE for 2nd ROM chip".
Any idea what the rationale is for noting that on the GSU2's pin 21 specifically, and not on the other unknown pins?
...
Is there any chance multiple unknown pins on the GSU2 are extra ROM address or chip enable pins? The docs show ROMBR going all the way up to A23...
Re: stretching the Super FX
Did you check if the SFX source happened to be in one of the gigaleaks? Forbidden knowledge etc etc, but would reveal things like this.
Re: stretching the Super FX
I didn't. It might be too late or too dangerous now.
I've seen a couple of lists of what was included, but they may be incomplete. They do mention verilog for the N64, but nothing for the Super FX.
Re: stretching the Super FX
The patents then? https://patents.google.com/patent/US5724497A/en
Re: stretching the Super FX
I don't think a patent will be all that useful in finding a definitive answer to this question. The data in patents tends to be slightly incorrect on purpose, if I recall correctly, providing just enough descriptive detail to protect the actual idea while being confusing enough to prevent theft of the idea before the patent is granted. And an idea like this that can be implemented in a lot of different ways may be described one way in the patent and then actually implemented in a different way - this might not even be obfuscation, but simply the fact that the design process was still ongoing while the patent application was being drawn up.
...
Figure 1 seems to describe a chip with a 20-bit ROM address bus, consistent with pre-GSU2 models of the Super FX. This seems to clash with their claim in the description that "the program ROM typically is four megabytes", although that might refer to SNES games in general.
The pinout description in the text doesn't seem to match any known iteration of the Super FX, and suggests an 18-pin RAM address bus and a 25-pin ROM address bus. I'm not enough of a hardware guy to translate that into bits (I suspect some additional lines are considered part of the address bus here), but if the GSU2 is thought to have a 17-bit RAM bus, this suggests to me that it was intended to have a 24-bit ROM bus, which seems excessive until you realize just how many unidentified pins there are in nocash's GSU2 pinout (plus the SNES-side input address bus on the GSU2 does go up to A23)...
Then again, later in the text it states that "The ROM A bus is a 20-bit ROM address bus.", which is clearly pre-GSU2 (in accordance with Fig. 1) and thus irrelevant to the question of what the GSU2's maximum ROM size is, but does leave me wondering what the rest of P2-P26 are supposed to be for...
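To put the bus widths being compared into bytes, here is a minimal sketch. It assumes a byte-wide ROM with one byte per address and no extra chip-enable lines extending the range; the function name is made up for illustration.

```python
# Addressable ROM size for the address-bus widths discussed above.
# Assumes one byte per address and no additional chip selects.
MIB = 2**20


def addressable_bytes(address_bits: int) -> int:
    """Bytes reachable with the given number of address lines."""
    return 1 << address_bits


for bits in (20, 21, 24):
    print(f"{bits}-bit bus: {addressable_bytes(bits) // MIB} MiB")
```

On that reading, the 20-bit bus in the patent covers 1 MB, a 24-bit bus would cover 16 MB, and anything between depends on which of the unidentified pins turn out to be address or enable lines.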
...
It repeatedly mentions how much longer ROM access takes than RAM access, and states that the Mario chip is designed to take advantage of this, even though their actual examples assume ROM and RAM both take 6 cycles to access (the real chip should be 3 for both or 5 for both depending on speed setting, unless I'm greatly misled).
It also states that the external clock line is intended to allow the Mario chip to operate at higher speeds than 21 MHz, and suggests that the core and memory controller could be clocked separately, with the buses on the SNES master clock and the core on its own oscillator. (This is almost the opposite of my own "clock trick", namely using a double-speed oscillator and setting the chip to slow mode, which was intended to speed up the memory buses but leave the core at 21 MHz...)
Gotta love patent legalese:
Quote: The Super NES includes within its control deck 20, a 16-bit host CPU which may, for example, be a 65816 compatible microprocessor.
This patent was filed in 1994.
Re: stretching the Super FX
So basically Randy Linden says that the Doom cartridge had a max of 2 MB of ROM, but he doesn't really confirm that it's the max of the FX chip.
https://www.youtube.com/watch?v=BIauSQ_ ... u4V4AaABAg