stretching the Super FX

Dwedit
Posts: 4924
Joined: Fri Nov 19, 2004 7:35 pm

Re: stretching the Super FX

Post by Dwedit »

The SNES DOOM programmer said that by doing a horizontal stretch operation on the SNES rather than on the Super FX chip, you can basically double the frame rate of Doom. Too bad he didn't know that in 1994.

So that would be "stretching" the Super FX.
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!
tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: stretching the Super FX

Post by tokumaru »

I'm fairly certain that he didn't say the frame rate would double, seeing as even though the game would be drawing half as many pixels, all the operations done to calculate the color of each pixel would remain the same. Plotting each pixel once instead of twice would certainly save some time, but I doubt that the performance boost would be as drastic as 100%.
93143
Posts: 1717
Joined: Fri Jul 04, 2014 9:31 pm

Re: stretching the Super FX

Post by 93143 »

M2m wrote: Mon Sep 26, 2022 5:57 am So basically Randy Linden says that the Doom cartridge had a max of 2 MB ROM - but not really confirming it's the max of the FX chip

https://www.youtube.com/watch?v=BIauSQ_ ... u4V4AaABAg
Oh hey, that's new.

Yeah, I'm not sure that's definitive. It certainly matches the conventional wisdom; it's what everyone has thought up until now. The known pinout supports 2 MB, and to my knowledge Nintendo never offered more. The unknown pinout, though, is what I'm curious about, and while it is probable that Nintendo knew what pin 21 did, there's no reason to suppose they (or Argonaut) told anyone else.

The extant manual, which is from late in the system's lifespan, lists a cartridge configuration with twice the Game Pak RAM the Doom cartridge had, as well as 6 MB of extra ROM for the SNES CPU. This would have resulted in a far more expensive cartridge than the Yoshi's Island configuration Randy describes as being the only one more expensive than the Doom configuration at the time the latter was chosen. It still only supports 2 MB for the main Super FX ROM - but the memory map has a mirrored region that suggests the possibility of 4 MB. We still don't know whether pin 21 is "/CE for 2nd ROM chip" as nocash suggests, but either way that design is begging for it...

...

That devcart description is interesting. It suggests that you can rig a real Super FX cartridge to allow writes to Game Pak ROM. This implies that in principle, CPU ROM or an S-DD1 (or an MSU1) could be used to circumvent the ROM limits on data the GSU needs access to, for level data if nothing else (a bank or two should do it for level data, but swapping textures would require a large chunk of the address space to be expensive RAM).

...

I'm not sure Mode 7 would have worked as well for the automap. It would probably have been pixelated while zoomed in, for one. And even when not zoomed in, nearest-neighbour texture mapping of single-pixel lines doesn't tend to have nearly as good a result as Bresenham line drawing.

Dwedit wrote: Mon Sep 26, 2022 1:04 pm The SNES DOOM programmer said that by doing a horizontal stretch operation on the SNES rather than on the Super FX chip, you can basically double the frame rate of Doom. Too bad he didn't know that in 1994.

So that would be "stretching" the Super FX.
...I think he might have gotten that idea from me. It sounds like he's talking about the mosaic trick.

Also, reading that old stuff I'm starting to think I was perhaps a bit hard on the port for the player getting stuck on geometry. After having played the PC version a bunch, I think the SNES port is still worse, but not by as much as I thought. Trying to sneak behind the exit teleporter in E3M2 and then sneak back out is a great example: it's entirely possible to get totally stuck on nothing and have to back off and try again. This is on the PC version...

tokumaru wrote: Mon Sep 26, 2022 4:22 pm I'm fairly certain that he didn't say the frame rate would double, seeing as even though the game would be drawing half as many pixels, all the operations done to calculate the color of each pixel would remain the same. Plotting each pixel once instead of twice would certainly save some time, but I doubt that the performance boost would be as drastic as 100%.
The mosaic trick - setting mosaic to 2x2 and exploiting scroll HDMA to double pixels horizontally but not vertically - has sort of a neat feature that plays well with how Doom renders. Or rather, it doesn't play well exactly, but it does a bit of sanding to better fit the square peg of Doom's rendering approach into the round hole of SNES CHR format.


First, some background (in case you're not tired of me repeating this yet, because apparently I'm not...). The Super FX, if I understand it correctly, is much better at horizontal rasterization than vertical, because it plots directly into SNES CHR format in which each byte spans eight pixels in a row. The Super FX caches one such row, or "sliver" as some call it, in each of a pair of pixel caches: a primary cache for fast plotting, and a secondary cache to write the results to RAM while a new sliver is being addressed in the primary cache. If you fill the primary cache, or plot outside it, it dumps to the secondary cache, which starts updating the framebuffer in the background while you plot more pixels into the primary cache. But if the secondary cache is still busy with a previous sliver when the primary cache is invalidated, the whole core stalls until the secondary cache finishes its job and the cache transfer can occur.
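To make the row-versus-column difference concrete, here is a toy Python model of the flush behaviour described above. It only counts how often the primary pixel cache would have to be dumped for a given plotting order. The function name and the (x // 8, y) sliver key are illustrative assumptions, not anything taken from the chip documentation, and the model ignores timing and stalls entirely.

```python
def count_sliver_flushes(plots):
    """Count primary pixel cache flushes for a sequence of plotted pixels.

    plots: iterable of (x, y) framebuffer coordinates in plotting order.
    Assumption: a sliver is the 8 horizontally adjacent pixels sharing
    (x // 8, y), and plotting outside the current sliver forces a flush.
    """
    flushes = 0
    current = None
    for x, y in plots:
        sliver = (x // 8, y)
        if current is not None and sliver != current:
            flushes += 1  # plotted outside the cached sliver
        current = sliver
    if current is not None:
        flushes += 1  # whatever is still cached must be written out too
    return flushes


# Eight pixels along a row versus eight pixels down a column.
row = [(x, 0) for x in range(8)]
column = [(0, y) for y in range(8)]

print(count_sliver_flushes(row))     # one sliver touched
print(count_sliver_flushes(column))  # a new sliver for every pixel
```

The same eight pixels cost one flush when drawn as an aligned row and eight when drawn as a column, which is the whole problem with column-based rendering on this chip.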

In high-speed mode at 8bpp, which is what Doom uses, once you fill the pixel caches you're bottlenecked by I/O at (unless I've gravely misunderstood the chip) 80 cycles for an unfinished sliver regardless of whether you draw one pixel or seven. The reason is that if you want to keep any of the pixels that are already in RAM, the Super FX has to read and then write all 8 pixels, at 5 cycles per byte each way, because every framebuffer byte contains one bit from every pixel in the sliver. This means that vertical rasterization is horribly slow, since you can only draw one pixel (or in SNES Doom's case, two identical pixels) before you have to trigger a new cache flush. And Doom renders a lot of stuff in columns. If my calculations are correct, even at SNES Doom's low resolution, a viewport full of nothing but wall costs three and a half PPU frames just to texture map it.
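The arithmetic behind those figures can be sketched in a few lines of Python. The 5 cycles per byte and 8 bytes per sliver come from the paragraph above; the viewport size (108 doubled columns by 144 rows), the roughly 21.477 MHz clock and the 60 Hz frame rate are assumptions on my part, chosen because they reproduce the "three and a half PPU frames" estimate.

```python
CYCLES_PER_BYTE = 5    # from the post: 5 cycles per byte, each way
BYTES_PER_SLIVER = 8   # 8bpp: eight bitplane bytes per 8-pixel sliver


def partial_sliver_cycles():
    """Cost of flushing a sliver that keeps some pixels already in RAM."""
    read = BYTES_PER_SLIVER * CYCLES_PER_BYTE   # fetch existing bitplanes
    write = BYTES_PER_SLIVER * CYCLES_PER_BYTE  # write merged bitplanes
    return read + write


# ASSUMPTIONS (not stated in the post): viewport size and clock rates.
VIEW_COLUMNS = 108           # doubled-pixel columns
VIEW_ROWS = 144
GSU_CLOCK_HZ = 21_477_000    # approximate high-speed clock
PPU_FRAMES_PER_SECOND = 60   # approximate NTSC frame rate


def wall_fill_ppu_frames():
    """PPU frames spent on I/O if every viewport pixel is a column pixel."""
    slivers_flushed = VIEW_COLUMNS * VIEW_ROWS  # one flush per pixel pair
    total_cycles = slivers_flushed * partial_sliver_cycles()
    cycles_per_ppu_frame = GSU_CLOCK_HZ / PPU_FRAMES_PER_SECOND
    return total_cycles / cycles_per_ppu_frame


print(partial_sliver_cycles())
print(round(wall_fill_ppu_frames(), 2))
```

Under these assumptions a partial sliver costs 80 cycles and a wall-filled viewport comes out at roughly three and a half PPU frames of pure framebuffer I/O.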

This also suggests that the pixel doubling in SNES Doom is effectively free, since just drawing one pixel per sliver instead of two would still bottleneck at 80 cycles for the sliver.


But with the mosaic trick, the tile format does something interesting - the second pixel in a sliver appears directly below the first when displayed onscreen, and the same for the fourth and third, the sixth and fifth, and the eighth and seventh. In other words, when drawing columns you can now draw two unique double-wide texture-mapped pixels before carriage return/pixel cache flush, which halves the cycle cost per pixel if it's bottlenecked by I/O. Since this method also saves over 15 KB of DMA, I think it could easily save a couple of PPU frames in scenes with a lot of column drawing.
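As a sketch of that layout in Python: the function below maps a framebuffer pixel to where it would land onscreen under the mosaic trick as described above. The coordinate convention (even sliver positions on the upper scanline, odd ones on the scanline below, each pixel two screen pixels wide) is my reading of the description rather than a verified PPU model, and the names are hypothetical.

```python
PARTIAL_SLIVER_CYCLES = 80  # read + write of 8 bytes at 5 cycles each


def screen_position(fb_x, fb_y):
    """Map a framebuffer pixel to (left screen x, screen y, width).

    Assumed convention: odd framebuffer x lands one scanline below the
    even x to its left, and every pixel is shown two screen pixels wide.
    """
    screen_x = (fb_x // 2) * 2        # both pixels of a pair share a column
    screen_y = fb_y * 2 + (fb_x % 2)  # odd pixel drops to the next scanline
    return screen_x, screen_y, 2


def column_cycles_per_pixel(unique_pixels_per_flush):
    """I/O-bound cost per unique column pixel."""
    return PARTIAL_SLIVER_CYCLES / unique_pixels_per_flush


print(screen_position(0, 0), screen_position(1, 0))
print(column_cycles_per_pixel(1))  # original: one unique pixel per flush
print(column_cycles_per_pixel(2))  # mosaic trick: two unique pixels per flush
```

With two vertically adjacent screen pixels sharing a sliver, a column costs 40 cycles per unique pixel instead of 80 when I/O is the bottleneck.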

It does make floor drawing somewhat suboptimal (20 cycles per stretched pixel instead of 10 per dithered pixel pair for long runs, unless you can rasterize two rows at once) and it kills the possibility of checkerboard dither, but it's probably worth it. Especially if the floor is textured, which wouldn't fit in 5 or even 10 cycles anyway.
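The floor numbers follow from the same cost model. This Python sketch assumes a completely filled sliver needs no read-back (8 bytes written at 5 cycles each, 40 cycles), which is what the 10-cycles-per-pair figure implies. The two-rows-at-once case is my own extrapolation from that assumption, not a measured result.

```python
FULL_SLIVER_CYCLES = 40     # write only: 8 bytes * 5 cycles
PARTIAL_SLIVER_CYCLES = 80  # read + write: 2 * 8 bytes * 5 cycles


def cycles_per_output_pixel(sliver_cycles, output_pixels_per_sliver):
    """I/O-bound cost per displayed (doubled or stretched) pixel."""
    return sliver_cycles / output_pixels_per_sliver


# Original port: a full sliver holds 4 dithered pixel pairs on one row.
original_floor = cycles_per_output_pixel(FULL_SLIVER_CYCLES, 4)

# Mosaic trick, one row at a time: only 4 of the 8 slots belong to the
# row, so the sliver is partial and must be read back before writing.
mosaic_floor = cycles_per_output_pixel(PARTIAL_SLIVER_CYCLES, 4)

# Mosaic trick, two rows at once (extrapolation): all 8 slots are filled,
# so the sliver is full and carries 8 stretched pixels.
mosaic_two_rows = cycles_per_output_pixel(FULL_SLIVER_CYCLES, 8)

print(original_floor, mosaic_floor, mosaic_two_rows)
```

That gives 10 cycles per dithered pair in the original layout, 20 per stretched pixel with the mosaic trick one row at a time, and 5 if two rows can be rasterized together.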

I wouldn't expect a full doubling of speed over the original port just from this, since other stuff has to happen too, but saving potentially 40 cycles per pixel on walls and skies and mask objects is certainly nothing to sneeze at...
93143
Posts: 1717
Joined: Fri Jul 04, 2014 9:31 pm

Re: stretching the Super FX

Post by 93143 »

tl;dr I think Randy's onto something. Probably not a full 100% performance improvement, but potentially a big chunk of that in spots. Vertical rasterization (which Doom is full of) is surprisingly expensive on Super FX, and the trick I think he's talking about can help with that.