Posted: Mon Aug 16, 2010 1:05 pm
It's not the 65816 nor the PPU that bottlenecks the system, it's the interface between the 65816 and the PPU that bottlenecks both the 65816 and the PPU.
That's a good point which I wasn't thinking of. Though I think that Nintendo was in such a powerful position after the Famicom that they didn't need to baby the developers. They should have just chosen the best CPU and other hardware components they could have.
blargg wrote:
Or developer compatibility. By using a similar scheme as the NES, all the NES developers would be able to pick up the SNES more quickly. Same for the graphical scheme, which is quite similar.
Who knows, maybe Nintendo went with the 65816 rather than the 68000 because they had envisioned NES compatibility.
That depends on what you mean by bottleneck. The 65816 certainly is a barrier as you don't have a whole lot of CPU time. I'm not certain that the CPU<->PPU interface (I assume you mean speed of DMA) is a huge issue. I mean everyone would love even faster DMA so you could update even more tiles and background data each frame.
psycopathicteen wrote:
It's not the 65816 nor the PPU that bottlenecks the system, it's the interface between the 65816 and the PPU that bottlenecks both the 65816 and the PPU.
It still would have slowed developers, no matter how much market force Nintendo had. Slowing their developers would weaken their position in relation to the competition. With the 65816, a lot of code could be ported very easily, with minimal changes.
MottZilla wrote:
That's a good point which I wasn't thinking of. Though I think that Nintendo was in such a powerful position after the Famicom that they didn't need to baby the developers. They should have just chosen the best CPU and other hardware components they could have.
blargg wrote:
Or developer compatibility. By using a similar scheme as the NES, all the NES developers would be able to pick up the SNES more quickly.
Are you kidding me? You've seen how CPU-intensive my game is, and that doesn't even take half the available CPU time, and I'm not doing that much optimization either.
MottZilla wrote:
That depends on what you mean by bottleneck. The 65816 certainly is a barrier as you don't have a whole lot of CPU time.
Sort of. I was referring to the SNES's redundant register set. They could've had far fewer PPU hardware registers. I pull my hair out whenever I need to use the PPU.
I'm not certain that the CPU<->PPU interface (I assume you mean speed of DMA) is a huge issue. I mean everyone would love even faster DMA so you could update even more tiles and background data each frame.
Comparing the SVP to the SuperFX would actually be more fair (isn't the SuperFX faster than the SA-1?).
byuu wrote:
There's also the SA-1, if you count it. Out of I-RAM, it's two clocks per cycle, so you can triple your theoretical instruction counts. But more often than not you need lots of BW-RAM, which is four clocks per cycle. Could compare to the SVP to make that more fair.
No, there isn't. If you want to load an entire frame on the Mega Drive (assuming no cropping and such), you're looking at 4 frames per update (well, on NTSC at least; PAL has over double the transfer rate in blank x_X). That's why you're meant to use sprites, tilemaps, etc. =P
byuu wrote:
CPU<>PPU being a bottleneck sounds silly. Maybe for 256x240 output on NTSC, sure. But you can always crop the screen vertically and get a lot more time, SFA2 and similar games do this. There's enough bandwidth to play full motion video that blows away even the Mega CD. Just not enough ROM space for it.
Super FX is heavily specialized for 3D graphics, as is the Sega Virtua Processor. The SVP, which appears to be based on a Samsung SSP1601 core, is clocked at roughly the same speed as the FX2. SA-1, on the other hand, is a general-purpose application coprocessor, as are the SuperH family CPUs in the 32X, Saturn, and Dreamcast.
Sik wrote:
Comparing the SVP to the SuperFX would actually be more fair (isn't the SuperFX faster than the SA-1?).
A full frame in either console's 256-pixel-wide mode is 256x224x4bpp, or 28 KiB. I've been quoted 7 KiB per vblank for DMA copying on SNES too. But if you cut the full-motion video to a cinematic display aspect ratio with 160 lines, I don't see why you can't easily fit 30 fps with an external FMV decoder. Imagine a coprocessor that decodes WebM from an SD card soldered onto the cartridge.
if you want to load an entire frame on the Mega Drive (assuming no cropping and such), you're looking at 4 frames per update (well, on NTSC at least; PAL has over double the transfer rate in blank x_X). That's why you're meant to use sprites, tilemaps, etc. =P
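The frame-size arithmetic in the post above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the 7 KiB per vblank budget is the quoted figure for an uncropped display, not a measured one.

```python
# Figures from the post: 256 pixels wide, 4 bits per pixel,
# and a quoted DMA budget of about 7 KiB per vblank (assumption, not measured).
WIDTH = 256
BITS_PER_PIXEL = 4
DMA_PER_VBLANK = 7 * 1024  # bytes

def frame_bytes(lines):
    """Size in bytes of a 4bpp bitmap that is `lines` scanlines tall."""
    return WIDTH * lines * BITS_PER_PIXEL // 8

def vblanks_needed(lines, per_vblank=DMA_PER_VBLANK):
    """Number of vblank periods needed to upload one such frame (rounded up)."""
    return -(-frame_bytes(lines) // per_vblank)

print(frame_bytes(224), vblanks_needed(224))  # full 224-line frame
print(frame_bytes(160), vblanks_needed(160))  # 160-line "cinematic" frame
```

A 224-line frame comes to 28672 bytes (28 KiB), which takes 4 vblanks at 7 KiB each, in line with the "4 frames per update" figure. A 160-line frame is 20480 bytes (20 KiB). The sketch holds the budget fixed at 7 KiB, so it understates the cropped case: cutting lines off the display also lengthens the blanking period, which is what would make 30 fps plausible.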
You don't even want to know the bandwidth without DMA.
Byuu, from what I know you're kind of an authority when it comes to SNES, so can you give some numbers regarding VRAM bandwidths per line/frame with and without DMA ...?
The SuperFX is garbage. It is specialized to the point of absolute insanity. The SA-1 by comparison is a full general-purpose CPU with lots of nifty tools like bitmap<>bitplane conversion, H/V/counter IRQs, vector address override, RAM/ROM protect, two distinct DMA modes, etc.
Comparing the SVP to the SuperFX would actually be more fair (isn't the SuperFX faster than the SA-1?).
I guess that explains this (make sure it seeks to 6:36):
tepples wrote:
The SVP, which appears to be based on a Samsung SSP1601 core, is clocked at roughly the same speed as the FX2.
Still my point stands, the SVP is more akin to the SuperFX than to the SA-1.
byuu wrote:
The SuperFX is garbage. It is specialized to the point of absolute insanity. The SA-1 by comparison is a full general-purpose CPU with lots of nifty tools like bitmap<>bitplane conversion, H/V/counter IRQs, vector address override, RAM/ROM protect, two distinct DMA modes, etc.
Hehe, must be really slow then
byuu wrote:
You don't even want to know the bandwidth without DMA.
This does not sound that bad at all IMO... it is still slower than the MD in 256-pixel modes (~9.5KB/f at 60Hz, ~18KB/f at 50Hz), but not by a major margin.
byuu wrote:
But with it ... you have effectively 1324 cycles per scanline. DMA consumes 8 cycles per byte transferred. NTSC mode has 262 scanlines @ 60hz, PAL mode has 312 scanlines @ 50hz.
My numbers are in bytes per frame. So for your average 256x224 NTSC game, that gives you 6.28K/f. For your average PAL game, that gives you 11.9K/f. NTSC can use 256x240 at 3.6K/f, and PAL can use 256x224 at 14.56K/f.
You can however disable the screen using force blank, and take lines off the top and bottom, which allows you to increase bandwidth further. Say you cut an 8-pixel row off the top and bottom, which would be lost to overscan anyway, you can get 8.93K/f for NTSC. The video rendering trick cuts off even more lines, and uses page-swapping at 20-30fps to double or triple that rate.
Without the active display getting in your way, the maximum bandwidth is 43.36K/f, or 2.68M/s.
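The per-frame figures in the quoted post follow from its own constants: 1324 usable cycles per scanline, 8 cycles per DMA byte, and 262 or 312 scanlines per frame. Here is a minimal Python sketch of that arithmetic. The function and constant names are hypothetical, and it assumes DMA runs only on scanlines that are not being displayed.

```python
# Constants taken from the quoted post; everything else is illustrative.
CYCLES_PER_LINE = 1324   # effective cycles available per scanline
CYCLES_PER_BYTE = 8      # DMA cost per byte transferred
LINES_PER_FRAME = {"NTSC": 262, "PAL": 312}

def dma_bytes_per_frame(region, active_lines):
    """Bytes DMA can move per frame on the scanlines not being displayed."""
    blank_lines = LINES_PER_FRAME[region] - active_lines
    return blank_lines * CYCLES_PER_LINE / CYCLES_PER_BYTE

cases = [
    ("NTSC", 224),  # typical NTSC display
    ("NTSC", 240),  # overscan mode
    ("NTSC", 208),  # 8 lines force-blanked at top and bottom
    ("PAL", 240),
    ("PAL", 224),
    ("NTSC", 0),    # display off for the whole frame
]
for region, lines in cases:
    print(region, lines, dma_bytes_per_frame(region, lines))
```

This gives 6289, 3641, 8937, 11916, 14564 and 43361 bytes, which line up with the quoted 6.28K, 3.6K, 8.93K, 11.9K, 14.56K and 43.36K if K means 1000 bytes and the values are truncated. Note that the 11.9K PAL figure corresponds to 240 active lines. Multiplying 43361 by 60 gives about 2.60M/s, a little under the quoted 2.68M/s, so that last figure was presumably derived some other way.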
Those are SuperFX instructions, but yes. I've tried to make a juvenile program, but haven't had much luck.
Sik wrote:
EDIT: talking about SA-1, these instructions are listed in the docs: http://srb2town.sepwich.com/junk/lolinstructions.PNG
(just mentioning for the sake of it, couldn't miss the chance XD)
Code:
SexyPlot:
STOP
LINK
WITH
CACHE
AND
HAVE
SEX
OR
; ... I got nothin'
ShakespeareanFilter:
TO
STOP
OR
NOT
TO
STOP
; That is the question, that I might ask.
UNIX is the same way: Not exactly safe for work (the UNIX counterpart to Windows chkdsk is close to an indecent word)
byuu wrote:
Something like:
Code:
SexyPlot:
STOP
LINK
WITH
CACHE
AND
HAVE
SEX
OR
; ... I got nothin'