Max colour output

dougeff
Posts: 3078
Joined: Fri May 08, 2015 7:17 pm

Re: Max colour output

Post by dougeff »

Yeah. Even with dither the 4x4x4 picture isn't nearly as good as 5x5x5. I guess I was wrong.
nesdoug.com -- blog/tutorial on programming for the NES
lidnariq
Posts: 11429
Joined: Sun Apr 13, 2008 11:12 am

Re: Max colour output

Post by lidnariq »

Molive wrote:However if you really wanted to show off the 4-4-4 colour space you'd have to use a really high colour image which ends up using more than 256 colours from that space. Currently the bunny one probably uses fewer than the one from the palette mode.
The bunny picture, when scaled down to 256x224 and truncated to 15bpp, has 2k unique colors in it. Yes, that's "only" 1/16th of the full gamut. It's also 8x256.
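For anyone who wants to reproduce that kind of count, here is a minimal sketch. It assumes plain truncation of each 8-bit channel to 5 bits and packs the result in the SNES CGRAM layout; the function names and the tiny sample pixel list are made up for illustration.

```python
def to_rgb555(r, g, b):
    """Truncate 8-bit channels to 5 bits and pack as 0bbbbbgggggrrrrr (CGRAM layout)."""
    return ((b >> 3) << 10) | ((g >> 3) << 5) | (r >> 3)


def count_unique_rgb555(pixels):
    """Count distinct 15-bit colours in an iterable of (r, g, b) 8-bit tuples."""
    return len({to_rgb555(r, g, b) for r, g, b in pixels})


# Hypothetical sample: the first two pixels differ at 24bpp but collapse at 15bpp.
sample = [(255, 0, 0), (250, 3, 7), (0, 255, 0), (0, 0, 255)]
print(count_unique_rgb555(sample))

# The ratios quoted above: 2k colours is 1/16 of the 15-bit gamut and 8 x 256.
print(2048 * 16 == 2 ** 15, 8 * 256 == 2048)
```

In practice the pixel list would come from whatever image loader you already use.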

From the other extreme, you can look at the decade of PC demo effects that used the VGA (and SVGA)'s paletted video modes to know that wide breadths of hues don't have any correlation with color depth. In fact, the biggest problem is huge varieties of desaturated colors, nothing like the garish image there from Overdrive 1.

Quite frankly, I've put a lot of effort into figuring out how to get good photographs displayed on the SNES, and the 8bpp paletted mode is close to 99% of the image quality for 5% of the effort and 2% compromise. Dithering is required regardless of whether it's an 8bpp palette or 15bpp directcolor.
Drew Sebastino
Formerly Espozo
Posts: 3496
Joined: Mon Sep 15, 2014 4:35 pm
Location: Richmond, Virginia

Re: Max colour output

Post by Drew Sebastino »

This is hypothetical and thus not useful, but the sad thing is, if the SNES had 128KB of VRAM as originally intended, the S-PPU could have been made to support something like a 16bpp bitmap mode, as it has just enough graphics bandwidth. (Although you'd only have 8KB for sprites, assuming the bitmap is 256x224.)

Edit: 16KB for sprites; I forgot that while you'd only have 1/8 of total VRAM, you would have 128KB instead of 64KB.
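The arithmetic behind that correction, as a quick sketch. It assumes a full-screen 256x224 bitmap at 2 bytes per pixel and nothing else in VRAM except sprites; the names are just for illustration.

```python
WIDTH, HEIGHT = 256, 224
BYTES_PER_PIXEL = 2  # 16bpp
KIB = 1024

bitmap_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL


def leftover_kib(vram_kib):
    """VRAM left over (in KiB) after the bitmap; negative means it does not fit."""
    return (vram_kib * KIB - bitmap_bytes) / KIB


print(bitmap_bytes / KIB)   # size of the bitmap in KiB
print(leftover_kib(128))    # what would remain with 128KB of VRAM
print(leftover_kib(64))     # the stock 64KB cannot hold the bitmap at all
```

The bitmap takes 7/8 of a 128KB VRAM, which is where the "1/8 of total VRAM" figure comes from.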
93143 wrote: It would probably have cost more silicon than they wanted to spend on a feature that was already way beyond anything their competition could do...
Including everything but the most powerful arcade machines even; it's weird how feature packed the SNES was in comparison to most everything else for how starved for bandwidth it is...
psycopathicteen wrote:I can still tell the difference between shades in 15-bit color most of the time.
Yup; just look at how the sky gradient in the first level of DKC jitters up and down for blending. 12-bit color is much better than 9-bit color, but even it can be pretty ugly. Just look at how the only upgrade besides sound hardware that Capcom made with the CPS2 was upgrading the color from 12-bit to 16-bit. (12-bit RGB with an additional 4 bits of brightness; probably easier to implement or something...)
Last edited by Drew Sebastino on Mon Sep 24, 2018 7:45 am, edited 1 time in total.
Molive
Posts: 50
Joined: Sat Apr 07, 2018 7:39 pm
Location: EN

Re: Max colour output

Post by Molive »

Ah, sorry lidnariq, I didn't know. Thanks for telling me.
That makes a lot of sense, and I think I'll do most of what I need in 8bpp, with some in mode 5 and like one or two really high colour images.

Thanks guys,
Molive
SNES demos are great
93143
Posts: 1715
Joined: Fri Jul 04, 2014 9:31 pm

Re: Max colour output

Post by 93143 »

You can of course change up to 8 arbitrary CGRAM entries per scanline with HDMA. A few years ago I wrote a scheduler in Matlab that starts with the top 256 colours (or fewer if I need space for something else) and goes down the picture changing palette entries that aren't currently needed when out-of-palette colours show up. If it can't find a free HDMA slot, or a single line has too many colours for the palette, it fails and I have to go re-quantize to a lower total colour count and try again. Obviously it would be better to have a tool that simultaneously optimizes the image and schedules the HDMA, but that seems like a lot of work...
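A rough Python sketch of a scheduler along these lines (not the Matlab original, which isn't shown here). Assumptions: the initial palette is filled with colours in order of first appearance from the top, each missing colour is loaded right before the line that needs it, and the entry evicted is the one whose next use is farthest away. Unlike the scheduler described above, this simplified version does not hunt for free HDMA slots on earlier lines; it just fails. All names are hypothetical.

```python
from bisect import bisect_left


def schedule_palette(lines, palette_size=256, max_changes=8):
    """lines: one set of colours per scanline, top to bottom.

    Returns (initial_palette, changes) where changes[y] is a list of
    (slot, colour) writes to perform before scanline y is drawn.
    Raises ValueError when a line cannot be satisfied.
    """
    uses = {}  # colour -> sorted list of scanlines that use it
    for y, colours in enumerate(lines):
        for c in colours:
            uses.setdefault(c, []).append(y)

    def next_use(colour, y):
        rows = uses[colour]
        i = bisect_left(rows, y)
        return rows[i] if i < len(rows) else float("inf")

    # Assumed starting point: first colours encountered from the top.
    slot_of = {}
    for colours in lines:
        for c in sorted(colours):
            if c not in slot_of and len(slot_of) < palette_size:
                slot_of[c] = len(slot_of)
    initial = sorted(slot_of, key=slot_of.get)

    changes = {}
    for y, colours in enumerate(lines):
        if len(colours) > palette_size:
            raise ValueError(f"line {y}: too many colours for the palette")
        missing = sorted(c for c in colours if c not in slot_of)
        if len(missing) > max_changes:
            raise ValueError(f"line {y}: needs {len(missing)} palette writes")
        for c in missing:
            # Evict an entry this line does not need, preferring the one
            # that will not be needed again for the longest time.
            victim = max((v for v in slot_of if v not in colours),
                         key=lambda v: next_use(v, y))
            slot_of[c] = slot_of.pop(victim)
            changes.setdefault(y, []).append((slot_of[c], c))
    return initial, changes


# Toy example with a 2-entry palette and 1 write per line.
demo_lines = [{1, 2}, {2, 3}, {3, 4}, {1, 4}]
print(schedule_palette(demo_lines, palette_size=2, max_changes=1))
```

On failure the caller would re-quantize to fewer total colours and try again, as described above.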

With the above method, I managed 417 colours in a photograph and nearly 600 in a converted title screen backdrop. IMO both looked much closer to the 15-bit RGB version than to the 256-colour version. Performance depends heavily on the material, and also on the quantizer and its settings. IIRC it tends to perform worse on dithered images, which makes sense. I could probably improve the scheduling algorithm if I tried, but I'm really busy and it's good enough for my purposes...

This is admittedly a lot of effort for a modest gain. I developed this method because I'm working on a project that involves a small number of high-colour images, and I wanted them to look as good as possible.

...

Note that the quantizer can make a big difference in the quality of the picture even when just using 256 colours. I use a free standalone program called "Color quantizer" for most things because if you tweak it right you can usually get better results than with the GIMP. There may be a better program out there that I'm unaware of or don't own (Photoshop?).

Also note that, at least in the software I'm used to, reducing the bit depth to 555 from 888 is a separate operation from quantizing to a certain palette size, and your results will vary depending on which one you do first (and which one you do with dither - Color quantizer can load custom .pal files, so you can use its features when bit-reducing to 555 by loading a palette file that corresponds to 15-bit RGB). If you do palettization first, you may end up wasting a lot of palette space on a bunch of colours that end up identical to each other once bit-reduced; if you do the bit reduction first, you have to either dither at the same time (meaning you'll be palettizing a pre-dithered image) or get stuck with banding. And then you may have to bit-reduce again to make sure your image is still 555 (Color quantizer allows you to prevent colour mixing when performing adaptive quantization, but this can harm image quality). If your software can bit-reduce, dither, and palettize all at the same time, tell me what you're using and where I can get some.
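To illustrate the "wasted palette space" problem when palettizing before bit reduction, here is a small sketch. It assumes bit reduction by plain truncation with no dither, and the four-entry palette is invented for the example.

```python
def reduce_888_to_555(rgb):
    """Drop the low 3 bits of each 8-bit channel (plain truncation, no dither)."""
    r, g, b = rgb
    return (r >> 3, g >> 3, b >> 3)


def distinct_after_reduction(palette_888):
    """How many palette entries are still distinct once reduced to 5 bits per channel."""
    return len({reduce_888_to_555(c) for c in palette_888})


# Hypothetical palette picked by a 24-bit quantizer: four distinct entries...
palette = [(200, 100, 50), (201, 101, 51), (203, 103, 53), (208, 100, 50)]
# ...of which the first three become the same 15-bit colour.
print(len(palette), distinct_after_reduction(palette))
```

Every entry that collapses like this is a palette slot that could have gone to a colour the image actually needed.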

...

It should also be possible to change at least 15 and possibly as many as 20 colours per line with regular DMA in an H-IRQ. But since the palette indices have to be sequential, it's less flexible than HDMA unless you're using 4bpp [cough]Mode5[/cough]. (Sadly, it is not possible to use HDMA to change more than 8 colours per line regardless of their arrangement in CGRAM, because there is no mode that sends four bytes to one address.)
Señor Ventura
Posts: 233
Joined: Sat Aug 20, 2016 3:58 am

Re: Max colour output

Post by Señor Ventura »

Drew Sebastino wrote:This is hypothetical and thus not useful, but the sad thing is, if the SNES had 128KB of VRAM as originally intended, the S-PPU could have been made to support something like a 16bpp bitmap mode, as it has just enough graphics bandwidth. (Although you'd only have 8KB for sprites, assuming the bitmap is 256x224.)

Edit: 16KB for sprites; I forgot that while you'd only have 1/8 of total VRAM, you would have 128KB instead of 64KB.
Tiles of 128 bytes each seem like too much to me for a machine with barely 6KB of bandwidth (6.32KB, and WRAM, OAM, and CGRAM need some of it too).

It's all one big domino effect. If you want a faster 65816, you need faster WRAM, and the rest of the hardware has to keep up as well.

128KB of VRAM implies a need for more bandwidth, and in this architecture you can't get that without higher frequencies across the whole pipeline (CPU -> WRAM -> VRAM).

Now I feel the need to open a thread on this topic (in terms of hardware and what was available at the time).
creaothceann
Posts: 610
Joined: Mon Jan 23, 2006 7:47 am
Location: Germany

Re: Max colour output

Post by creaothceann »

Wasn't the restriction to 2.68MHz because of the slow ROM chips of the day + the 8-bit data bus?

A 16-bit data bus is needed, but that doesn't seem possible with a stock 65c816 CPU.
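For reference, both CPU speeds that come up in this discussion fall out of the master clock divided by the number of master cycles per access. A quick sketch, using the approximate NTSC master clock (the constant and function names are just for illustration):

```python
MASTER_CLOCK_HZ = 21_477_272  # NTSC master clock, approximately 21.477 MHz


def cpu_mhz(master_cycles_per_access):
    """Effective CPU rate in MHz for a given number of master cycles per access."""
    return MASTER_CLOCK_HZ / master_cycles_per_access / 1e6


slow = cpu_mhz(8)  # slow accesses: 8 master cycles each
fast = cpu_mhz(6)  # fast accesses: 6 master cycles each
print(round(slow, 2), round(fast, 2))
```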
My current setup:
Super Famicom ("2/1/3" SNS-CPU-GPM-02) → SCART → OSSC → StarTech USB3HDCAP → AmaRecTV 3.10
Señor Ventura
Posts: 233
Joined: Sat Aug 20, 2016 3:58 am

Re: Max colour output

Post by Señor Ventura »

creaothceann wrote:Wasn't the restriction to 2.68MHz because of the slow ROM chips of the day + the 8-bit data bus?
Yes, presumably the WRAM forces DMA to drop to 2.68MHz; I don't know whether DMA natively runs at 3.58MHz.
creaothceann wrote:A 16-bit data bus is needed, but that doesn't seem possible with a stock 65c816 CPU.
The SA-1 in cartridges has a 16-bit data bus to communicate with the cartridge's own internal RAM.

An SA-1 inside the SNES could probably have had a 16-bit data bus to communicate with WRAM, but I don't know whether that would affect the other DMA buses, or how the bus connects to the rest of the components.
CypherSignal
Posts: 34
Joined: Sun Jul 22, 2018 2:36 pm

Re: Max colour output

Post by CypherSignal »

93143 wrote:You can of course change up to 8 arbitrary CGRAM entries per scanline with HDMA. A few years ago I wrote a scheduler in Matlab that starts with the top 256 colours (or fewer if I need space for something else) and goes down the picture changing palette entries that aren't currently needed when out-of-palette colours show up. If it can't find a free HDMA slot, or a single line has too many colours for the palette, it fails and I have to go re-quantize to a lower total colour count and try again. Obviously it would be better to have a tool that simultaneously optimizes the image and schedules the HDMA, but that seems like a lot of work...
I actually got a pretty decent processor going with several days of work that does that harmonious quantization you're wondering about.

The idea I went with was to originally quantize images based on bucketing all of the pixels, determining a bounding volume across the RGB 'axes', and then splitting the bucket with the widest bounding volume into two buckets. Repeat until the max palette size is hit. More or less how most modern palettizing processes operate these days.
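A minimal Python sketch of that base quantizer, not the actual implementation. It assumes "widest bounding volume" means the largest extent along any single RGB axis, that the split happens at the median pixel along that axis, and that each bucket's palette colour is its average; all names are hypothetical.

```python
def channel_ranges(bucket):
    """Extent of the bucket's bounding box along R, G and B."""
    return [max(p[i] for p in bucket) - min(p[i] for p in bucket) for i in range(3)]


def split_widest(buckets):
    """Split the bucket with the widest axis in two, in place.

    Returns False when every bucket already holds a single colour.
    """
    best = max(range(len(buckets)), key=lambda i: max(channel_ranges(buckets[i])))
    ranges = channel_ranges(buckets[best])
    axis = ranges.index(max(ranges))
    if ranges[axis] == 0:
        return False
    pixels = sorted(buckets[best], key=lambda p: p[axis])
    mid = len(pixels) // 2
    buckets[best:best + 1] = [pixels[:mid], pixels[mid:]]
    return True


def quantize(pixels, max_colours):
    """Return up to max_colours palette entries, one averaged colour per bucket."""
    buckets = [list(pixels)]
    while len(buckets) < max_colours and split_widest(buckets):
        pass
    return [tuple(sum(p[i] for p in b) // len(b) for i in range(3)) for b in buckets]


demo_pixels = [(0, 0, 0), (10, 0, 0), (200, 0, 0), (255, 0, 0)]
print(quantize(demo_pixels, 2))
```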

The extension to support HDMA I did was to continue splitting buckets, sort of treating the scanline as an additional axis to split on:

-Once the number of buckets is more than the size of the base palette, identify the first and last scanline that each bucket of pixels has
-Across the current list of buckets, identify all potential HDMA candidates by seeing if one bucket's first scanline was greater than some other bucket's last scanline, flagging each one as such
-If the number of buckets that don't get roped in via HDMA is less than the max palette size, then split based on colour, and start again, to re-evaluate potential HDMA candidates (i.e. splitting on colour may have resulted in new HDMA possibilities)
-If we're at the max palette size, though, then over the full list of buckets, find one with a gap of scanlines where a pixel does not exist, and find some other bucket that could be a candidate for hdma, e.g. checking the following


	// check if hdmaCandidate's final scanline is before our first possible scanline
	// this only takes into account bucket pairs that are like so:
	// -|||-------|||----- <- Bucket - split this along the scanline
	// ------|||---------- <- hdmaCandidate?

	// check if hdmaCandidate's first non-gap sequence (that we can see) is within 
	// bucket's gap. this takes into account bucket pairs like so:
	// -|||-------|||----- <- Bucket 
	// ------|||------|||- <- hdmaCandidate?
...if an hdmaCandidate was found, then the bucket being evaluated is split across the scanline. Afterwards, start this loop again.

And repeat until no more hdma candidates are discovered.
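The scanline tests in the steps above can be sketched roughly as follows. This is only an illustration of the span and gap checks, not the real code: buckets are represented just by the scanlines their pixels fall on, and the helper names are hypothetical.

```python
def scanline_span(rows):
    """First and last scanline on which a bucket has pixels."""
    return min(rows), max(rows)


def find_hdma_pairs(spans):
    """Pairs (earlier, later) where 'later' starts after 'earlier' has ended,
    so both buckets could share one palette entry with an HDMA write between."""
    pairs = []
    for a, (_, a_last) in enumerate(spans):
        for b, (b_first, _) in enumerate(spans):
            if a != b and b_first > a_last:
                pairs.append((a, b))
    return pairs


def gaps(rows):
    """Runs of scanlines inside a bucket's span where it has no pixels."""
    rows = sorted(set(rows))
    return [(a + 1, b - 1) for a, b in zip(rows, rows[1:]) if b - a > 1]


def fits_in_gap(candidate_span, gap):
    """True when the candidate's whole span lies inside the given gap."""
    return gap[0] <= candidate_span[0] and candidate_span[1] <= gap[1]


demo_spans = [(0, 10), (20, 30), (5, 25)]
print(find_hdma_pairs(demo_spans))
print(gaps([1, 2, 3, 11, 12, 13]))
```

A bucket with a gap that some candidate fits into is the one that gets split along the scanline axis.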

It's still a little slower than I'd like, in particular due to one chunk of the algo that basically runs in O(n^3) with the number of buckets, but with maxcolours of 256 and maxhdmachannels of 8, I'm able to run it over the Kodak image library at an average of ~1.9 seconds per img (single-threaded; ofc when processing them in bulk, it's trivial to parallelize). Also, the calculation of the colour deltas for buckets could be a lot better: the colour deltas are all computed at 8b per channel, not 5b. That's why even the "0 hdma" case shown below doesn't actually have 256 colours in use. Apparently there were some duplicates after final quantization, so some of the bucket splits must have been redundant, and those splits would have been better spent elsewhere.

The following PSNR values are compared against versions of the source images that were scaled down to the SNES's max height or width, corrected for the different pixel size, and directly quantized to R5G5B5. The colour counts all measure unique quantized colours as well; there are almost assuredly some cases where one colour gets HDMA'd out and then HDMA'd in again, but that still counts as one colour.
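For readers unfamiliar with the metric, a sketch of a PSNR measured in 5-bit space. The details are assumptions, not taken from the actual tool: the reference is truncated to 5 bits per channel, all three channels are pooled into one mean squared error, and the peak value is 31.

```python
import math


def psnr_555(reference_888, output_555):
    """PSNR in dB between a 24bpp reference (truncated to 5 bits per channel)
    and an output image already stored as 5-bit (r, g, b) tuples."""
    squared_error = 0
    samples = 0
    for (r, g, b), (qr, qg, qb) in zip(reference_888, output_555):
        for ref, out in ((r >> 3, qr), (g >> 3, qg), (b >> 3, qb)):
            squared_error += (ref - out) ** 2
            samples += 1
    if squared_error == 0:
        return math.inf  # identical images
    mse = squared_error / samples
    return 10 * math.log10(31 ** 2 / mse)


# One pixel, off by one step in the red channel only.
print(psnr_555([(8, 8, 8)], [(2, 1, 1)]))
```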

Image

Image

and some of the more notable comparisons (left-to-right: 'original', 15bpp quantized original, 8-channel hdma, 0-channel hdma):

#03:
Image Image Image Image

#15:
Image Image Image Image

#23:
Image Image Image Image

If anyone is interested, I'm willing to post the source code up somewhere for perusal. There are some bits of ISPC code, and the aforementioned parallelism is provided via Microsoft's ConcRT library, so it's not super platform-agnostic, but if anyone is interested in a different take on this problem, I'd be happy to oblige.
Attachments
show-hdma-8ch.sfc (2 MiB): Kodak Image slideshow with 8-channel HDMA palette update
show-hdma-1ch.sfc (2 MiB): Kodak Image slideshow with 1-channel HDMA palette update
show-hdma-0ch.sfc (2 MiB): Kodak Image slideshow with no HDMA palette update
Last edited by CypherSignal on Wed Sep 26, 2018 11:13 am, edited 2 times in total.
lidnariq
Posts: 11429
Joined: Sun Apr 13, 2008 11:12 am

Re: Max colour output

Post by lidnariq »

Is the "original" truncated to 15bpp? Or just scaled down to 256x224?
CypherSignal
Posts: 34
Joined: Sun Jul 22, 2018 2:36 pm

Re: Max colour output

Post by CypherSignal »

Those "originals" are still at 24bpp, so you can't directly check my math on that :P I don't have the 15bpp versions handy, as the PSNR-calc just does the quantization against the original at the same time whilst it is comparing against the output image data.
lidnariq
Posts: 11429
Joined: Sun Apr 13, 2008 11:12 am

Re: Max colour output

Post by lidnariq »

I mean, for quantization purposes that's clearly correct. But I was just hoping for something for visual comparison :)
Señor Ventura
Posts: 233
Joined: Sat Aug 20, 2016 3:58 am

Re: Max colour output

Post by Señor Ventura »

Is that 15bpp really possible on the SNES? How many KB does it take up?
CypherSignal
Posts: 34
Joined: Sun Jul 22, 2018 2:36 pm

Re: Max colour output

Post by CypherSignal »

Señor Ventura wrote:Is that 15bpp really possible on the SNES? How many KB does it take up?
The 15bpp images are only provided for reference/comparison, as per lidnariq's request.
Señor Ventura
Posts: 233
Joined: Sat Aug 20, 2016 3:58 am

Re: Max colour output

Post by Señor Ventura »

CypherSignal wrote:
Señor Ventura wrote:Is that 15bpp really possible on the SNES? How many KB does it take up?
The 15bpp images are only provided for reference/comparison, as per lidnariq's request.
So the two that the SNES can actually display are the two on the right, right?