Window shapes demo

UnDisbeliever
Posts: 124
Joined: Mon Mar 02, 2015 1:11 am
Location: Australia (PAL)

Post by UnDisbeliever »

I have written some hard-coded HDMA tables that use a single window to draw some shapes onto the backdrop. I'm going to use them for a Single Window Examples page on the SNES wiki (as requested on Discord).

The demo shows off a hard-coded:
  • Rectangle
  • Trapezium
  • Triangle pointing right
  • Octagon
  • Diamond
  • Circle
Are there any other single-window programmatic shapes that people would like to see on the wiki?


The demo uses a single HDMA channel (in 2-registers write once mode to WH0 and WH1), which does complicate things a bit. Hopefully the comments in the code make sense. Angled lines are calculated using an 8.8 fixed point delta-X per scanline, which should easily translate to 65816 code.
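As a rough illustration of the 8.8 fixed-point stepping described above, here is a Python sketch that builds one (left, right) pair per scanline for a shape with angled sides. The function name `angled_edges` and its parameters are hypothetical and are not taken from the demo's source. It also assumes the shape stays fully onscreen, so no clamping or offscreen handling is done.

```python
def angled_edges(height, left_start, right_start, left_dx, right_dx):
    """Return one (left, right) pair per scanline.

    left_dx and right_dx are signed 8.8 fixed-point deltas per scanline
    (256 == 1.0 pixel, 128 == 0.5 pixel). Hypothetical helper for illustration.
    """
    left = left_start << 8    # promote the integer positions to 8.8 fixed point
    right = right_start << 8
    rows = []
    for _ in range(height):
        # The high byte of each 8.8 value is the pixel position for this scanline.
        rows.append((left >> 8, right >> 8))
        left += left_dx
        right += right_dx
    return rows


# A trapezium that widens by half a pixel per side per scanline.
trapezium = angled_edges(4, 100, 150, -128, 128)
```

On the 65816 the same idea would be a 16-bit add per edge per scanline, storing only the high byte of each result into the table.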

The window effect is achieved using colour math addition with a fixed-colour source: colours are clipped to black inside the window and colour math is disabled outside the window. The result is a fixed-colour window (0 + fixed-colour = fixed-colour).


Source code on GitHub (MIT Licensed)
Attachments
window-shapes-single.jpeg
window-shapes-single.sfc
(128 KiB)
jeffythedragonslayer
Posts: 344
Joined: Thu Dec 09, 2021 12:29 pm

Post by jeffythedragonslayer »

Cool. I'd like to see the five-pointed star and ellipse.
iNCEPTIONAL

Post by iNCEPTIONAL »

I don't know what a programmatic shape is, but how about a silhouette of a pawn chess piece, à la:
59-595676_silhouette-chess-piece-remix-pawn-pen-clip-arts.png
And the famous optical illusion:
Rubins-vase-sometimes-referred-to-as-The-Two-Face-One-Vase-Illusion-depicts-the_Q640.jpg
It's kinda crazy that those can be achieved with just one window/shape mask. So you have to wonder just what complex shapes and images could actually be drawn with two....

That's why I really do think of the window/shape masks like having two simple extra layers to play with, because you could have say two simple human silhouettes "walking" past very close to the camera in front of everything in some kind of crowd scene or something like that for extra parallax, or a couple of additional simple white or black chess pieces on a giant chess board using multiple background layers and sprites to have all the pieces scrolling past in perspective and parallax (a bit like that one level in Battletoads on SNES but with stuff right next to the camera too), and that kind of thing.

The other interesting thing for me about using the fixed colour is that it could be changed on every scanline to give that chess piece the illusion of some 3D volume, and hard shadows in places too, along the y axis.

Thanks for creating the wiki example by the way.

Post by jeffythedragonslayer »

"Programmatic shape" sounds like software is generating the shape on the fly and it's not hard-coded. That would be cool for shapes that change their shape or size over time or in response to in-game events.

Post by UnDisbeliever »

jeffythedragonslayer wrote: Thu Jul 21, 2022 11:14 pm Cool. I'd like to see the five-pointed star and ellipse.
A non-rotated ellipse is easy, for a pre-calculated HDMA table. Multiply the left-right offset of the circle by an x-scale. If you look at my source code you'll see the formula I used for a circle, which includes an x_scale variable (which I used to aspect-correct the circle).
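To make the x-scale idea concrete, here is a small Python sketch of a pre-calculated circle/ellipse table. The circle half-width comes from Pythagoras, which is my assumption; the exact formula in the demo's source may differ. `ellipse_rows` and `centre_x` are hypothetical names, and the sketch assumes the shape is fully onscreen.

```python
import math


def ellipse_rows(centre_x, radius, x_scale):
    """Return one (left, right) pair per scanline for a non-rotated ellipse.

    x_scale == 1.0 gives a circle in pixel units; other values stretch or
    squash it horizontally. Hypothetical helper, not the demo's actual code.
    """
    rows = []
    for y in range(-radius, radius + 1):
        # Half-width of a circle at this row, then scaled along the x axis.
        half_width = round(math.sqrt(radius * radius - y * y) * x_scale)
        rows.append((centre_x - half_width, centre_x + half_width))
    return rows


circle = ellipse_rows(128, 10, 1.0)
wide_ellipse = ellipse_rows(128, 10, 2.0)
```

Both tables have the same height; only the left/right offsets from the centre differ.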

A five-pointed star is going to require two windows. I'm not going to draw one at the moment, but I will include a star when I start working on the two-window demos.


iNCEPTIONAL wrote: Fri Jul 22, 2022 12:48 am I don't know what a programmatic shape is, but how about a silhouette of a pawn chess piece, à la.
In this context, a programmatic shape is a shape that can be created using a simple formula or algorithm. Generating the window table takes CPU time and the simpler you can make the code, the faster it will run.

The last time I experimented with windows I used ~10-30% of a frame's CPU time to draw a circle (which is perfectly fine for a fadeout or a sparse level) and ~40-65% CPU to draw a large triangle (which is way too much for game-play, but perfectly fine for a title-screen or cutscene).

For more complicated shapes, like the pawn chess piece, it is better to use a pre-calculated table. It will take up more ROM space, especially if you want to scale or animate the window. Translating (moving) the pre-calculated tables will cost CPU time, I'm guessing < 20% CPU time for a single 200 scanline window. Won't know for sure until I code one.

Another advantage of pre-calculated tables is that they can be built from image files. A programmer can write a simple program that takes an image file, computes the left/right position of each scanline and outputs table data for a game.
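A minimal sketch of such a converter, in Python. To keep it self-contained it reads the "image" as a list of strings rather than a real image file, and it assumes each row contains at most one contiguous run of pixels (anything else cannot be drawn with a single window). `rows_from_mask` is a hypothetical name. Empty rows are written as left > right, which makes the window cover nothing on that scanline.

```python
EMPTY_ROW = (255, 0)  # left > right: the window covers no pixels on this scanline


def rows_from_mask(mask):
    """Convert a 1-bit mask (list of strings, ' ' == transparent) into
    one (left, right) pair per scanline. Hypothetical helper for illustration."""
    table = []
    for row in mask:
        xs = [x for x, pixel in enumerate(row) if pixel != " "]
        if xs:
            table.append((xs[0], xs[-1]))  # first and last set pixel in the row
        else:
            table.append(EMPTY_ROW)
    return table


mask = [
    "  ##  ",
    " #### ",
    "      ",
    "######",
]
table = rows_from_mask(mask)
```

A real tool would load a PNG and write the pairs out as assembler data, but the per-row scan would look much the same.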

After I've written the wiki page, I'll work on a translated window demo (using chess pieces as my window source). I'm curious to see how much CPU time it would consume.
93143
Posts: 1717
Joined: Fri Jul 04, 2014 9:31 pm

Post by 93143 »

UnDisbeliever wrote: Fri Jul 22, 2022 3:02 am ~40-65% CPU to draw a large triangle
How? Isn't unrolled Bresenham reasonably fast?

Post by UnDisbeliever »

93143 wrote: Fri Jul 22, 2022 3:29 am
UnDisbeliever wrote: Fri Jul 22, 2022 3:02 am ~40-65% CPU to draw a large triangle
How? Isn't unrolled Bresenham reasonably fast?
I never got around to fully optimising the code. There's no loop unrolling and there may be some potential optimisations I missed. I based the code off Bresenham's line drawing algorithm and it can be found here. Most of the slowdown comes from using 16 bit variables for the left/right position and from clamping 16-bit values to 8-bit values (to correctly draw partially-offscreen windows).


For reference, the following triangle uses 49.86% CPU on SlowROM and 42.10% CPU on FastROM according to Mesen-S's Performance Profiler.
Attachments
hdma-triangular-window-test.png (5.01 KiB)

Post by UnDisbeliever »

I forgot to include a window-is-offscreen test in the demo. Without it, a 1-pixel line can appear on the left or right side of the screen.

The test is a simple one; if the calculated left > 255 or calculated right < 0, then the window is considered offscreen.
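Here is a small Python sketch of that test, next to a naive clamp-only version to show where the 1-pixel line comes from. Both function names are hypothetical. It relies on the fact that a window with left > right covers nothing.

```python
DISABLED = (255, 0)  # left > right: the window covers no pixels


def clamp_only(left, right):
    """Naive version: clamp both edges to 0..255 with no offscreen test."""
    return (max(0, min(255, left)), max(0, min(255, right)))


def clip_window(left, right):
    """Offscreen test first, then clamp the edges that remain onscreen."""
    if left > 255 or right < 0:
        return DISABLED
    return (max(left, 0), min(right, 255))


# A window entirely to the left of the screen:
glitched = clamp_only(-20, -5)   # (0, 0) is a 1-pixel-wide window at x = 0
fixed = clip_window(-20, -5)     # the window is disabled on this scanline
```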

I have decided against including optimised line-drawing pseudo-code on the Drawing window shapes wiki page, because adding the offscreen test complicates the code a lot. I will include some basic line-drawing pseudo-code so people can understand how the HDMA table is built; it just won't be optimised for the SNES.

window-shapes-single-v2.sfc
(128 KiB)
Source code on GitHub (MIT Licensed)

before
after

Post by UnDisbeliever »

I have coded a precalculated horizontally-symmetrical single-window demo.

This demo draws a double-buffered HDMA table from a list of half-width bytes with offscreen testing and clamping.

To correctly perform the 8-bit clamping and offscreen tests, the DrawSymmetricalWindow subroutine is split into three code paths:
  • 0 <= xCentre <= 255: No offscreen test is required. The left position is clamped to 0 and the right position is clamped to 255.
  • xCentre < 0: The window could be offscreen, but only on the left side. Only the right position is calculated; if it is >= 0 the window is onscreen (left = 0). If xCentre is > -256, the calculation can be performed with an 8-bit A and the carry flag can be used to test if the window is offscreen or not (no comparisons required). No clamping is required.
  • xCentre > 255: The window may be offscreen on the right side. Only the left position is calculated; if it is < 256 the window is onscreen (right = 255). Like the previous code path, the offscreen test can be performed by checking the carry flag. No clamping is required.
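The three code paths can be sketched in Python as follows. This is only a model of the logic, not a translation of the DrawSymmetricalWindow subroutine: it uses plain integers, so it does not show the 8-bit accumulator or carry-flag tricks, and `draw_symmetrical_window` is a hypothetical name. It assumes every half-width is in the range 0..255.

```python
DISABLED = (255, 0)  # left > right: the window covers no pixels


def draw_symmetrical_window(x_centre, half_widths):
    """Return one (left, right) pair per scanline for a horizontally
    symmetrical shape described by a list of half-width bytes."""
    rows = []
    for hw in half_widths:
        if 0 <= x_centre <= 255:
            # Centre is onscreen: never fully offscreen, so only clamp.
            rows.append((max(x_centre - hw, 0), min(x_centre + hw, 255)))
        elif x_centre < 0:
            # Only the right edge can be onscreen; it can never exceed 254.
            right = x_centre + hw
            rows.append((0, right) if right >= 0 else DISABLED)
        else:
            # x_centre > 255: only the left edge can be onscreen; it is always >= 1.
            left = x_centre - hw
            rows.append((left, 255) if left < 256 else DISABLED)
    return rows
```

For example, with a centre at -5 a row with half-width 3 is offscreen, while a row with half-width 10 becomes (0, 5).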
This demo uses ~28 scanlines (~10.7%) of CPU time when the 140 scanline tall window is completely onscreen (according to Mesen-S's Performance Profiler). The maximum CPU time profiled was ~28.7 scanlines (~11%).

I think this is the first time I have used the WMDATA register outside of memset and memcopy.

Source code on GitHub (MIT Licensed)
Attachments
window-precalculated-symmetrical.zip
(1.38 KiB)
window-precalculated-symmetrical.jpeg (10.37 KiB)
none
Posts: 117
Joined: Thu Sep 03, 2020 1:09 am

Post by none »

I have an implementation here that uses hardware acceleration to move a shape (drawn with the two windows) both vertically and horizontally with clipping, and it does not need to double-buffer the HDMA table.

https://github.com/rmn0/rem/blob/featur ... c/window.s

If the window clips at the top, it seeks in the HDMA table to find the first visible scanline (this step can be skipped if you have an entry in the table for every row, but this one doesn't to save on ROM space). Otherwise, it moves the window down by starting HDMA late with IRQ.

For horizontal movement with clipping, it uses indirect HDMA where the indirect table is a 256 byte buffer that contains a mapping from table to screen coordinate. This mapping can be generated with DMA memcpy and memset, basically.

It's a lot faster than scrolling / clipping in software because DMA is a lot faster than the CPU, and also the four window coordinates can share a single indirect table (I think it was around 2~3 scanlines, but I forgot the exact numbers; it was a while ago).
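One way to picture the 256-byte mapping, sketched in Python. This is a reading of the description above, not a port of the linked code, and the names `RAMP` and `build_mapping` are hypothetical. It assumes a constant 0..255 ramp is available to copy from, and that the mapping for a horizontal offset is simply clamp(t + offset, 0, 255). The slice stands in for the DMA memcpy and the constant run stands in for the DMA memset. It only clamps; how fully offscreen rows are handled is not modelled here.

```python
RAMP = bytes(range(256))  # assumed constant identity table 0, 1, ..., 255


def build_mapping(x_offset):
    """Return a 256-byte table where mapping[t] == clamp(t + x_offset, 0, 255)."""
    if x_offset >= 0:
        n = max(256 - x_offset, 0)                 # entries that stay below 255
        copied = RAMP[x_offset:x_offset + n]       # "memcpy" part of the ramp
        return copied + bytes([255]) * (256 - n)   # "memset" the rest to 255
    n = max(256 + x_offset, 0)                     # entries that stay above 0
    return bytes(256 - n) + RAMP[:n]               # "memset" zeros, then "memcpy"


mapping = build_mapping(3)
```

Looking up a table coordinate in `mapping` then gives the translated, clamped screen coordinate without any per-scanline CPU work.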

Post by UnDisbeliever »

none wrote: Wed Aug 10, 2022 1:20 am For horizontal movement with clipping, it uses indirect HDMA where the indirect table is a 256 byte buffer that contains a mapping from table to screen coordinate. This mapping can be generated with DMA memcpy and memset, basically.
That is ingenious. I would never have thought of using indirect HDMA for byte mapping.


I would like to add this technique to the SNES wiki (sometime next month, when I get around to writing an Indirect HDMA examples page). What kind of credit would you like me to give? Also, would it be OK to add a link to your code on the wiki?

Post by none »

Sure, go ahead. You can just link to the git repository; that's fine with me, but you should probably use a permalink.

https://github.com/rmn0/rem/blob/f496e1 ... c/window.s

Post by UnDisbeliever »

I have coded a precalculated non-symmetrical single-window demo.

This one is a little simpler than the previous demo. There are only two code paths: one for a negative x-offset (moving left) and another for a positive x-offset (moving right). Splitting the code in two allowed me to perform all horizontal calculations with an 8-bit accumulator.

The offscreen test is required for both code paths. Since the x-axis direction is known, only one offscreen test is required per scanline: right < 0 when moving left, and left > 255 when moving right.
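A Python sketch of the two code paths, again as a model of the logic rather than the demo's actual 65816 code. `translate_window` is a hypothetical name, and the sketch assumes every row of the source table has 0 <= left <= right <= 255.

```python
DISABLED = (255, 0)  # left > right: the window covers no pixels


def translate_window(table, x_offset):
    """Shift a pre-calculated list of (left, right) pairs horizontally,
    with one offscreen test per scanline and clamping of the other edge."""
    out = []
    if x_offset < 0:
        # Moving left: only the right edge can fall off the left side.
        for left, right in table:
            new_right = right + x_offset
            if new_right < 0:
                out.append(DISABLED)
            else:
                out.append((max(left + x_offset, 0), new_right))
    else:
        # Moving right: only the left edge can fall off the right side.
        for left, right in table:
            new_left = left + x_offset
            if new_left > 255:
                out.append(DISABLED)
            else:
                out.append((new_left, min(right + x_offset, 255)))
    return out


shape = [(10, 20), (100, 200)]
moved_left = translate_window(shape, -25)
moved_right = translate_window(shape, 160)
```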

This is what happens if the offscreen test is not implemented (a 1pixel wide glitched window on the left side of the screen):
no-offscreen-test-left.jpeg (7.13 KiB)


For those interested, this demo uses ~29.5 scanlines (11.25% CPU) for a positive x-offset and ~32.25 scanlines (12.4% CPU) for a negative x-offset (as profiled in Mesen-S when the 140 scanline tall window is fully onscreen).

Source code on GitHub (MIT Licensed)
Attachments
window-precalculated-single.jpeg (8.63 KiB)
window-precalculated-single.zip
(1.44 KiB)