What's more processor intensive (if that's the correct terminology) . . .

iNCEPTIONAL

What's more processor intensive (if that's the correct terminology) . . .

Post by iNCEPTIONAL »

Which is more processor intensive: actually scrolling part of a background layer, or cycling the palette to make it look like it's scrolling (imagine a horizontal stream of water flowing across the screen about two thirds of the way down the view)?

Also, imagine I'm doing this multiple times in the view, so changing the scroll speed of the water, say, every 8-16 scanlines for a parallax effect, or having multiple parts of the scene with a few of the palettes cycling for waterfalls, streams, etc.

For a simple example of what I'm talking about: https://youtu.be/r1crAucc1oc?t=777
Last edited by iNCEPTIONAL on Tue Jun 07, 2022 4:22 pm, edited 1 time in total.
Myself086
Posts: 158
Joined: Sat Nov 10, 2018 2:49 pm

Re: What's more processor intensive (if that's the correct terminology) . . .

Post by Myself086 »

Cycling the palette can be done by having 2 copies of your cycling palette and DMA at a different point each frame.

I'd say they're both about 100 cycles per frame (~0.2%).
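
For anyone unsure what "two copies" buys you, here is a rough model in Python (not SNES code). The palette values and the names `PALETTE`, `DOUBLED` and `dma_window` are made up for illustration. The idea it sketches is that, with the cycle stored twice back to back, every rotation of the palette is a contiguous run, so a single DMA with a moving start offset is enough and nothing has to be shuffled in RAM.

```python
# Model of the "two copies + moving DMA start" trick (not SNES code).
# PALETTE is a hypothetical 4-colour water cycle in 15-bit BGR.
PALETTE = [0x7FFF, 0x7E10, 0x7C00, 0x5400]

# Two copies stored back to back, as they might sit in ROM or WRAM.
DOUBLED = PALETTE + PALETTE


def dma_window(frame):
    """Return the colours a DMA starting at this frame's offset would copy."""
    start = frame % len(PALETTE)          # source offset advances each frame
    return DOUBLED[start:start + len(PALETTE)]


for frame in range(5):
    print(frame, [hex(c) for c in dma_window(frame)])
```

On real hardware the only per-frame work would be computing the new source address and kicking off the transfer, which is presumably where the small cycle estimate comes from.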
Oziphantom
Posts: 1565
Joined: Tue Feb 07, 2017 2:03 am

Re: What's more processor intensive (if that's the correct terminology) . . .

Post by Oziphantom »

Changing the H scroll with HDMA is trivial and takes "nothing" to update.
jeffythedragonslayer
Posts: 344
Joined: Thu Dec 09, 2021 12:29 pm

Re: What's more processor intensive (if that's the correct terminology) . . .

Post by jeffythedragonslayer »

"Processor intensive" is fine terminology.
Myself086 wrote: Tue Jun 07, 2022 7:47 am Cycling the palette can be done by having 2 copies of your cycling palette and DMA at a different point each frame.

I'd say they're both about 100 cycles per frame (~0.2%).
"Both" is referring to each of the cycling palettes, not to both techniques, correct?
iNCEPTIONAL

Re: What's more processor intensive (if that's the correct terminology) . . .

Post by iNCEPTIONAL »

OK, so both methods are pretty un-taxing, but scrolling at different speeds on a background [let's say every 8-pixel row] is maybe a tiny bit less taxing than cycling a few palettes?

The reason I ask is because I'm currently designing quite a bit of line/row scrolling in each of the backgrounds in my game, and I just want to make sure it might not be better to do some of the scrolling via palette swaps instead (like on the water layer), just to keep any tax on the SNES's processor to a minimum and avoid any potential slowdown.

If you look at the ice level you can see how much line/row scrolling I'm doing on each of the layers, and this is before I do any code for the player, enemies, bullets and whatever other stuff, so just making sure I'm all good: https://youtu.be/ytNUcBYcEuI
turboxray
Posts: 348
Joined: Thu Oct 31, 2019 12:56 am

Re: What's more processor intensive (if that's the correct terminology) . . .

Post by turboxray »

Myself086 wrote: Tue Jun 07, 2022 7:47 am Cycling the palette can be done by having 2 copies of your cycling palette and DMA at a different point each frame.

I'd say they're both about 100 cycles per frame (~0.2%).
But he said every 16 lines (and gave the Dead Moon video as an example). 224 / 16 = 14 updates, and 100 cycles / 14 updates ≈ 7 cycles each. It's definitely more than 7 cycles per update. You need to add an offset to each line's entry to get a scroll speed (usually a fixed-point value) and store it back in the table. And there's the cost of the HDMA itself when it runs for that line. That also assumes there is no "relative" vertical scrolling, i.e. that the vertical scroll is fixed. If it's not, then you have to offset (recalculate) the HDMA table to accommodate it (which line each entry is triggered on), which costs more cycles still (and Dead Moon does do this). I'm not saying it's breaking the bank, but it's more than 100 cycles. Probably ~2% of the CPU time for the frame, if you're being conservative with the estimate.
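
To make the per-update work concrete, here is a small Python model of the bookkeeping described above (a sketch, not SNES code). The band height, the speed values and the names `speeds`, `offsets` and `update_scroll_table` are all hypothetical. It assumes 8.8 fixed-point offsets and a fixed vertical scroll, so the table layout never changes.

```python
# Model of a per-band parallax update using 8.8 fixed-point offsets.
# Band height and speeds are hypothetical values, not taken from any game.
SCREEN_LINES = 224
BAND_HEIGHT = 16
BANDS = SCREEN_LINES // BAND_HEIGHT        # 14 bands, as in the post above

# Speed per band in 1/256ths of a pixel per frame (8.8 fixed point).
speeds = [0x0040 + 0x0020 * band for band in range(BANDS)]
offsets = [0] * BANDS                      # running 8.8 scroll position per band


def update_scroll_table():
    """Advance every band once and return HDMA-style (line count, scroll) pairs."""
    table = []
    for band in range(BANDS):
        offsets[band] = (offsets[band] + speeds[band]) & 0xFFFF
        table.append((BAND_HEIGHT, offsets[band] >> 8))   # keep whole pixels only
    return table


for frame in range(4):
    table = update_scroll_table()
print(table[0], table[-1])
```

Each band costs one add, one store and one table write per frame, which is why the total lands well above 7 cycles per update even before any vertical-scroll adjustment.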
93143
Posts: 1717
Joined: Fri Jul 04, 2014 9:31 pm

Re: What's more processor intensive (if that's the correct terminology) . . .

Post by 93143 »

Linescroll via palette swap would be quite limiting, particularly in Mode 0. You'd be stuck with snapping to whatever granularity you can map 3 colours to (meaning there are only 3 possible scroll positions), and you'd need to reserve a palette for each scroll speed. It might work okay with 4bpp graphics (particularly on a PC Engine where there are 16 BG and 16 sprite palettes instead of 8 and 8 like on SNES), but you're using 2bpp so...

Linescroll via HDMA is far more flexible and still very cheap. I don't think this is something where you should really be worried about slowdown. It's nothing compared to running AI and collisions.

For waterfalls and such, you could in principle use vertical scroll with windowing, but the non-silly options are probably palette cycling and tile animation, with the latter being more expensive because it takes a nontrivial amount of DMA. It's still only about 0.3% of a frame per updated 2bpp tile, but since you can't DMA during active display, the relevant number is actually ~2% of VBlank per 2bpp tile.
iNCEPTIONAL

Re: What's more processor intensive (if that's the correct terminology) . . .

Post by iNCEPTIONAL »

93143 wrote: Tue Jun 07, 2022 4:21 pm Linescroll via palette swap would be quite limiting, particularly in Mode 0. You'd be stuck with snapping to whatever granularity you can map 3 colours to (meaning there are only 3 possible scroll positions), and you'd need to reserve a palette for each scroll speed. It might work okay with 4bpp graphics (particularly on a PC Engine where there are 16 BG and 16 sprite palettes instead of 8 and 8 like on SNES), but you're using 2bpp so...

Linescroll via HDMA is far more flexible and still very cheap. I don't think this is something where you should really be worried about slowdown. It's nothing compared to running AI and collisions.

For waterfalls and such, you could in principle use vertical scroll with windowing, but the non-silly options are probably palette cycling and tile animation, with the latter being more expensive because it takes a nontrivial amount of DMA. It's still only about 0.3% of a frame per updated 2bpp tile, but since you can't DMA during active display, the relevant number is actually ~2% of VBlank per 2bpp tile.
Well, it's not that I worry the row/line scrolling would cause slowdown in and of itself but more that I'm doing more of it than I see in most SNES games, so I just want to make sure I'm not at risk of hogging up too much of the processor resources doing a bunch of row/line scrolling on each of the four backgrounds only to find it causes a problem when I then get to all the stuff I want to do with sprites and the like.

It's just hard for me to get a sense of how much of the available processor resources I'd have to use up for all the background row/line scrolling that I currently have in my levels (often along with some background priority setting, animated tiles and colour math too).
93143
Posts: 1717
Joined: Fri Jul 04, 2014 9:31 pm

Re: What's more processor intensive (if that's the correct terminology) . . .

Post by 93143 »

Since this effect is pretty static, you could probably hardcode the frame deltas and unroll the loop (using two instances to allow double buffering of the HDMA table). If the scroll speed is constant and the pattern can be allowed to loop at 256 pixels, I believe you could do a separate scroll value for every line on the bottom half of the screen (112 x-scroll values) in about 4% of the frame (or a bit less than 6% including the actual HDMA).

Code:

; 16-bit A/mem, 8-bit X/Y
; xoff is 8.8 fixed-point, in direct page
; htable is somewhere else, using absolute addressing
	lda #dxoff	; 3 fast
	clc		; 2 fast
	adc xoff	; 2 fast, 2 slow
	sta xoff	; 2 fast, 2 slow
	ldx xoff+1	; 2 fast, 1 slow
	stx htable	; 3 fast, 1 slow
; 14 fast, 6 slow = 132 master cycles

Of course, needing larger integer x-scroll values, or varying scroll speeds, would complicate this a fair bit. I figure you could still linescroll every single line on the bottom half of the screen in about 7% of the frame (or a little over 8% including the HDMA). I haven't tested this, though, and there might be precision issues with high scroll speeds:

Code:

; 16-bit A/mem, 8-bit X/Y
; xoff is 16.8 fixed-point, in direct page; DP has to change once during the unrolled loop
; $4204/5 contains the raw value of dxoff before scaling, which is constant for the frame
	ldx #scale	; 2 fast
	lda $4214	; 5 fast - access to divider is interleaved to avoid waiting
	stx $4206	; 4 fast
	clc		; 2 fast
	adc xoff	; 2 fast, 2 slow
	sta xoff	; 2 fast, 2 slow
	lda xoff+1	; 2 fast, 2 slow
	bcc +		; 2 or 3 fast
	adc #$00FF	; 3 fast
	sta xoff+1	; 2 fast, 2 slow
+	sta htable	; 3 fast, 2 slow

; 25 or 29 fast, 8 or 10 slow = 214 or 254 master cycles (usually 214)

There's probably a way to trade ROM for better speed when doing this. It might depend on the application. Obviously doing this once every 16 lines is going to be substantially cheaper; the second method would then use about half a percent of an NTSC frame.
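
The percentages above can be reproduced with a few lines of Python. This is only a sanity check under stated assumptions: 6 master cycles per fast access, 8 per slow access, and an NTSC frame taken as 262 scanlines of 1364 master cycles (small per-frame variations are ignored). The "every 16 lines" case assumes the same bottom-half region, i.e. 7 updates. The function names are made up.

```python
# Frame-budget arithmetic for the two snippets above.
# Assumes 6 master cycles per fast access, 8 per slow access, and an NTSC
# frame of 262 scanlines x 1364 master cycles (minor variations ignored).
FAST, SLOW = 6, 8
FRAME = 262 * 1364                         # 357,368 master cycles


def master_cycles(fast, slow):
    """Convert fast/slow access counts into master cycles."""
    return fast * FAST + slow * SLOW


def frame_percent(cycles_per_update, updates):
    """Share of one NTSC frame used by the given number of updates."""
    return 100.0 * cycles_per_update * updates / FRAME


simple = master_cycles(14, 6)              # first snippet
scaled = master_cycles(25, 8)              # second snippet, usual (branch taken) path

print(simple, round(frame_percent(simple, 112), 2))   # every line, bottom half
print(scaled, round(frame_percent(scaled, 112), 2))
print(round(frame_percent(scaled, 112 // 16), 2))     # one update per 16 lines
```

Under those assumptions the figures come out at roughly 4.1%, 6.7% and 0.4% of a frame, in line with the "about 4%", "about 7%" and "about half a percent" estimates. HDMA transfer time is not included.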

Something like what DKC2 does is more complicated still, because the camera can move in the vertical axis. I think I'll leave that as an exercise for the reader, at least for now...
Oziphantom
Posts: 1565
Joined: Tue Feb 07, 2017 2:03 am

Re: What's more processor intensive (if that's the correct terminology) . . .

Post by Oziphantom »

While you could precalculate all the shifted palettes, it's probably better to just shuffle them in the CGRAM mirror. As there are multiple palettes to cycle to get different water speeds, you will still have to do the maths/counters to know when to update each one, only then you will also have to shuffle the CGRAM mirror and DMA it.
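
As a rough picture of that bookkeeping, here is a Python model (not SNES code) with one counter per cycling palette range. The colour indices, lengths and periods in `CYCLES`, and the names `cgram_mirror`, `counters` and `tick`, are invented for the example. The mirror here is just a list standing in for the 256 colour words kept in RAM.

```python
# Model of cycling several palettes at different rates in a CGRAM mirror.
# Palette indices, lengths and periods below are made-up illustration values.
cgram_mirror = list(range(256))            # stand-in for 256 colour words in RAM

# (first colour index, number of colours, frames between rotations)
CYCLES = [(1, 3, 2), (5, 3, 4), (9, 3, 8)]
counters = [0] * len(CYCLES)


def tick():
    """Run once per frame; returns True if the mirror changed and needs a DMA."""
    dirty = False
    for i, (start, length, period) in enumerate(CYCLES):
        counters[i] += 1
        if counters[i] < period:
            continue
        counters[i] = 0
        span = cgram_mirror[start:start + length]
        cgram_mirror[start:start + length] = span[1:] + span[:1]   # rotate by one
        dirty = True
    return dirty


for frame in range(8):
    if tick():
        print("frame", frame, "upload", cgram_mirror[1:12])
```

The counters and the shuffle are extra CPU work on top of the DMA, which is the point being made: palette cycling at several speeds is not free bookkeeping either.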
jeffythedragonslayer
Posts: 344
Joined: Thu Dec 09, 2021 12:29 pm

Re: What's more processor intensive (if that's the correct terminology) . . .

Post by jeffythedragonslayer »

Another way to do these palette swaps without cycling CGRAM is to actually change the palette bits for the animated tiles.
Oziphantom
Posts: 1565
Joined: Tue Feb 07, 2017 2:03 am

Re: What's more processor intensive (if that's the correct terminology) . . .

Post by Oziphantom »

Sure, but that is a lot more expensive ;)
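
A quick Python model shows where the expense comes from (again a sketch, not SNES code; `set_palette` and `recolour` are hypothetical helper names). SNES BG tilemap entries are 16-bit words laid out as vhopppcc cccccccc, with the palette number in bits 10-12, so switching palettes this way means rewriting and re-uploading one word for every animated tile.

```python
# Model of recolouring tiles by rewriting the palette field of tilemap entries.
# SNES BG tilemap entries are 16 bits: vhopppcc cccccccc, palette in bits 10-12.
PALETTE_SHIFT = 10
PALETTE_MASK = 0x7 << PALETTE_SHIFT


def set_palette(entry, palette):
    """Return the tilemap entry with its 3-bit palette field replaced."""
    return (entry & ~PALETTE_MASK & 0xFFFF) | ((palette & 0x7) << PALETTE_SHIFT)


def recolour(tilemap, indices, palette):
    """Rewrite every listed tilemap entry; cost grows with the number of tiles."""
    for i in indices:
        tilemap[i] = set_palette(tilemap[i], palette)
    return len(indices)                    # words that now need uploading to VRAM


tilemap = [0x0123] * 32                    # one hypothetical row of water tiles
changed = recolour(tilemap, range(32), 5)
print(changed, hex(tilemap[0]))
```

Compare that with rotating a handful of colours in CGRAM: the palette-bit route touches one word per tile on screen, while the CGRAM route touches only the colours themselves.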