What's more processor intensive (if that's the correct terminology) . . .
Actually scrolling part of a background layer or cycling the palette to make it look like it's scrolling (imagine it's a horizontal stream of water flowing across the screen about two thirds of the way down the view)?
Also, imagine I'm doing this multiple times in the view: changing the scroll speed of the water, say, every 8-16 scanlines for a parallax effect, or having multiple parts of the scene with a few of the palettes cycling for waterfalls, streams, etc.
For a simple example of what I'm talking about: https://youtu.be/r1crAucc1oc?t=777
Last edited by iNCEPTIONAL on Tue Jun 07, 2022 4:22 pm, edited 1 time in total.
Re: What's more processor intensive (if that's the correct terminology) . . .
Cycling the palette can be done by keeping 2 copies of your cycling palette back to back and starting the DMA from a different point each frame.
I'd say they're both about 100 cycles per frame (~0.2%).
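To illustrate the two-copy trick: with the cycling colours stored twice back to back, every rotation is a contiguous run, so a single copy with a moving start offset is enough. A minimal Python sketch of the idea follows. The names `make_doubled` and `cycled_window` are hypothetical, the colour values are placeholders, and on hardware the list slice would be a DMA into CGRAM.

```python
def make_doubled(colors):
    """Store the cycling colours twice, back to back (hypothetical helper)."""
    return list(colors) + list(colors)

def cycled_window(doubled, frame):
    """Return the contiguous run that would be copied on this frame."""
    n = len(doubled) // 2
    start = frame % n              # the start offset moves by one entry per frame
    return doubled[start:start + n]

water = [0x7C00, 0x7E00, 0x7F00]   # placeholder 15-bit colour values
doubled = make_doubled(water)

for frame in range(4):
    print(frame, [hex(c) for c in cycled_window(doubled, frame)])
```

No per-colour shuffling is needed; the only work per frame is choosing the start offset.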
Re: What's more processor intensive (if that's the correct terminology) . . .
Changing the H scroll with HDMA is trivial and takes "nothing" to update.
- jeffythedragonslayer
Re: What's more processor intensive (if that's the correct terminology) . . .
"Processor intensive" is fine terminology.
"both" is referring to each of the cycling palettes, not both techniques correct?
Re: What's more processor intensive (if that's the correct terminology) . . .
OK, so both methods are pretty un-taxing, but [let's say every 8-pixel row] scrolling at different speeds on a background is maybe a tiny bit less taxing than cycling a few palettes?
The reason I ask is because I'm currently designing quite a bit of line/row scrolling in each of the backgrounds in my game and I just want to make sure it might not be better to do some of the scrolling via palette swaps instead (like on the water layer), just to keep any tax on the SNES' processor to the minimum and avoid any potential slowdown.
If you look at the ice level you can see how much line/row scrolling I'm doing on each of the layers, and this is before I do any code for the player, enemies, bullets and whatever other stuff, so just making sure I'm all good: https://youtu.be/ytNUcBYcEuI
Re: What's more processor intensive (if that's the correct terminology) . . .
But he said every 16 lines (and gave the Dead Moon video as an example). 224 / 16 = 14 updates. 100 / 14 updates = ~7 cycles. It's definitely more than 7 cycles per update. You need to add an offset to each line to get a scroll speed (usually a fixed-point value) and store it back in the table. And there's the cost of HDMA itself when it runs for that line. And that assumes there is no "relative" scrolling vertically, i.e. that the vertical scroll is fixed. If it's not, then you have to offset (recalculate) the HDMA table to accommodate it (which line each entry is called for), which also costs more cycles (and Dead Moon does do this). I'm not saying it's breaking the bank, but it's more than 100 cycles. Probably ~2% of the frame's CPU time, if you're being conservative with the estimate.
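The per-band work described above (add a fixed-point speed to each band's offset, then store the integer part back into the table) can be sketched in Python as follows. This is only an illustration under assumptions: `advance_bands` and `build_hdma_table` are hypothetical names, the offsets are 8.8 fixed-point, the vertical scroll is fixed, and the table layout (a line-count byte followed by two data bytes per entry, ending with a zero count) is what I would expect for a two-byte write to a horizontal scroll register.

```python
def advance_bands(offsets, speeds):
    """Add each band's 8.8 fixed-point speed to its offset, wrapping at 16 bits."""
    return [(off + spd) & 0xFFFF for off, spd in zip(offsets, speeds)]

def build_hdma_table(offsets, lines_per_band):
    """Emit (line count, scroll low, scroll high) per band, then a 0 terminator.

    Assumes lines_per_band is in 1..127 and the vertical scroll never changes.
    """
    table = bytearray()
    for off in offsets:
        pixels = off >> 8                      # integer part of the 8.8 offset
        table += bytes([lines_per_band, pixels & 0xFF, (pixels >> 8) & 0xFF])
    table.append(0)                            # a zero line count ends the table
    return bytes(table)

# 14 bands of 16 lines; each band scrolls a quarter pixel per frame faster than the last
speeds = [0x0040 * (i + 1) for i in range(14)]
offsets = [0] * 14
for _ in range(4):                             # simulate four frames
    offsets = advance_bands(offsets, speeds)
table = build_hdma_table(offsets, 16)
print(len(table), table[:6].hex())
```

Even in this simplified form, each band costs an add, a store and a table write per frame, which is why a budget of about 7 cycles per update looks too low.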
Re: What's more processor intensive (if that's the correct terminology) . . .
Linescroll via palette swap would be quite limiting, particularly in Mode 0. You'd be stuck with snapping to whatever granularity you can map 3 colours to (meaning there are only 3 possible scroll positions), and you'd need to reserve a palette for each scroll speed. It might work okay with 4bpp graphics (particularly on a PC Engine where there are 16 BG and 16 sprite palettes instead of 8 and 8 like on SNES), but you're using 2bpp so...
Linescroll via HDMA is far more flexible and still very cheap. I don't think this is something where you should really be worried about slowdown. It's nothing compared to running AI and collisions.
For waterfalls and such, you could in principle use vertical scroll with windowing, but the non-silly options are probably palette cycling and tile animation, with the latter being more expensive because it takes a nontrivial amount of DMA. It's still only about 0.3% of a frame per updated 2bpp tile, but since you can't DMA during active display, the relevant number is actually ~2% of VBlank per 2bpp tile.
Re: What's more processor intensive (if that's the correct terminology) . . .
93143 wrote: ↑Tue Jun 07, 2022 4:21 pm
Linescroll via palette swap would be quite limiting, particularly in Mode 0. You'd be stuck with snapping to whatever granularity you can map 3 colours to (meaning there are only 3 possible scroll positions), and you'd need to reserve a palette for each scroll speed. It might work okay with 4bpp graphics (particularly on a PC Engine where there are 16 BG and 16 sprite palettes instead of 8 and 8 like on SNES), but you're using 2bpp so...
Linescroll via HDMA is far more flexible and still very cheap. I don't think this is something where you should really be worried about slowdown. It's nothing compared to running AI and collisions.
For waterfalls and such, you could in principle use vertical scroll with windowing, but the non-silly options are probably palette cycling and tile animation, with the latter being more expensive because it takes a nontrivial amount of DMA. It's still only about 0.3% of a frame per updated 2bpp tile, but since you can't DMA during active display, the relevant number is actually ~2% of VBlank per 2bpp tile.
Well, it's not that I worry the row/line scrolling would cause slowdown in and of itself but more that I'm doing more of it than I see in most SNES games, so I just want to make sure I'm not at risk of hogging up too much of the processor resources doing a bunch of row/line scrolling on each of the four backgrounds only to find it causes a problem when I then get to all the stuff I want to do with sprites and the like.
It's just hard for me to get a sense of how much of the available processor resources I'd have to use up for all the background row/line scrolling that I currently have in my levels (often along with some background priority setting, animated tiles and colour math too).
Re: What's more processor intensive (if that's the correct terminology) . . .
Since this effect is pretty static, you could probably hardcode the frame deltas and unroll the loop (using two instances to allow double buffering of the HDMA table). If the scroll speed is constant and the pattern can be allowed to loop at 256 pixels, I believe you could do a separate scroll value for every line on the bottom half of the screen (112 x-scroll values) in about 4% of the frame (or a bit less than 6% including the actual HDMA).
Code: Select all
; 16-bit A/mem, 8-bit X/Y
; xoff is 8.8 fixed-point, in direct page
; htable is somewhere else, using absolute addressing
  lda #dxoff    ; 3 fast
  clc           ; 2 fast
  adc xoff      ; 2 fast, 2 slow
  sta xoff      ; 2 fast, 2 slow
  ldx xoff+1    ; 2 fast, 1 slow
  stx htable    ; 3 fast, 1 slow
; 14 fast, 6 slow = 132 master cycles
Of course, requiring higher integer values of the x-scroll value, or varying scroll speeds, would complicate this a fair bit. I figure you could still linescroll every single line on the bottom half of the screen in about 7% of the frame (or a little over 8% including the HDMA). I haven't tested this, though, and there might be precision issues with high scroll speeds:
Code: Select all
; 16-bit A/mem, 8-bit X/Y
; xoff is 16.8 fixed-point, in direct page; DP has to change once during the unrolled loop
; $4204/5 contains the raw value of dxoff before scaling, which is constant for the frame
  ldx #scale    ; 2 fast
  lda $4214     ; 5 fast - access to divider is interleaved to avoid waiting
  stx $4206     ; 4 fast
  clc           ; 2 fast
  adc xoff      ; 2 fast, 2 slow
  sta xoff      ; 2 fast, 2 slow
  lda xoff+1    ; 2 fast, 2 slow
  bcc +         ; 2 or 3 fast
  adc #$00FF    ; 3 fast
  sta xoff+1    ; 2 fast, 2 slow
+ sta htable    ; 3 fast, 2 slow
; 25 or 29 fast, 8 or 10 slow = 214 or 254 master cycles (usually 214)
There's probably a way to trade ROM for better speed when doing this. It might depend on the application. Obviously doing this once every 16 lines is going to be substantially cheaper; the second method would use about half a percent of an NTSC frame.
Something like what DKC2 does is more complicated still, because the camera can move in the vertical axis. I think I'll leave that as an exercise for the reader, at least for now...
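As a cross-check on the totals and percentages quoted with the two listings, here is a small Python sketch of the arithmetic. It assumes 6 master cycles per fast access and 8 per slow access (which is what the listings' own totals imply) and an NTSC frame of 262 scanlines of 1364 master cycles each. The helper names are hypothetical, and the HDMA transfer cost itself is left out.

```python
FAST, SLOW = 6, 8          # master cycles per fast/slow access (implied by the listings' totals)
FRAME = 262 * 1364         # assumed NTSC frame length in master cycles

def master_cycles(fast, slow):
    """Convert a fast/slow access count into master cycles."""
    return fast * FAST + slow * SLOW

def frame_percent(cycles_per_update, updates):
    """Share of one frame used by `updates` repetitions of an update."""
    return 100.0 * cycles_per_update * updates / FRAME

simple = master_cycles(14, 6)         # first listing
scaled = master_cycles(25, 8)         # second listing, usual path
scaled_carry = master_cycles(29, 10)  # second listing, carry path

print(simple, scaled, scaled_carry)
print(round(frame_percent(simple, 112), 1))  # every line of the bottom half, first method
print(round(frame_percent(scaled, 112), 1))  # every line of the bottom half, second method
print(round(frame_percent(scaled, 7), 2))    # one update per 16 lines of the bottom half
```

Under these assumptions the results land close to the "about 4%", "about 7%" and "about half a percent" figures given in the post.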
Re: What's more processor intensive (if that's the correct terminology) . . .
While you could precalculate all the shifted palettes, it's probably better to just shuffle them in the CGRAM mirror. As there are multiple palettes to cycle to get different water speeds, you will still have to do the maths/counters to know when to update each one; only then you will also have to shuffle the CGRAM mirror and DMA it.
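A rough Python sketch of that bookkeeping, assuming a 256-entry CGRAM mirror in RAM and one (start, length, period) triple per cycling run of colours. The names `rotate_run` and `tick`, the periods, and the rotation direction are all made up for illustration.

```python
def rotate_run(mirror, start, length):
    """Rotate mirror[start:start+length] by one entry, in place."""
    run = mirror[start:start + length]
    mirror[start:start + length] = run[-1:] + run[:-1]

def tick(mirror, cyclers, frame):
    """Rotate every run that is due this frame; return True if a DMA is needed."""
    dirty = False
    for start, length, period in cyclers:
        if frame % period == 0:        # slower water uses a longer period
            rotate_run(mirror, start, length)
            dirty = True
    return dirty

mirror = list(range(256))              # placeholder colour values
cyclers = [(1, 3, 4), (5, 3, 8)]       # fast stream, slow stream (hypothetical)

for frame in range(1, 9):
    if tick(mirror, cyclers, frame):
        pass                           # here the mirror would be DMA'd to CGRAM in VBlank
print(mirror[:8])
```

The counters cost the same whichever way the palettes are stored; the extra work in this variant is the in-place shuffle plus the DMA on frames where something changed.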
- jeffythedragonslayer
Re: What's more processor intensive (if that's the correct terminology) . . .
Another way to do these palette swaps without cycling CGRAM is to actually change the palette bits for the animated tiles.
Re: What's more processor intensive (if that's the correct terminology) . . .
Sure, but that is a lot more expensive.