animation scheme for next game
-
psycopathicteen
- Posts: 3001
- Joined: Wed May 19, 2010 6:12 pm
animation scheme for next game
Today I just thought of a new animation scheme that fixes most of the issues I've had with previous techniques.
Use 8x8 and 16x16 sprites. Each VRAM slot will be the size of two 16x16 sprites. Metasprites can use as many slots as they want.
- Drew Sebastino
- Formerly Espozo
- Posts: 3496
- Joined: Mon Sep 15, 2014 4:35 pm
- Location: Richmond, Virginia
Re: animation scheme for next game
Isn't this a whole lot more limited? I guess it takes less processing time though.
-
psycopathicteen
- Posts: 3001
- Joined: Wed May 19, 2010 6:12 pm
Re: animation scheme for next game
A couple reasons:
1) Uploading a lot of individual 16x16s causes a lot of DMA overhead. My game is barely keeping everything within vblank.
2) 8x8s allow me to experiment more with skeletal animation and sprite shearing effects.
Re: animation scheme for next game
psycopathicteen wrote: 1) uploading a lot of individual 16x16 causes a lot of DMA overhead. My game is barely keeping everything within vblank.
I don't see how this fixes it:
psycopathicteen wrote: Each VRAM slot be the size of 2 16x16 sprites
Can you elaborate a bit, so I can get it?
Anyway, a sprite engine on the SNES will almost always depend on the game design. If your code is too generic, you will almost always have problems with vblank.
I'm currently porting Kung Fu Master (arcade) to the SNES, and from what I see, the sprite handling has to be very specific and can't be done with something generic (because of DMA transfer time and the VRAM size allowed for sprites).
Anyway, I would like to know more about your new scheme.
++ Lint
https://twitter.com/Lint_
http://snesdev.antihero.org/ [deprecated blog, new one coming one of these days]
-
psycopathicteen
- Posts: 3001
- Joined: Wed May 19, 2010 6:12 pm
Re: animation scheme for next game
lint wrote: I don't see how this fixes it: "Each VRAM slot be the size of 2 16x16 sprites". Can you elaborate a bit, so I can get it?
It takes more time to set up DMA registers twice as often for the same amount of data.
Re: animation scheme for next game
OK, so your gain is just dividing the DMA setup for sprites by 2... is that really what is slowing down your game?
Can you explain what you consider a VRAM slot, and why two 16x16 sprites?
https://twitter.com/Lint_
http://snesdev.antihero.org/ [deprecated blog, new one coming one of these days]
-
psycopathicteen
- Posts: 3001
- Joined: Wed May 19, 2010 6:12 pm
Re: animation scheme for next game
It won't slow down my game, it just causes the top few scanlines to be blacked out.
The DMA time used can be calculated like this:
DMA time = "total data size" + "CPU setup time" * "number of chunks"
With bigger chunks, there are fewer chunks for the same total data size, so the uploads take less time and are less likely to cause black scanlines at the top of the screen.
A slot is just a designated VRAM location that is usually updated in one chunk.
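The formula above can be tried out with a few numbers. This is only a sketch: the function name `dma_cost`, the 1024-byte workload, and the setup cost of 24 "byte-times" are illustrative assumptions, not measurements from the game.

```python
def dma_cost(total_bytes, chunk_bytes, setup_cost, cost_per_byte=1):
    """Cost = total data size + setup cost * number of chunks.

    All costs are in the same arbitrary unit (here: the time to copy one byte).
    Returns (total cost, number of chunks).
    """
    chunks = -(-total_bytes // chunk_bytes)  # ceiling division
    return total_bytes * cost_per_byte + setup_cost * chunks, chunks


# Hypothetical workload: 1024 bytes of tile data, setup worth 24 byte-times.
small_cost, small_chunks = dma_cost(1024, 64, 24)    # 64-byte chunks
large_cost, large_chunks = dma_cost(1024, 128, 24)   # 128-byte chunks

print(small_chunks, small_cost)  # 16 chunks, cost 1408
print(large_chunks, large_cost)  # 8 chunks, cost 1216
```

The data term is identical in both cases. Only the setup term shrinks when the chunks get bigger.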
- Drew Sebastino
- Formerly Espozo
- Posts: 3496
- Joined: Mon Sep 15, 2014 4:35 pm
- Location: Richmond, Virginia
Re: animation scheme for next game
I don't get it, how is uploading only 32x16 any more beneficial than uploading 32x32's and 16x16's, unless all you're uploading is 16x16's?
I thought this code I made was fine:
Code: Select all
tile_uploader:
    rep #$30 ; A=16, X/Y=16
    ldx #$0000
    lda #$1801 ; Set DMA mode (word, normal increment) and destination register (VRAM write register)
    sta $4300
    sta $4310 ; channels 1-3 need the same mode, in case they were not set up elsewhere
    sta $4320
    sta $4330
    lda #$0080 ; increment the VRAM address after each high-byte write
    sta $2115
tile_uploader_16x16:
    lda TileRequestCounter16x16
    beq tile_uploader_32x32_start
    lda #$0040 ; 64 bytes = one row of two 8x8 tiles
    sta $4305
    sta $4315
    ;16x16 Top Half
    lda TileRequestCounter16x16+VramAdress,x
    sta $2116
    lda TileRequestCounter16x16+BankNumber,x
    sta $4303 ; 16-bit store: low byte to $4303, high byte to $4304 (source bank)
    sta $4313
    lda TileRequestCounter16x16+TileAddress,x
    sta $4302 ; 16-bit source address, overwrites $4303
    clc
    adc #$0040
    sta $4312
    lda #%0000000100000000 ; Initiate DMA transfer (channel 0); the high byte lands in $420B
    sta $420A
    ;16x16 Bottom Half
    lda TileRequestCounter16x16+VramAdress,x
    clc
    adc #$0100 ; next tile row in VRAM
    sta $2116
    lda #%0000001000000000 ; Initiate DMA transfer (channel 1)
    sta $420A
    txa
    clc
    adc #$0006 ; advance to the next request entry
    tax
    cpx TileRequestCounter16x16
    bne tile_uploader_16x16
tile_uploader_32x32_start:
    ldx #$0000 ; restart the index for the 32x32 request table
    lda TileRequestCounter32x32
    beq tile_uploader_done
tile_uploader_32x32:
    lda #$0080 ; 128 bytes = one row of four 8x8 tiles
    sta $4305 ; reloaded every pass: the byte counters read zero after a DMA finishes
    sta $4315
    sta $4325
    sta $4335
    lda TileRequestCounter32x32+VramAdress,x
    sta $2116
    lda TileRequestCounter32x32+BankNumber,x
    sta $4303
    sta $4313
    sta $4323
    sta $4333
    lda TileRequestCounter32x32+TileAddress,x
    sta $4302
    clc
    adc #$0080 ; next 128-byte row (assumes the four rows are stored back to back)
    sta $4312
    clc
    adc #$0080
    sta $4322
    clc
    adc #$0080
    sta $4332
    lda #%0000000100000000 ; Initiate DMA transfer (channel 0)
    sta $420A
    ;Second Row
    lda TileRequestCounter32x32+VramAdress,x
    clc
    adc #$0100
    sta $2116
    lda #%0000001000000000 ; Initiate DMA transfer (channel 1)
    sta $420A
    ;Third Row
    lda TileRequestCounter32x32+VramAdress,x
    clc
    adc #$0200
    sta $2116
    lda #%0000010000000000 ; Initiate DMA transfer (channel 2)
    sta $420A
    ;Fourth Row
    lda TileRequestCounter32x32+VramAdress,x
    clc
    adc #$0300
    sta $2116
    lda #%0000100000000000 ; Initiate DMA transfer (channel 3)
    sta $420A
    txa
    clc
    adc #$0006
    cmp TileRequestCounter32x32
    beq tile_uploader_done
    tax
    bra tile_uploader_32x32
tile_uploader_done:
    rts
Re: animation scheme for next game
I think "32x16" is supposed to refer to two 16x16s that are always consecutive
- Drew Sebastino
- Formerly Espozo
- Posts: 3496
- Joined: Mon Sep 15, 2014 4:35 pm
- Location: Richmond, Virginia
Re: animation scheme for next game
I mean, I know that, I just don't know what kind of advantage you're getting.
Re: animation scheme for next game
The top or bottom half of a 16x16 is 64 bytes, which can be copied in the equivalent of 86 fast cycles. But it takes about 36 cycles just to set up the registers for one copy. If you make longer copies, you can set up the registers fewer times.
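The figures in this post can be reproduced with a little arithmetic. The sketch below assumes that DMA moves one byte per 8 master clock cycles and that a fast CPU cycle is 6 master clock cycles; those two constants and the helper names are assumptions added for illustration. The 36-cycle setup figure is the one quoted above.

```python
MASTER_CYCLES_PER_DMA_BYTE = 8    # assumption: DMA copies one byte per 8 master cycles
MASTER_CYCLES_PER_FAST_CYCLE = 6  # assumption: one fast CPU cycle is 6 master cycles
SETUP_FAST_CYCLES = 36            # approximate register setup cost quoted in the post


def copy_fast_cycles(num_bytes):
    """Copy time for one DMA transfer, rounded up to whole fast cycles."""
    master_cycles = num_bytes * MASTER_CYCLES_PER_DMA_BYTE
    return -(-master_cycles // MASTER_CYCLES_PER_FAST_CYCLE)  # ceiling division


def setup_share(num_bytes):
    """Fraction of the total time (setup + copy) that is spent on setup."""
    return SETUP_FAST_CYCLES / (SETUP_FAST_CYCLES + copy_fast_cycles(num_bytes))


for size in (64, 128):
    print(size, copy_fast_cycles(size), round(setup_share(size), 3))
```

Under these assumptions a 64-byte copy costs 86 fast cycles, matching the post, and setup is roughly 30% of the total. For a 128-byte copy the setup share drops to roughly 17%.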
Re: animation scheme for next game
Basically, since each slot is 32x16 instead of 16x16, instead of
Code: Select all
set up dma to transfer one 16x16 tile
transfer 16x16 tile
set up dma to transfer one 16x16 tile
transfer 16x16 tile
it's
Code: Select all
set up dma to transfer two 16x16 tiles
transfer 16x16 tile
transfer 16x16 tile
So you only have to set up DMA once per pair.
- Drew Sebastino
- Formerly Espozo
- Posts: 3496
- Joined: Mon Sep 15, 2014 4:35 pm
- Location: Richmond, Virginia
Re: animation scheme for next game
Yeah, but one 32x32 is...
Code: Select all
set up dma to transfer four 16x16 tiles
transfer 16x16 tile
transfer 16x16 tile
transfer 16x16 tile
transfer 16x16 tile
Of course, you will also be doing
Code: Select all
set up dma to transfer one 16x16 tile
transfer 16x16 tile
Which will about balance it out, so I have no clue how the new system is any more beneficial.
Re: animation scheme for next game
A 16x16 has two transfers: a 2-tile top half (64 bytes) and a 2-tile bottom half (64 bytes). A 32x16 also has two transfers: a 4-tile top half (128 bytes) and a 4-tile bottom half (128 bytes). Yet it transfers twice as much data. This halves register setup overhead per byte, which helps when register setup takes the same time as 24 bytes. Halving the number of transfers will more than halve the time spent on register setup during vblank, as it allows the register setups during active picture to cover more bytes.
Of course, the ideal situation for vblank time is to transfer a set of eight 16x16s at once, as that adds up to a single 1024-byte transfer. But that also adds more work during active picture, as the data must be copied to a transfer buffer, and if you're using HDMA for OPT rocking or a mode 7 floor, you can't use DMA copies to WRAM without crashing a 1/1/1 console.
Re: animation scheme for next game
Oh, right, I forgot that there's a gap between the top and bottom halves of a 16x16 tile in VRAM. So it's more like:
Code: Select all
; transfer two 16x16 tiles individually
set up dma to transfer two 8x8 tiles
transfer 8x8 tile
transfer 8x8 tile
set up dma to transfer two 8x8 tiles
transfer 8x8 tile
transfer 8x8 tile
set up dma to transfer two 8x8 tiles
transfer 8x8 tile
transfer 8x8 tile
set up dma to transfer two 8x8 tiles
transfer 8x8 tile
transfer 8x8 tile
; transfer 32x16 slot
set up dma to transfer four 8x8 tiles
transfer 8x8 tile
transfer 8x8 tile
transfer 8x8 tile
transfer 8x8 tile
set up dma to transfer four 8x8 tiles
transfer 8x8 tile
transfer 8x8 tile
transfer 8x8 tile
transfer 8x8 tile
; transfer 32x32 slot
set up dma to transfer four 8x8 tiles
transfer 8x8 tile
transfer 8x8 tile
transfer 8x8 tile
transfer 8x8 tile
set up dma to transfer four 8x8 tiles
transfer 8x8 tile
transfer 8x8 tile
transfer 8x8 tile
transfer 8x8 tile
set up dma to transfer four 8x8 tiles
transfer 8x8 tile
transfer 8x8 tile
transfer 8x8 tile
transfer 8x8 tile
set up dma to transfer four 8x8 tiles
transfer 8x8 tile
transfer 8x8 tile
transfer 8x8 tile
transfer 8x8 tile
In other words, you don't really gain anything from having 32x32 slots, since it takes just as much time as two 32x16 slots, while being more wasteful of space.
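The gap between tile rows is why each 8-pixel-high row of a slot needs its own transfer. A small sketch of that layout follows, assuming 4bpp sprite tiles (32 bytes each) arranged 16 tiles per row in VRAM, so that consecutive rows sit $100 words apart, which matches the `adc #$0100` steps in the uploader posted earlier. The function name `slot_transfers` and the base address $6000 are hypothetical.

```python
TILE_BYTES = 32         # one 8x8 tile at 4 bits per pixel
VRAM_ROW_WORDS = 0x100  # 16 tiles per row * 16 words per tile


def slot_transfers(vram_base, width_px, height_px):
    """List the (VRAM word address, byte count) pair for each tile row of a slot."""
    tiles_wide = width_px // 8
    tile_rows = height_px // 8
    return [(vram_base + row * VRAM_ROW_WORDS, tiles_wide * TILE_BYTES)
            for row in range(tile_rows)]


base = 0x6000  # hypothetical sprite tile base address, in words
for name, (w, h) in {"16x16": (16, 16), "32x16": (32, 16), "32x32": (32, 32)}.items():
    transfers = slot_transfers(base, w, h)
    print(name, len(transfers), [(hex(addr), size) for addr, size in transfers])
```

A 16x16 and a 32x16 both come out as two transfers (64 and 128 bytes each, respectively), while a 32x32 needs four 128-byte transfers, the same count as two 32x16 slots.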