animation scheme for next game
Posted: Tue Jun 14, 2016 5:19 pm
by psycopathicteen
Today I just thought of a new animation scheme that fixes most of the issues I've had with previous techniques.
Use 8x8 and 16x16 sprites. Each VRAM slot will be the size of two 16x16 sprites. Metasprites can use as many slots as they want.
Re: animation scheme for next game
Posted: Tue Jun 14, 2016 5:31 pm
by Drew Sebastino
Isn't this a whole lot more limited? I guess it takes less processing time though.
Re: animation scheme for next game
Posted: Tue Jun 14, 2016 8:49 pm
by psycopathicteen
A couple reasons:
1) Uploading a lot of individual 16x16s causes a lot of DMA overhead. My game is barely keeping everything within vblank.
2) 8x8 sprites allow me to experiment more with skeletal animation and sprite shearing effects.
Re: animation scheme for next game
Posted: Wed Jun 15, 2016 5:13 am
by lint
"Each VRAM slot be the size of 2 16x16 sprites"
I don't see how it fixes:
"1) uploading a lot of individual 16x16 causes a lot of DMA overhead. My game is barely keeping everything within vblank."
Can you elaborate a bit, so I can get it?
Anyway, a sprite engine on the SNES will almost always depend on the game design. If your code is too generic, you will almost always have problems with vblank.
I'm currently porting Kung Fu Master (arcade) to the SNES, and the way I have to handle sprites is very specific. It can't be done with something generic, because of the DMA transfer time and the VRAM size allowed for sprites.
Anyway, I would like to know more about your new scheme.
++ Lint
Re: animation scheme for next game
Posted: Wed Jun 15, 2016 6:18 am
by psycopathicteen
lint wrote:Each VRAM slot be the size of 2 16x16 sprites
I don't see how it fixes :
1) uploading a lot of individual 16x16 causes a lot of DMA overhead. My game is barely keeping everything within vblank.
Can you elaborate a bit ? So i can get it.
++ Lint
It takes more time to set up DMA registers twice as often for the same amount of data.
Re: animation scheme for next game
Posted: Wed Jun 15, 2016 6:28 am
by lint
OK, so your gain is just halving the DMA setup for sprites... Is that really what is slowing down your game?
Can you explain what you consider a VRAM slot, and why 2 x 16x16 sprites?
Re: animation scheme for next game
Posted: Wed Jun 15, 2016 10:24 am
by psycopathicteen
It won't slow down my game; it just causes the top few scanlines to be blacked out.
The DMA time used can be calculated like this:
DMA time = "total data size" + "CPU setup time" * "number of chunks"
By transferring bigger chunks at once, there are fewer chunks for the same total data size, so the upload takes less time and is less likely to cause black scanlines at the top of the screen.
A slot is just a designated VRAM location that is usually updated in one chunk.
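As a rough illustration of that formula, here is a minimal Python sketch. It measures everything in "byte-times" (the time DMA needs to move one byte), and the setup cost of 24 byte-times per chunk is only an illustrative assumption, not a measured value from the game.

```python
def dma_cost(total_bytes, chunks, setup_cost):
    """Total cost in byte-times: data size plus per-chunk setup overhead."""
    return total_bytes + setup_cost * chunks

SETUP = 24  # hypothetical setup cost per chunk, in byte-times

small_chunks = dma_cost(1024, 16, SETUP)  # sixteen 64-byte chunks
large_chunks = dma_cost(1024, 8, SETUP)   # eight 128-byte chunks
print(small_chunks, large_chunks)
```

The data term is identical in both cases; only the setup term shrinks when the same data is sent in fewer, larger chunks.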
Re: animation scheme for next game
Posted: Wed Jun 15, 2016 4:06 pm
by Drew Sebastino
I don't get it, how is uploading only 32x16 any more beneficial than uploading 32x32's and 16x16's, unless all you're uploading is 16x16's?
I thought this code I made was fine:
Code:
tile_uploader:
  rep #$30                ; A=16, X/Y=16
  ldx #$0000
  lda #$1801              ; DMA mode 1 (word writes to $2118/$2119), destination = VRAM data register
  sta $4300
  sta $4310               ; channels 1-3 are used below, so give them the same mode and destination
  sta $4320
  sta $4330
  lda #$0080              ; increment the VRAM address after each high-byte write
  sta $2115
tile_uploader_16x16:
  lda TileRequestCounter16x16
  beq tile_uploader_32x32_start
  lda #$0040              ; 64 bytes = two 8x8 tiles per row
  sta $4305
  sta $4315
  ;16x16 Top Half
  lda TileRequestCounter16x16+VramAdress,x
  sta $2116
  lda TileRequestCounter16x16+BankNumber,x
  sta $4303               ; 16-bit write, so this also sets the bank byte at $4304
  sta $4313
  lda TileRequestCounter16x16+TileAddress,x
  sta $4302
  clc
  adc #$0040              ; the bottom half follows the top half in the source data
  sta $4312
  lda #%0000000100000000  ; Initiate DMA transfer (channel 0); the high byte lands in $420B
  sta $420A
  ;16x16 Bottom Half
  lda TileRequestCounter16x16+VramAdress,x
  clc
  adc #$0100              ; one tile row down in VRAM
  sta $2116
  lda #%0000001000000000  ; Initiate DMA transfer (channel 1)
  sta $420A
  txa
  clc
  adc #$0006              ; each request entry is 6 bytes
  tax
  cpx TileRequestCounter16x16
  bne tile_uploader_16x16
tile_uploader_32x32_start:
  lda TileRequestCounter32x32
  bne tile_uploader_32x32_begin
  rts                     ; nothing queued
tile_uploader_32x32_begin:
  ldx #$0000              ; the 32x32 request list is indexed from zero again
tile_uploader_32x32:
  lda #$0080              ; 128 bytes = four 8x8 tiles per row; reloaded every pass because DMA leaves the size registers at zero
  sta $4305
  sta $4315
  sta $4325
  sta $4335
  lda TileRequestCounter32x32+VramAdress,x
  sta $2116
  lda TileRequestCounter32x32+BankNumber,x
  sta $4303
  sta $4313
  sta $4323
  sta $4333
  lda TileRequestCounter32x32+TileAddress,x
  sta $4302
  clc
  adc #$0080              ; next 128-byte row (assumes the four rows are stored back to back)
  sta $4312
  clc
  adc #$0080
  sta $4322
  clc
  adc #$0080
  sta $4332
  lda #%0000000100000000  ; Initiate DMA transfer (channel 0)
  sta $420A
  ;Second Row
  lda TileRequestCounter32x32+VramAdress,x
  clc
  adc #$0100
  sta $2116
  lda #%0000001000000000  ; Initiate DMA transfer (channel 1)
  sta $420A
  ;Third Row
  lda TileRequestCounter32x32+VramAdress,x
  clc
  adc #$0200
  sta $2116
  lda #%0000010000000000  ; Initiate DMA transfer (channel 2)
  sta $420A
  ;Fourth Row
  lda TileRequestCounter32x32+VramAdress,x
  clc
  adc #$0300
  sta $2116
  lda #%0000100000000000  ; Initiate DMA transfer (channel 3)
  sta $420A
  txa
  clc
  adc #$0006
  cmp TileRequestCounter32x32
  beq tile_uploader_done
  tax
  brl tile_uploader_32x32 ; long branch back to the top of the loop
tile_uploader_done:
  rts
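The `adc #$0100` steps in the routine above come from the VRAM layout of sprite tiles. The following Python sketch shows where that stride comes from, assuming 4bpp tiles arranged as a 16-tile-wide sheet; the base address `0x6000` is a hypothetical example, not one taken from the game.

```python
TILES_PER_ROW = 16   # assumed: sprite tiles form a 16-tile-wide sheet
WORDS_PER_TILE = 16  # a 4bpp 8x8 tile is 32 bytes = 16 VRAM words

def row_addresses(base_word_addr, height_in_tiles):
    """VRAM word address of each tile row of a sprite, top to bottom."""
    stride = TILES_PER_ROW * WORDS_PER_TILE  # 256 words = $0100
    return [base_word_addr + row * stride for row in range(height_in_tiles)]

# Hypothetical base address for illustration only.
print([hex(a) for a in row_addresses(0x6000, 2)])  # a 16x16 sprite: two rows
print([hex(a) for a in row_addresses(0x6000, 4)])  # a 32x32 sprite: four rows
```

Because consecutive rows of one sprite are $0100 words apart, each row needs its own VRAM address write and its own transfer.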
Re: animation scheme for next game
Posted: Wed Jun 15, 2016 5:12 pm
by tepples
I think "32x16" is supposed to refer to two 16x16s that are always consecutive.
Re: animation scheme for next game
Posted: Wed Jun 15, 2016 7:16 pm
by Drew Sebastino
I mean, I know that, I just don't know what kind of advantage you're getting.
Re: animation scheme for next game
Posted: Wed Jun 15, 2016 7:28 pm
by tepples
The top or bottom half of a 16x16 is 64 bytes, which can be copied in the equivalent of 86 fast cycles. But it takes about 36 cycles just to set up the registers for one copy. If you make longer copies, you can set up the registers fewer times.
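The arithmetic behind those figures can be sketched in Python. The two clock ratios are assumptions stated here rather than in the post: DMA moving one byte per 8 master clocks, and a fast CPU cycle lasting 6 master clocks. The 36-cycle setup figure is the approximate one quoted above.

```python
import math

MASTER_PER_DMA_BYTE = 8    # assumed: DMA moves one byte per 8 master clocks
MASTER_PER_FAST_CYCLE = 6  # assumed: a fast CPU cycle is 6 master clocks
SETUP_FAST_CYCLES = 36     # from the post: approximate register setup per copy

def copy_fast_cycles(nbytes):
    """Length of a DMA copy, expressed in fast CPU cycles (rounded up)."""
    return math.ceil(nbytes * MASTER_PER_DMA_BYTE / MASTER_PER_FAST_CYCLE)

def setup_share(nbytes):
    """Fraction of the total time per copy that is spent on register setup."""
    return SETUP_FAST_CYCLES / (SETUP_FAST_CYCLES + copy_fast_cycles(nbytes))

for size in (64, 128):
    print(size, copy_fast_cycles(size), round(setup_share(size), 3))
```

Under these assumptions, setup is roughly 30% of the time for a 64-byte copy and roughly 17% for a 128-byte copy.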
Re: animation scheme for next game
Posted: Wed Jun 15, 2016 8:02 pm
by Nicole
Basically, since each slot is 32x16 instead of 16x16, instead of
Code:
set up dma to transfer one 16x16 tile
transfer 16x16 tile
set up dma to transfer one 16x16 tile
transfer 16x16 tile
it's
Code:
set up dma to transfer two 16x16 tiles
transfer 16x16 tile
transfer 16x16 tile
So you only have to set up DMA once per pair.
Re: animation scheme for next game
Posted: Thu Jun 16, 2016 10:58 am
by Drew Sebastino
Yeah, but one 32x32 is...
Code:
set up dma to transfer four 16x16 tiles
transfer 16x16 tile
transfer 16x16 tile
transfer 16x16 tile
transfer 16x16 tile
Of course, you will also be doing
Code:
set up dma to transfer one 16x16 tile
transfer 16x16 tile
That will about balance it out, so I have no clue how the new system is any more beneficial.
Re: animation scheme for next game
Posted: Thu Jun 16, 2016 11:08 am
by tepples
A 16x16 has two transfers: a 2-tile top half (64 bytes) and a 2-tile bottom half (64 bytes). A 32x16 also has two transfers: a 4-tile top half (128 bytes) and a 4-tile bottom half (128 bytes). It needs the same number of transfers, yet it moves twice as much data. This halves register setup overhead, which helps when register setup takes the same time as 24 bytes. Halving the number of transfers will more than halve the time spent on register setup during vblank, as it allows the register setups during active picture to cover more bytes.
Of course, the ideal situation for vblank time is to transfer a set of eight 16x16s at once, as that adds up to a single 1024-byte transfer. But that also adds more work during active picture, as the data must be copied to a transfer buffer, and if you're using HDMA for OPT rocking or a mode 7 floor, you can't use DMA copies to WRAM without crashing a 1/1/1 console.
Re: animation scheme for next game
Posted: Thu Jun 16, 2016 5:29 pm
by Nicole
Oh, right, I forgot that there's a gap between the top and bottom halves of a 16x16 tile in VRAM. So it's more like:
Code:
; transfer two 16x16 tiles individually
set up dma to transfer two 8x8 tiles
transfer 8x8 tile
transfer 8x8 tile
set up dma to transfer two 8x8 tiles
transfer 8x8 tile
transfer 8x8 tile
set up dma to transfer two 8x8 tiles
transfer 8x8 tile
transfer 8x8 tile
set up dma to transfer two 8x8 tiles
transfer 8x8 tile
transfer 8x8 tile
; transfer 32x16 slot
set up dma to transfer four 8x8 tiles
transfer 8x8 tile
transfer 8x8 tile
transfer 8x8 tile
transfer 8x8 tile
set up dma to transfer four 8x8 tiles
transfer 8x8 tile
transfer 8x8 tile
transfer 8x8 tile
transfer 8x8 tile
; transfer 32x32 slot
set up dma to transfer four 8x8 tiles
transfer 8x8 tile
transfer 8x8 tile
transfer 8x8 tile
transfer 8x8 tile
set up dma to transfer four 8x8 tiles
transfer 8x8 tile
transfer 8x8 tile
transfer 8x8 tile
transfer 8x8 tile
set up dma to transfer four 8x8 tiles
transfer 8x8 tile
transfer 8x8 tile
transfer 8x8 tile
transfer 8x8 tile
set up dma to transfer four 8x8 tiles
transfer 8x8 tile
transfer 8x8 tile
transfer 8x8 tile
transfer 8x8 tile
In other words, you don't really gain anything from having 32x32 slots, since it takes just as much time as two 32x16 slots, while being more wasteful of space.
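To check the counts in that listing, here is a small Python sketch that finds the contiguous VRAM runs a sprite occupies, one run per DMA setup. It assumes 4bpp tiles on a 16-tile-wide sheet and a sprite that does not wrap past the right edge of the sheet. The function names are made up for this illustration.

```python
TILES_PER_ROW = 16   # assumed: sprite tiles form a 16-tile-wide sheet
BYTES_PER_TILE = 32  # a 4bpp 8x8 tile

def transfers(first_tile, width_tiles, height_tiles):
    """Contiguous VRAM runs as (start_tile, tile_count), one per DMA setup."""
    runs = []
    for row in range(height_tiles):
        for col in range(width_tiles):
            tile = first_tile + row * TILES_PER_ROW + col
            if runs and runs[-1][0] + runs[-1][1] == tile:
                runs[-1][1] += 1         # extends the current run
            else:
                runs.append([tile, 1])   # a gap in VRAM starts a new run
    return [tuple(run) for run in runs]

def summarize(width_tiles, height_tiles):
    """Return (number of setups, total bytes) for a sprite at tile 0."""
    runs = transfers(0, width_tiles, height_tiles)
    total_bytes = sum(count for _, count in runs) * BYTES_PER_TILE
    return len(runs), total_bytes

for name, w, h in [("16x16", 2, 2), ("32x16", 4, 2), ("32x32", 4, 4), ("128x16", 16, 2)]:
    setups, total = summarize(w, h)
    print(name, setups, "setups,", total, "bytes,", total // setups, "bytes per setup")
```

In this model a 32x16 and a 32x32 both move 128 bytes per setup, while a full row of eight 16x16s collapses into the single 1024-byte run mentioned earlier in the thread.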