I managed to squeeze out a few more cycles. I'm down to 2265 max currently.
One thing I know absolutely nothing about is audio. How many cycles will I need in NMI for a music and sound player?
I'd still like to squeeze in palette effects if possible so I'm looking for more optimizations but I think I may be out of easy options.
I started writing a routine to generate tile update code in RAM, but with the varying rates of tile fetching it would be so cumbersome that I'm not going to bother.
tokumaru wrote: If you have any loops at all, you should really look into unrolling them.
I did this previously with my tile updating, and it helped a lot, especially where everything was sequential. Taking the same approach here, as a relatively extreme example, I took this loop:
Code:
;----------------Check for new background tiles-----------------------------------------------------
    LDX ScanlineSplitNum              ; start at the last split and count down
DrawColumnsAttributesLoop:
    LDA newTileFlag, x
    BEQ NoNewTiles                    ; flag is zero: nothing to draw for this split
    BPL DontDrawNewAttributes         ; bit 7 clear: skip the attribute update
    LDA AttributeDrawPointersLo, x
    STA ColumnDrawJumpPointerLo
    LDA AttributeDrawPointersHi, x
    STA ColumnDrawJumpPointerHi
    LDY nametableColumnAttributesLo, x
    JMP (ColumnDrawJumpPointerLo)     ; indirect jump in place of the JSR below
;   JSR DrawNewAttributes
AttributeDrawReturn:
    LDA newTileFlag, x
    AND #%00000001                    ; bit 0 set: a tile column is also pending
    BEQ NoNewTiles
;----------------------------------------------------------------------------------------------------------------
DontDrawNewAttributes:
    LDA ColumnDrawPointersLo, x
    STA ColumnDrawJumpPointerLo
    LDA ColumnDrawPointersHi, x
    STA ColumnDrawJumpPointerHi
    LDY nametableColumnLo, x
    JMP (ColumnDrawJumpPointerLo)     ; indirect jump in place of the JSR below
;   JSR DrawNewColumns
ColumnDrawReturn:
;------------------------------------------------------------------------------------
NoNewTiles:
    LDA #$00
    STA newTileFlag, x                ; clear the flag for this split
    DEX
    BPL DrawColumnsAttributesLoop     ; loop until split 0 has been handled
And wrote it out 8 times, replacing each ", x" with a fixed offset (+1/+2/+3/etc.).
It saved enough cycles to make things work but I don't know if it's enough to add a music player and palette swaps.
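For anyone following along, one unrolled copy might look roughly like the sketch below. This is an illustration, not my exact code: the numbered labels (NoNewTiles1, AttributeDrawReturn1, etc.) are hypothetical, and it assumes the draw routines reached through the indirect jump come back to the return label of the matching copy. The listing above doesn't show how they return, so that part is a guess.

```asm
; Unrolled copy for split 1: every ", x" becomes a fixed "+1" offset.
; All labels ending in "1" are hypothetical names for this sketch.
    LDA newTileFlag+1
    BEQ NoNewTiles1
    BPL DontDrawNewAttributes1
    LDA AttributeDrawPointersLo+1
    STA ColumnDrawJumpPointerLo
    LDA AttributeDrawPointersHi+1
    STA ColumnDrawJumpPointerHi
    LDY nametableColumnAttributesLo+1
    JMP (ColumnDrawJumpPointerLo)   ; assumed to come back to AttributeDrawReturn1
AttributeDrawReturn1:
    LDA newTileFlag+1
    AND #%00000001
    BEQ NoNewTiles1
DontDrawNewAttributes1:
    LDA ColumnDrawPointersLo+1
    STA ColumnDrawJumpPointerLo
    LDA ColumnDrawPointersHi+1
    STA ColumnDrawJumpPointerHi
    LDY nametableColumnLo+1
    JMP (ColumnDrawJumpPointerLo)   ; assumed to come back to ColumnDrawReturn1
ColumnDrawReturn1:
NoNewTiles1:
    LDA #$00
    STA newTileFlag+1
    ; no DEX / BPL here: execution falls straight into the copy for split 0
```

The guaranteed saving per copy is the DEX plus the taken BPL (2 + 3 cycles). On top of that, a non-indexed STA is one cycle cheaper than the indexed form, and zero page loads drop from 4 cycles to 3 once the ", x" is gone.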
I just noticed you didn't mention $2005... was that just an omission or did you not know that these registers share the even/odd write flag?
Actually, no, I didn't realize that. I just kind of assumed that $2006 was the one affecting that since it's a 16-bit address. So I'm guessing if the hi/low latch is off, then writes to $2005 will be reversed?
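To spell out the shared flag, here is a minimal sketch of the usual write sequence. The variable names scrollX and scrollY are placeholders, not names from my code, and the $2000 target address is only an example.

```asm
    BIT $2002        ; reading $2002 resets the write toggle shared by $2005 and $2006
    LDA #$20
    STA $2006        ; 1st write: high byte of the VRAM address
    LDA #$00
    STA $2006        ; 2nd write: low byte of the VRAM address
    ; ... $2007 writes go here ...
    LDA scrollX      ; placeholder variable
    STA $2005        ; 1st write: X scroll
    LDA scrollY      ; placeholder variable
    STA $2005        ; 2nd write: Y scroll
```

If the toggle is already in its "second write" state when this starts (for example after an odd number of $2005/$2006 writes elsewhere), each write lands in the wrong slot: the value meant for X scroll is taken as Y, and the $2006 bytes are swapped the same way. That is why the read of $2002 comes first.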
If you're writing attributes for columns though, increments of 32 bytes can still be useful
That's a good tip. I didn't think of that. It won't work with my scroll splits but it would save a little in my basic scrolling NMI.
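As I understand the tip: each attribute table is 8 bytes wide, so a +32 increment steps down 4 attribute rows. That means one address setup can cover two bytes of an attribute column (rows 0 and 4, then 1 and 5, and so on), for 4 setups instead of 8. A rough sketch, where attrBuffer is a hypothetical 8-byte buffer holding the column top to bottom, and column 3 of the first nametable is just an example:

```asm
    LDA #%10000100   ; bit 2 set = +32 increment; bit 7 keeps NMI on (other bits are assumptions)
    STA $2000
    LDA #$23
    STA $2006
    LDA #$C3         ; $23C3: attribute row 0, column 3
    STA $2006
    LDA attrBuffer+0 ; row 0 goes to $23C3
    STA $2007
    LDA attrBuffer+4 ; row 4 goes to $23E3 (address stepped by 32)
    STA $2007
    ; repeat with $23CB (rows 1 and 5), $23D3 (rows 2 and 6), $23DB (rows 3 and 7)
```

A third write after the same setup would land at $2403, outside the attribute table, so two bytes per setup is the limit.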
rainwarrior wrote: If you're not desperate to save 4 cycles, reading it once at the start of your NMI might be worthwhile just in case it corrects some edge case you missed.
That seemed to help for me. Hopefully I'll eventually figure out why it was happening, but at least that glitch won't hold me up for now.
thefox wrote: If you do read it at start of NMI, be careful to not have things like bulk PPU uploads running in the main thread that could be affected by it.
Hmmm.. Right now the only thing outside of NMI that should be doing so is drawing the initial nametable for a level, but I haven't seen any problems with it. NMI should be disabled during that procedure. I'll be sure to keep that in mind though.
lidnariq wrote: An extra 8 scanlines there will get you a little more than an additional 1000 cycles in your vblank handler
That is a really cool idea! Has anyone used that technique to cram in massive graphic updates? That would be really good for CHR-RAM, I'd imagine.