I started with the sprite rendering function. (Next thing will be the PPU update, then reading the current level data.)
So, could you please have a look at the following sprite rendering function and tell me if there's something that I could improve to make it faster?
Some information:
All meta sprites are stored in the same array. The function takes the array index and stores the starting address of the meta sprite in a pointer.
The meta sprites are declared like this:
Width, height, Y offset.
Tile, palette, tile, palette, tile, palette...
So, all meta sprites are drawn as a rectangle, which means I don't have to read an X and Y offset for each tile. This makes the function much faster.
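To make the layout concrete, here is a minimal C sketch of one 2x2 meta sprite in this format. The tile numbers, the palette values, and the names `ExampleMetaSprite` and `MetaSpriteSize` are made up for illustration. The tile order (column by column from left to right, each column from bottom to top) follows the loops in the routine further down.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 2x2 meta sprite; tile and palette numbers are invented. */
static const uint8_t ExampleMetaSprite[] = {
    2, 2, 0,       /* width in tiles, height in tiles, Y offset in pixels */
    0x12, 0x00,    /* left column, bottom tile: tile, palette */
    0x02, 0x01,    /* left column, top tile */
    0x13, 0x00,    /* right column, bottom tile */
    0x03, 0x01     /* right column, top tile */
};

/* Bytes occupied by one meta sprite: 3 header bytes + 2 bytes per tile. */
static size_t MetaSpriteSize(const uint8_t *meta)
{
    return 3u + 2u * (size_t)meta[0] * (size_t)meta[1];
}
```

Since the size follows from the header, the array index of the next meta sprite is the current index plus `MetaSpriteSize` of the current one.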
As global variables, we have:
Absolute X: The value in the center of the meta sprite (not the leftmost position).
Absolute Y: The value at the bottom of the meta sprite (not the top position).
The meta sprites array index where the data is read from.
The next PPU sprites index where the data is written to.
The mirror attribute to check whether the meta sprite should be flipped.
Each X and Y position is two bytes, since characters that leave the screen on one side must not reappear on the other side. (X and Y offset values that are relative to the actual sprite's position are of course just one byte.)
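A minimal C sketch of why the second byte matters, assuming the positions are treated as signed 16-bit values (the helper name `IsOnScreenAxis` is hypothetical). A coordinate is accepted only when its high byte is zero, which is the same test the routine further down does on `AbsoluteX + 1` and `AbsoluteY + 1`.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when a 16-bit coordinate fits in the range 0..255,
   i.e. when its high byte is zero. */
static bool IsOnScreenAxis(int16_t position)
{
    return ((uint16_t)position >> 8) == 0;
}
```

With a single byte, an X position of -4 would be stored as 252 and the tile would show up at the right edge of the screen. With two bytes it is $FFFC, the high byte is not zero, and the tile gets rejected.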
Three remarks:
1. Characters in the game can be assigned to more than one palette. (For example, my main character has: Palette 1 = skin color, hair color, t-shirt color. Palette 2 = skin color, pants color, shoe color.) So, I cannot save the palette for the whole meta sprite. I have to use one value per tile.
2. Yes, I know: I write all sprite values first and only then check whether the sprite should actually be rendered, instead of skipping the code as soon as one coordinate is outside the screen.
I did this for the following reason:
If fewer than the maximum number of characters are on screen in the current game situation, the absent characters will have their IsActive variable set to false in the game logic code. I.e. if there are only two characters on the screen while the game can handle five at once, UpdateSprites will be called only two times anyway.
This means that said optimization would only help in the rare cases where a character is partly on screen and partly offscreen.
But if a character is rendered, the game engine has to be able to handle him anyway, so there's no need to add more comparisons just to save some cycles in the one second where he's partly outside the screen.
If a character is on screen, he will be fully visible 99% of the time, so additional BEQs for the 1% case where only parts of him are visible would actually make the code slower, since most of the time nothing can be skipped anyway.
3. Checking for mirroring whenever a new X value is set is actually faster than saving a mirror bit mask and a subtraction value in the beginning and then using that for calculation. At least when the characters are only two tiles wide, which is the case with almost all of my characters.
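The per-column mirroring check from remark 3 boils down to a small piece of arithmetic, sketched here in C (the function name `MirroredRelativeX` is hypothetical). It models what the routine further down does: invert the relative X and subtract 7, which equals `-relativeX - 8`, so each 8-pixel column lands on the opposite side of the center.

```c
#include <stdint.h>

/* Relative X of a tile column after optional horizontal mirroring.
   relativeX is the column's left edge, measured from the meta sprite's center. */
static int16_t MirroredRelativeX(int16_t relativeX, uint8_t mirrorAttributes)
{
    if (mirrorAttributes == 0) {
        return relativeX;            /* no flip: use the value as it is */
    }
    /* ~x - 7 == -x - 8 in two's complement */
    return (int16_t)(~relativeX - 7);
}
```

For a character that is two tiles wide, the columns at -8 and 0 simply swap places: -8 becomes 0 and 0 becomes -8.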
Alright, that's my code:
.segment "ZEROPAGE"
_UpdateSpritesSpritesIndex: .res 1
.export _UpdateSpritesSpritesIndex
_UpdateSpritesMetaSpritesIndex: .res 2
.export _UpdateSpritesMetaSpritesIndex
_UpdateSpritesX: .res 2
.export _UpdateSpritesX
_UpdateSpritesY: .res 2
.export _UpdateSpritesY
_UpdateSpritesMirrorAttributes: .res 1
.export _UpdateSpritesMirrorAttributes
XCounter: .res 1
YCounter: .res 1
HalfWidth: .res 1
HeightInTiles: .res 1
AbsoluteX: .res 2
AbsoluteY: .res 2
RelativeX: .res 2
PossiblyMirroredRelativeX: .res 2
.segment "CODE"
_UpdateSprites_:
.export _UpdateSprites_
; The start address of the current meta sprite
; within the meta sprites array is stored in the const pointer.
CLC
LDA #<(_MetaSprites)
ADC _UpdateSpritesMetaSpritesIndex
STA _ConstPointer
LDA #>(_MetaSprites)
ADC _UpdateSpritesMetaSpritesIndex + 1
STA _ConstPointer + 1
; The index offset of the meta sprites array,
; starting at the position of the const pointer.
LDY #$00
; The width of the current meta sprite,
; counted in tiles, not in pixels.
; That's the counter value for the X loop,
; i.e. the outer loop.
LDA (_ConstPointer), Y
INY
STA XCounter
; XCounter * 4 = Half of the width of the meta sprite in pixels.
ASL
ASL
STA HalfWidth
; The absolute X position is in the center of the meta sprite.
; The relative X position gets moved from the center to the left,
; so that this value points to the leftmost position of the meta sprite.
SEC
LDA #$00
SBC HalfWidth
STA RelativeX
LDA #$00
SBC #$00
STA RelativeX + 1
; Height, counted in tiles.
LDA (_ConstPointer), Y
INY
STA HeightInTiles
; The absolute Y position is at the bottom of the meta sprite.
; So, it is moved eight pixels to the top,
; so that the tiles' bottoms are actually at the desired position.
SEC
LDA _UpdateSpritesY
SBC #$08
STA _UpdateSpritesY
LDA _UpdateSpritesY + 1
SBC #$00
STA _UpdateSpritesY + 1
; Some characters cannot be drawn with their feet in the bottom position.
; For these meta sprites, the offset value is added to the Y position,
; so that they're still in the correct position.
CLC
LDA _UpdateSpritesY
ADC (_ConstPointer), Y
INY
STA _UpdateSpritesY
LDA _UpdateSpritesY + 1
ADC #$00
STA _UpdateSpritesY + 1
; The index of the PPU sprite that is written next.
LDX _UpdateSpritesSpritesIndex
; The outer loop: All columns are drawn from left to right.
@loopX:
; The height in tiles becomes the loop counter.
LDA HeightInTiles
STA YCounter
; The absolute Y value is set to its starting position.
LDA _UpdateSpritesY
STA AbsoluteY
LDA _UpdateSpritesY + 1
STA AbsoluteY + 1
; If the meta sprite shall be mirrored,
; we have to manipulate the X position.
LDA _UpdateSpritesMirrorAttributes
BEQ @noMirroring
; The relative X position gets inverted and 7 is subtracted from it,
; which yields -RelativeX - 8.
; This way, it has the correct value to render the tile
; on the opposite side of the meta sprite's center.
; The new value is stored in a separate variable.
SEC
LDA RelativeX
EOR #%11111111
SBC #$07
STA PossiblyMirroredRelativeX
LDA RelativeX + 1
EOR #%11111111
SBC #$00
STA PossiblyMirroredRelativeX + 1
JMP @endMirroring
@noMirroring:
; If no mirroring is done,
; the value is simply copied into the new variable.
LDA RelativeX
STA PossiblyMirroredRelativeX
LDA RelativeX + 1
STA PossiblyMirroredRelativeX + 1
@endMirroring:
; We take the original absolute centered X position
; and add the relative X position to it.
; This way we get the actual value
; that needs to be used for the rendering.
CLC
LDA _UpdateSpritesX
ADC PossiblyMirroredRelativeX
STA AbsoluteX
LDA _UpdateSpritesX + 1
ADC PossiblyMirroredRelativeX + 1
STA AbsoluteX + 1
; The inner loop: Every tile in this column is rendered from bottom to top.
@loopY:
; The low byte of the Y position is written to the sprites array.
LDA AbsoluteY
STA _Sprites + 0, X
; The tile is read from the meta sprites array
; and set to the sprites array.
LDA (_ConstPointer), Y
INY
STA _Sprites + 1, X
; The attributes are read from the meta sprites array.
; They are ORed with the mirror attributes
; and then written to the sprites array.
LDA (_ConstPointer), Y
INY
ORA _UpdateSpritesMirrorAttributes
STA _Sprites + 2, X
; The low byte of the X position is written to the sprites array.
LDA AbsoluteX
STA _Sprites + 3, X
; If the high byte of X or Y is not 0,
; this means this specific sprite is outside the screen.
; In this case, the rendering is skipped.
; It doesn't matter that the values in the sprites array are already written.
; As long as _UpdateSpritesSpritesIndex isn't incremented,
; the _ClearSprites function will make sure
; that all unused sprites are put outside the screen in the end.
LDA AbsoluteX + 1
BNE @endRendering
LDA AbsoluteY + 1
BNE @endRendering
; If everything is alright, then _UpdateSpritesSpritesIndex and the X register
; are incremented by 4.
; This value corresponds to the four bytes that we have written to the sprites array.
; The PPU will render the current sprite on the screen.
INX
INX
INX
INX
STX _UpdateSpritesSpritesIndex
@endRendering:
; If the Y counter is 0,
; the inner loop isn't repeated anymore
; and all of the loop preparation is skipped.
DEC YCounter
BEQ @noLoopY
; For the next loop,
; the Y position is decremented by 8,
; i.e. one tile height.
SEC
LDA AbsoluteY
SBC #$08
STA AbsoluteY
LDA AbsoluteY + 1
SBC #$00
STA AbsoluteY + 1
; The inner loop is repeated.
JMP @loopY
@noLoopY:
; If the X counter is 0, the function ends.
; Otherwise, the outer loop is repeated.
DEC XCounter
BEQ @noLoopX
; For the next loop,
; the X position is incremented by 8,
; i.e. one tile width.
CLC
LDA RelativeX
ADC #$08
STA RelativeX
LDA RelativeX + 1
ADC #$00
STA RelativeX + 1
; The outer loop is repeated.
JMP @loopX
@noLoopX:
RTS
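
For checking any optimized version against the current behavior, a plain C model of the routine can serve as a reference. The following is only a sketch that follows the assembly above step by step. The function name `UpdateSpritesModel`, its parameter list, and the use of parameters instead of the global variables are assumptions made for the example. It also assumes a width and height of at least 1.

```c
#include <stdint.h>

/* Writes one meta sprite into a 256-byte sprite buffer and returns the
   next free index. As in the assembly, the four bytes are written first
   and the index only advances when the tile is on screen. */
static uint8_t UpdateSpritesModel(const uint8_t *meta, int16_t x, int16_t y,
                                  uint8_t mirror, uint8_t *sprites,
                                  uint8_t index)
{
    uint8_t width   = *meta++;                /* in tiles */
    uint8_t height  = *meta++;                /* in tiles */
    uint8_t yOffset = *meta++;                /* in pixels */
    int16_t relX  = (int16_t)-(width * 4);    /* center -> left edge */
    int16_t baseY = (int16_t)(y - 8 + yOffset);

    for (uint8_t col = 0; col < width; ++col, relX += 8) {
        /* Mirroring: ~relX - 7 == -relX - 8 */
        int16_t rel  = mirror ? (int16_t)(~relX - 7) : relX;
        int16_t absX = (int16_t)(x + rel);
        int16_t absY = baseY;

        for (uint8_t row = 0; row < height; ++row, absY -= 8) {
            sprites[index + 0] = (uint8_t)absY;               /* Y, low byte */
            sprites[index + 1] = *meta++;                     /* tile */
            sprites[index + 2] = (uint8_t)(*meta++ | mirror); /* attributes */
            sprites[index + 3] = (uint8_t)absX;               /* X, low byte */

            /* Keep the sprite only if both high bytes are zero. */
            if (((uint16_t)absX >> 8) == 0 && ((uint16_t)absY >> 8) == 0) {
                index = (uint8_t)(index + 4);
            }
        }
    }
    return index;
}
```

Running the model and the assembly routine with the same inputs and comparing the sprite buffers and the returned index should show quickly whether a change altered the output.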