Fun with MMC3 Scanline IRQs

lidnariq
Posts: 10677
Joined: Sun Apr 13, 2008 11:12 am
Location: Seattle

Re: Fun with MMC3 Scanline IRQs

Post by lidnariq »

darryl.revok wrote:It's looking pretty tight in NMI, so I'm looking for any options available to save cycles.
Since you have an MMC3, an obvious solution is to use an IRQ to turn off rendering early. An extra 8 scanlines there will get you a little more than an additional 1000 cycles in your vblank handler, and won't be visible on the vast majority of NTSC televisions.

Note that if you start relying on end-of-frame timing, you might want to be careful to not break Dendy famiclone timing.
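
A minimal sketch of the idea above, assuming an MMC3 with the background in pattern table $0000 and sprites in $1000 (the usual arrangement for its scanline counter). The labels `ArmBlankingIRQ` and `IRQHandler` and the latch value are illustrative assumptions, not code from this thread; the exact scanline the IRQ lands on should be checked in a debugger and the latch tuned accordingly.

```asm
; Hypothetical labels; the register addresses are the standard MMC3 ones.
ArmBlankingIRQ:        ; call once per frame during vblank
  LDA #231             ; assumed latch value: tune so the IRQ lands ~8 lines early
  STA $C000            ; IRQ latch (reload value)
  STA $C001            ; any write: force a counter reload on the next clock
  STA $E000            ; any write: disable and acknowledge a pending IRQ
  STA $E001            ; any write: enable the IRQ
  CLI                  ; make sure the CPU is accepting IRQs
  RTS

IRQHandler:
  PHA
  STA $E000            ; acknowledge (the value written is ignored)
  LDA #%00000000
  STA $2001            ; rendering off: the rest of the frame is blank
  PLA
  RTI
```

With rendering off, VRAM writes are safe from that point until rendering is turned back on, so the handler could either start the uploads itself or just leave them to the NMI that follows.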
darryl.revok
Posts: 520
Joined: Sat Jul 25, 2015 1:22 pm

Re: Fun with MMC3 Scanline IRQs

Post by darryl.revok »

I managed to squeeze out a few more cycles. I'm down to 2265 max currently.

One thing I know absolutely nothing about is audio. How many cycles will I need in NMI for a music and sound player?

I'd still like to squeeze in palette effects if possible so I'm looking for more optimizations but I think I may be out of easy options.

I started writing a routine to create tile update code in RAM, but it would be so cumbersome with varying rates of tile fetching, that I'm not going to bother.
tokumaru wrote:If you have any loops at all, you should really look into unrolling them.
I did this previously with my tile updating, and it especially helped a lot when everything was sequential. I took this approach, and in a relatively extreme example, I took this:

Code:

;----------------Check for new background tiles-----------------------------------------------------
 
  LDX ScanlineSplitNum

DrawColumnsAttributesLoop:  
  
  LDA newTileFlag, x
  BEQ NoNewTiles
  BPL DontDrawNewAttributes

  LDA AttributeDrawPointersLo, x
  STA ColumnDrawJumpPointerLo
  LDA AttributeDrawPointersHi, x
  STA ColumnDrawJumpPointerHi

  LDY nametableColumnAttributesLo, x
  
  JMP (ColumnDrawJumpPointerLo)  
;  JSR DrawNewAttributes

AttributeDrawReturn:

  LDA newTileFlag, x
  AND #%00000001
  BEQ NoNewTiles
  
;----------------------------------------------------------------------------------------------------------------
DontDrawNewAttributes:
  
  LDA ColumnDrawPointersLo, x
  STA ColumnDrawJumpPointerLo
  LDA ColumnDrawPointersHi, x
  STA ColumnDrawJumpPointerHi
  
  LDY nametableColumnLo, x
  
  JMP (ColumnDrawJumpPointerLo)
;  JSR DrawNewColumns

ColumnDrawReturn:
  
;------------------------------------------------------------------------------------
  
NoNewTiles:

  LDA #$00
  STA newTileFlag, x
  
  DEX
  BPL DrawColumnsAttributesLoop
And wrote it out 8 times, replacing ",x" with +1/+2/+3/etc.
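
As a rough illustration of that unrolling, one of the eight copies might look like the sketch below. The `...1` label suffixes and the `DrawAttributesSlot1` / `DrawColumnsSlot1` routine names are hypothetical; each routine is assumed to end by jumping back to its slot's return label.

```asm
; Slot 1 of 8: ",x" becomes "+1", so X, the pointer copies and DEX/BPL go away.
  LDA newTileFlag+1
  BEQ NoNewTiles1               ; zero: nothing to draw for this slot
  BPL DontDrawNewAttributes1    ; bit 7 clear: skip the attribute update

  LDY nametableColumnAttributesLo+1
  JMP DrawAttributesSlot1       ; hypothetical; ends with JMP AttributeDrawReturn1
AttributeDrawReturn1:

  LDA newTileFlag+1
  AND #%00000001
  BEQ NoNewTiles1               ; bit 0 clear: no new column tiles

DontDrawNewAttributes1:
  LDY nametableColumnLo+1
  JMP DrawColumnsSlot1          ; hypothetical; ends with JMP ColumnDrawReturn1
ColumnDrawReturn1:

NoNewTiles1:
  LDA #$00
  STA newTileFlag+1             ; clear the flag, then fall through to slot 0
```

The direct `JMP` replaces the four pointer loads/stores and the `JMP (indirect)` of the looped version, which is where most of the saving comes from.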

It saved enough cycles to make things work but I don't know if it's enough to add a music player and palette swaps.
I just noticed you didn't mention $2005... was that just an omission or did you not know that these registers share the even/odd write flag?
Actually, no I didn't realize that. I just kind of assumed that $2006 was the one affecting that since it's a 16-bit address. So I'm guessing if the hi/low latch is off, then writes to $2005 will be reversed?
If you're writing attributes for columns though, increments of 32 bytes can still be useful
That's a good tip. I didn't think of that. It won't work with my scroll splits but it would save a little in my basic scrolling NMI.
rainwarrior wrote:If you're not desperate to save 4 cycles, reading it once at the start of your NMI might be worthwhile just in case it corrects some edge case you missed.
Seemed to help for me. Hopefully eventually I figure out why it was happening, but at least that glitch won't hold me up for now.
thefox wrote:If you do read it at start of NMI, be careful to not have things like bulk PPU uploads running in the main thread that could be affected by it.
Hmmm.. Right now the only thing outside of NMI that should be doing so is drawing the initial nametable for a level, but I haven't seen any problems with it. NMI should be disabled during that procedure. I'll be sure to keep that in mind though.
lidnariq wrote:An extra 8 scanlines there will get you a little more than an additional 1000 cycles in your vblank handler
That is a really cool idea! Has anyone used that technique to cram in massive graphic updates? That would be really good for CHR-RAM, I'd imagine.
tokumaru
Posts: 12106
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Fun with MMC3 Scanline IRQs

Post by tokumaru »

darryl.revok wrote:How many cycles will I need in NMI for a music and sound player?
During the important part of the NMI (i.e. vblank), none. Normally you call the audio update routine after you're done with all the PPU stuff.
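
A sketch of that ordering, with hypothetical routine names (`UploadVramBuffer`, `SetScroll`, `UpdateAudio`) and the sprite buffer assumed to live at $0200:

```asm
NMI:
  PHA                  ; save A, X, Y
  TXA
  PHA
  TYA
  PHA

  LDA #$00
  STA $2003
  LDA #$02             ; assumes the sprite buffer is at $0200
  STA $4014            ; OAM DMA
  JSR UploadVramBuffer ; hypothetical: all $2006/$2007 traffic happens here
  JSR SetScroll        ; hypothetical: final $2000/$2005 writes

  JSR UpdateAudio      ; hypothetical: no PPU access, so it can overrun vblank safely

  PLA                  ; restore Y, X, A
  TAY
  PLA
  TAX
  PLA
  RTI
```

Only the code before `UpdateAudio` has to fit inside vblank; the audio routine touches APU registers only, so it does not count against that budget.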
Actually, no I didn't realize that. I just kind of assumed that $2006 was the one affecting that since it's a 16-bit address.
The latch selects between high/low address bytes for $2006, and X/Y scroll for $2005.
So I'm guessing if the hi/low latch is off, then writes to $2005 will be reversed?
Yes. An odd number of writes to $2006 will cause the next $2005 write to affect the vertical scroll, instead of the horizontal. In fact, this is an important part of the $2006/5/5/6 trick, because the two $2005 writes must be Y first, then X.
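
For reference, the write sequence is usually written along the lines of the sketch below. It is not code from this thread: `scrollX`, `scrollY`, `scrollNT` and `scrollLow` are hypothetical zero-page variables, and the write toggle is assumed to be clear when the timed part starts (for example after a $2002 read).

```asm
; Precompute outside the timed part:
; scrollLow = ((scrollY & $F8) << 2) | (scrollX >> 3)
  LDA scrollY
  AND #$F8
  ASL A
  ASL A
  STA scrollLow
  LDA scrollX
  LSR A
  LSR A
  LSR A
  ORA scrollLow
  STA scrollLow

; Timed part:
  LDA scrollNT         ; nametable number, 0-3
  ASL A
  ASL A
  STA $2006            ; 1st write: nametable select (sets the toggle)
  LDA scrollY
  STA $2005            ; toggle is set, so this write goes to Y scroll
  LDA scrollX
  STA $2005            ; toggle is clear again, so this write goes to X scroll
  LDA scrollLow
  STA $2006            ; 2nd $2006 write: the new scroll position takes effect
```

The two $2005 writes land in Y-then-X order only because the first $2006 write flipped the shared toggle, which is the behavior described above.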
Has anyone used that technique to cram in massive graphic updates? That would be really good for CHR-RAM, I'd imagine.
Yes, some games do that. Off the top of my head I can think of Big Nose Freaks Out, and maybe Solstice. They use other methods for timing though, since they don't have IRQs. Jurassic Park uses IRQs to blank 8 scanlines at the top of the screen and 8 scanlines at the bottom, not by disabling rendering, but by bankswitching black patterns. This is for hiding scrolling glitches, not for the extra blanking time. Battletoads does a lot of CHR-RAM updates, but it blanks the top of the screen, not the bottom.

EDIT: Just thought of another one that does exactly what lidnariq described: Somari (or any of the variations of that game). The vblank handler in that game must be very poorly optimized if they need that extra time for VRAM transfers. It uses CHR-ROM, so there's no need for massive VRAM updates.
darryl.revok
Posts: 520
Joined: Sat Jul 25, 2015 1:22 pm

Re: Fun with MMC3 Scanline IRQs

Post by darryl.revok »

So I managed to find plenty more cycles in my NMI. It's currently running at a max of 1800 cycles, so that should be plenty of room for anything I might want to add to this scene.
I took the loop unrolling posted earlier one step further. I realized that since I have an unrolled loop, not only do I not have to pull pointers from an array to build a JMP (indirect), but I can avoid JMPing altogether by moving the PPU update routines into the routines which call them.
tokumaru wrote:Jurassic Park uses IRQs to blank 8 scanlines at the top of the screen and 8 scanlines at the bottom, not by disabling rendering, but by bankswitching black patterns.
This seems like a better approach than Alfred Chicken using up a large portion of its sprites to cover the glitches in vertical mirroring mode.
so there's no need for massive VRAM updates.
Is it possible that Sonic runs faster than 7 pixels per frame in this version, requiring double the tile updates? If so, scrolling in both directions could be 124 tiles. Just a guess.

If anybody should know what it takes to make a Sonic port on NES, it would be you. :) Is your NES Sonic available to play? To be honest I haven't played too many homebrews except for Battle Kid. I didn't know about most of them that are out there until I came here. I did play the demo for Lizard a bit. It was cool, I liked it. It had a very atmospheric, Metroid-like, exploration vibe. The music was good too.
darryl.revok
Posts: 520
Joined: Sat Jul 25, 2015 1:22 pm

Re: Fun with MMC3 Scanline IRQs

Post by darryl.revok »

tokumaru wrote:Back in the day, developers didn't really know all these crazy tricks we know today. The $2006/5/5/6 trick for example, certainly wasn't documented anywhere
I was thinking about this, and it's strange that Nintendo didn't even know this. Would that classify this function as "heavy wizardry"? https://en.wikipedia.org/wiki/Magic_(pr ... )#Variants
dougeff
Posts: 2875
Joined: Fri May 08, 2015 7:17 pm
Location: DIGDUG

Re: Fun with MMC3 Scanline IRQs

Post by dougeff »

I could be wrong...but I think 'wizardry' or 'magic' is like when you download some specialized library to perform some task that you don't understand how to do, and maybe almost nobody knows how to do it...but include X library and magically turn Y into a Z file.

The $2006/5/5/6 trick would apply, if one guy at Nintendo figured it out and sent it to the game makers as a way to do certain things...and the programmers had no idea how or why it worked (because the hardware wasn't fully documented at the time).
nesdoug.com -- blog/tutorial on programming for the NES
tokumaru
Posts: 12106
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Fun with MMC3 Scanline IRQs

Post by tokumaru »

darryl.revok wrote:So I managed to find plenty more cycles in my NMI. It's currently running at a max of 1800 cycles, so that should be plenty of room for anything I might want to add to this scene.
Cool.
This seems like a better approach than Alfred Chicken using up a large portion of its sprites to cover the glitches in vertical mirroring mode.
Well, I think both are valid solutions. Personally, I would never want to reduce the sprites-per-scanline limit to 7, but the approach used by Alfred Chicken is simpler to implement when you don't have a scanline counter available. Interestingly enough, Felix the Cat also uses this technique, even though it does have a scanline counter. In my own engine, which uses a simple discrete logic mapper, I chose to hide 16 scanlines at the top of the screen, timed from the end of vblank.
Is it possible that Sonic runs faster than 7 pixels per frame in this version, requiring double the tile updates? If so, scrolling in both directions could be 124 tiles. Just a guess.
That shouldn't be a problem... My own engine always updates 16 pixels worth of new background data anyway, and 128 bytes (17 * 4 + 15 * 4) are necessary to update both a row and a column of blocks, not counting attributes, and it all fits under regular vblank time just fine. Granted, when this happens, nothing else is updated (besides sprites), but it's impossible for the player to move in a perfect diagonal at 16 pixels per frame for several frames in a row, so there will be time to update other things, even at ridiculous speeds.
If anybody should know what it takes to make a Sonic port on NES, it would be you.
I know that Somari is quite crappy, and I don't even have to be a good programmer to know that! :lol: It has its merits though... It shows that the NES isn't so far behind the Genesis like most people think, making it clear that the basic idea is possible. It just needed a lot more polish in order to be a good game... the physics is the worst part, but the music and the graphics could definitely be improved a lot.
Is your NES Sonic available to play?
Nah, I coded a solid scrolling engine, but never got to implement any physics/gameplay. Hopefully that will change soon.
I did play the demo for Lizard a bit. It was cool, I liked it. It had a very atmospheric, Metroid-like, exploration vibe. The music was good too.
Indeed. Recently I've grown fonder of the Metroidvania style of gameplay.
tepples
Posts: 22345
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Fun with MMC3 Scanline IRQs

Post by tepples »

tokumaru wrote:it's impossible for the player to move in a perfect diagonal at 16 pixels per frame for several frames in a row
Let me guess: no Chemical Plant Zone Act 2.
tokumaru
Posts: 12106
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Fun with MMC3 Scanline IRQs

Post by tokumaru »

AFAIK, the camera in Sonic 1 and 2 is limited to 16 pixels of movement in each axis (same as my engine), which would explain why it can't follow Sonic in CPZ. It seems Sonic 3 (&K) can handle more, but I'm not sure that happens much.

Anyway, I never planned to copy the physics of any existing games. I'll create everything from scratch and tweak until it feels good to me, so I'm not sure it will be possible to move faster than the camera.
darryl.revok
Posts: 520
Joined: Sat Jul 25, 2015 1:22 pm

Re: Fun with MMC3 Scanline IRQs

Post by darryl.revok »

thefox wrote:Blargg's nmi_sync is an extreme (and impressive) example of this.
In working on my compo entry, I didn't want to rule out the possibility of using raster effects, so I started looking into options available without a scanline IRQ. Well, timing code isn't too tough if your NMI is a fixed length, but that's not always practical. I looked into nmi_sync a little, then I had an idea. Why not put a wait at the beginning of the logic for the sprite 0 flag to clear, which would start your logic in the pre-render scanline each time? Then pick a part of your logic that's easy to make a static length, such as scrolling, and run that at the beginning for a status bar. Then, you can have the sprite 0 hit available later in the frame. You have to be sure that it always hits though, or your frame will never run.

I've never heard about anybody doing this but I figure it's probably not that uncommon. Has anyone else thought to time off the sprite zero clear?
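
A sketch of the polling this relies on; the labels are hypothetical. The sprite 0 hit flag is bit 6 of $2002, which `BIT` copies into the V flag, so no mask is needed.

```asm
WaitSprite0Clear:
  BIT $2002             ; bit 6 (sprite 0 hit) -> V flag
  BVS WaitSprite0Clear  ; still set: the pre-render line has not started yet

  ; ...fixed-length work (e.g. the status bar scroll setup) goes here...

WaitSprite0Hit:
  BIT $2002
  BVC WaitSprite0Hit    ; spins forever if sprite 0 never hits this frame
```

Note that every $2002 read also resets the $2005/$2006 write toggle and clears the vblank flag, so these loops should not overlap with code that depends on either.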
tokumaru
Posts: 12106
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Fun with MMC3 Scanline IRQs

Post by tokumaru »

darryl.revok wrote:Has anyone else thought to time off the sprite zero clear?
Until recently I was waiting for the sprite overflow flag to be cleared in order to detect the end of vblank, but now I'm taking a break from raster effects.
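
The overflow variant looks much the same, sketched here with a hypothetical label. The sprite overflow flag is bit 5 of $2002, which `BIT` does not map to a CPU flag, so it has to be masked explicitly; the flag is also assumed to have been set earlier in the frame.

```asm
WaitOverflowClear:
  LDA $2002
  AND #%00100000        ; isolate bit 5 (sprite overflow)
  BNE WaitOverflowClear ; still set: vblank has not ended yet
```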
darryl.revok
Posts: 520
Joined: Sat Jul 25, 2015 1:22 pm

Re: Fun with MMC3 Scanline IRQs

Post by darryl.revok »

tokumaru wrote:...but now I'm taking a break from raster effects.
They can be quite the rabbit hole, can't they? I think I spent about two months coding on my game before starting on raster effects, and most of my time coding since then has been on them. I've got more game in the new entry I started on last weekend than I do in my main game.

In any case I can't deny that I've learned from any coding I've done so a challenge can be nice, but sometimes quite misdirecting. I've disabled any scanline tricks in my entry currently, because any changes to the code mean adjusting wait loops. No need to do that again until the very end.