DMC interrupts - help please
Posted: Fri Jan 12, 2007 10:24 pm
by tokumaru
Hello all,
My main game project now uses a UOROM board. The thing is that I need quite a bit more time in vblank than is usually available.
I thought about enabling rendering late and having a blank bar at the top, but that'd be too complicated, since the timing would have to be very constant. Not only would this be hard to do, but I'd also have to waste time whenever I finished early (not much to copy to VRAM), just to enable rendering at the right time.
Then I remembered we had already discussed this (in a post I started!), and blargg suggested that I use DMC interrupts combined with sprite 0 detection to disable rendering early instead. And then he offered me some sample code. I know it's been a while but, blargg, if it's OK with you, I'll take that sample code now! =)
Everyone else is welcome to help, of course.
I would like to ask you though if it would be possible to achieve this effect without the sprite 0 hit or the sprite overflow flag. I know that DMC interrupts are hard to time right, but do they at least fire at the same time (if always set at the same time)? If so, I could approximate the spot I want with the IRQ, and then wait for the exact spot with some timed code. Would that be OK?
I ask because a sprite hit would be too hard for me to set up, as this game is an 8-way scroller, and to guarantee a hit I'd have to place a garbage tile at the bottom for the sprite to hit. Just too damn messy in my opinion.
Also, could I try something crazy, like setting the timer once to disable rendering early and then again to enable it later? This would be a good way to spread the blank area so that it does not look too bad.
Any suggestions? Thanks for the help!
Posted: Sat Jan 13, 2007 8:15 am
by Bregalad
Your problem is quite interesting. After all, enabling rendering later to extend VBlank isn't that hard, just a little trickier, I think.
If you upload a lot of stuff to the screen and can be sure your uploads take a constant amount of time, then go for it. Just keep some buffers of what to write and be sure to empty them every frame (so just write the stuff to the PPU regardless of whether it is needed or not). That way the NMI will always trigger at the same place (with 21 PPU cycles of error at most), and the screen will always be enabled at the same place later, after your sprite DMA, palette, name table, attribute and VRAM updates. You'll just have to be more careful than usual when writing scroll values.
The main problem with this technique is that if the game logic takes longer than an actual frame, the buffers could be read in the NMI while the main code is still writing to them, resulting in possibly very weird, unpredictable PPU updates. That is bad, so you'll need two of each buffer and a flag that says which one is valid, which you toggle every time your buffers have been completely filled.
If you want the blank area at the bottom, the NMI code itself would definitely be more comfortable to write (more flexible buffers, code taking a variable length of time, normal scroll updates via $2005 only), but the early blanking is harder to trigger.
If you go for DMC interrupts, the first thing you have to do in your NMI routine is start a sample with a length that is a bit less than one frame, trying different values until you get the one you want. After that, your IRQ routine will serve as an NMI routine, but the NMI should still be enabled so that the timing stays right and to trigger the next IRQ. So the IRQ should upload the additional PPU buffers that cannot be uploaded during regular VBlank, and then exit just before the actual NMI triggers.
The main problem with this technique is that it will be hard to port from one video standard to the other (or maybe you can just have the PAL version use normal NMI mode, since its VBlank is about 3.5 times longer). Another problem is that the DMC IRQ can trigger in the middle of a scanline, or may simply not have enough precision to land where you want (even though you do get quite a few possibilities with 16 speeds and lengths in 16-byte steps).
So what blargg is saying is that you would prefer a regular sprite zero hit (which happens at a well-known place on the screen), and to avoid having your CPU waste all its time waiting for it, you trigger an IRQ approximately before the sprite zero hit and wait for the hit inside the IRQ routine. This will be easy to port from one video standard to the other, and you can get great precision. It still requires you to set up a normal sprite zero hit; the only thing the IRQ does is spare the CPU a stupidly long wait by replacing it with a short one.
Posted: Sat Jan 13, 2007 8:17 am
by Dwedit
I've seen how Battletoads works: it does long VRAM transfers every frame, whether it needs them or not. That way, not only can it transfer lots of data to VRAM when it needs to, it also runs in constant time.
Posted: Sat Jan 13, 2007 8:57 am
by tokumaru
Bregalad wrote: Just keep some buffers of what to write and be sure to empty them every frame (so just write the stuff to the PPU regardless of whether it is needed or not).
I thought I could do it like that, but some writes will cause glitches if done when not needed. For example, the game scrolls vertically and horizontally at the same time. Say it scrolled down a bit, so a row of metatiles was written to the PPU. Say that after that, the player only moves horizontally, requiring that only columns of metatiles are written. After a while, the name tables would be completely filled with new data (from having scrolled horizontally so much), and if I kept writing that old row of metatiles I'd be putting old parts of the map on the screen...
This is the hard part of turning rendering on late: Every frame there are things I may or may not do, so getting the timing to be constant is a pretty tough (and messy!) task. Also, when a task is not to be performed, I'd just have to kill as much time as the task would take, throwing processing time down the toilet. Since my game engine is quite complex, I don't think I should waste cycles like that.
Bregalad wrote: The main problem with this technique is that if the game logic takes longer than an actual frame, the buffers could be read in the NMI while the main code is still writing to them, resulting in possibly very weird, unpredictable PPU updates. That is bad, so you'll need two of each buffer and a flag that says which one is valid, which you toggle every time your buffers have been completely filled.
I don't buffer twice. Just a flag does the job. I use a "frame is ready" flag, and the vblank code only updates PPU data if this flag is set. And of course, it is only set after all buffers have valid data. I couldn't possibly replicate my buffers, I'm almost running out of RAM as it is! =)
Bregalad wrote: If you go for DMC interrupts, the first thing you have to do in your NMI routine is start a sample with a length that is a bit less than one frame, trying different values until you get the one you want. After that, your IRQ routine will serve as an NMI routine
Yeah, that's what I want to do. Thing is that I'm a really crappy sound programmer, so yesterday I played like hell with registers $4010-$4013 but couldn't get a single IRQ to fire. If anyone can tell me how to do that correctly I'd really appreciate it.
Bregalad wrote: but the NMI should still be enabled so that the timing stays right and to trigger the next IRQ. So the IRQ should upload the additional PPU buffers that cannot be uploaded during regular VBlank, and then exit just before the actual NMI triggers.
That's pretty much what I intended to do.
Bregalad wrote: The main problem with this technique is that it will be hard to port from one video standard to the other (or maybe you can just have the PAL version use normal NMI mode, since its VBlank is about 3.5 times longer).
Sure, PAL vblank is more than enough to do what I need!
Bregalad wrote: So what blargg is saying is that you would prefer a regular sprite zero hit (which happens at a well-known place on the screen), and to avoid having your CPU waste all its time waiting for it, you trigger an IRQ approximately before the sprite zero hit and wait for the hit inside the IRQ routine.
Yeah, normally this would be fine, but I don't want my game to have a garbage tile at the bottom of the screen (like Guardian Legend or one of the Big Nose games) to make sure a hit will take place.
That's why I was wondering if the DMC IRQ would always fire at the same time if I set it up at the start of the NMI routine, for example. If so, I wouldn't mind using some small timed code to wait for the proper time to disable rendering.
I wouldn't mind this not being compatible with PAL, since as you said, a PAL version could just use the regular NMI. Maybe the software itself could check for a PAL system and not use DMC interrupts at all.
If anyone can tell me how to set up DMC interrupts, I'd be really grateful!
Posted: Sat Jan 13, 2007 9:03 am
by blargg
The reason the DMC interrupt isn't useful for precise timing is that you can't reset the timer, only set a new period that will be used once the current cycle finishes. The minimum period is 432 CPU cycles between potential interrupts, so the interrupt might occur a few scanlines later sometimes. There might be some way to use a combination of rates and continual interrupts to get it locked on precisely (sort of how my saw wave technique works).
On top of all this, the DMC rates don't give you a lot of options as to where the interrupt would occur mid-screen, given that you start it at the top of the screen. How many scanlines before VBL do you want to disable rendering?
EDIT: I'll write and post some hardware-tested DMC interrupt code in a few hours.
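To put the 432-cycle figure above in screen terms, here is a small arithmetic sketch. It assumes NTSC timing (341 PPU dots per scanline, 3 PPU dots per CPU cycle) and the commonly documented fastest DMC rate of 54 CPU cycles per output bit; the constant and function names are made up for this illustration.

```python
# Rough NTSC figures. The rate-$F period of 54 CPU cycles per output bit is
# an assumption taken from the commonly documented DMC rate table.
CPU_CYCLES_PER_BIT_AT_RATE_F = 54
BITS_PER_SAMPLE_BYTE = 8
PPU_DOTS_PER_SCANLINE = 341
PPU_DOTS_PER_CPU_CYCLE = 3


def byte_period_cycles(cycles_per_bit):
    """CPU cycles the DMC needs to play one sample byte (8 output bits)."""
    return cycles_per_bit * BITS_PER_SAMPLE_BYTE


def cycles_to_scanlines(cycles):
    """Convert CPU cycles to NTSC scanlines."""
    return cycles * PPU_DOTS_PER_CPU_CYCLE / PPU_DOTS_PER_SCANLINE


min_period = byte_period_cycles(CPU_CYCLES_PER_BIT_AT_RATE_F)
print(min_period)                                 # 432
print(round(cycles_to_scanlines(min_period), 2))  # 3.8
```

So even at the fastest rate, one byte period spans almost four scanlines, which fits the "a few scanlines later sometimes" remark.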
Posted: Sat Jan 13, 2007 9:21 am
by tokumaru
blargg wrote: The reason the DMC interrupt isn't useful for precise timing is that you can't reset the timer, only set a new period that will be used once the current cycle finishes. The minimum period is 432 CPU cycles between potential interrupts, so the interrupt might occur a few scanlines later sometimes. There might be some way to use a combination of rates and continual interrupts to get it locked on precisely (sort of how my saw wave technique works).
Even if I use more sample bytes and a higher rate? I will read what you wrote above a couple more times to make sure I understand what you're saying there.
blargg wrote: On top of all this, the DMC rates don't give you a lot of options as to where the interrupt would occur mid-screen, given that you start it at the top of the screen.
But doesn't it take the same amount of time to play the same amount of samples?
blargg wrote: How many scanlines before VBL do you want to disable rendering?
16 or 24 scanlines. I am thinking about spreading those a bit at the top and a bit at the bottom, so it doesn't look weird. If it is possible to set up DMC interrupts like that, I'd like to try this:
1. Inside the NMI routine, prepare an interrupt for 33 (20 + 1 + 12) scanlines later, then do some PPU transfers (that don't take longer than 33 scanlines);
2. When the IRQ fires, prepare the next IRQ for 12 scanlines before vblank, then set the scroll correctly and enable rendering;
3. When the IRQ fires again (12 scanlines before the NMI), disable rendering and copy more data to the PPU, before the NMI fires.
Well, something along those lines. I wouldn't mind using some timed code after the IRQs fire to make sure my PPU updates are scanline-aligned. I'm just having a hard time figuring out how to set up a DMC IRQ. I wrote some code yesterday where, at the start of the NMI routine, I tried many, many combinations of data in $4010-$4013 but couldn't fire a single IRQ. Of course I was a bit frustrated.
blargg wrote: EDIT: I'll write and post some hardware-tested DMC interrupt code in a few hours.
Thanks very much for your time! =D
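For a feel of how much CPU time the three-step plan above would buy, here is a small budget calculation. It is only a sketch under NTSC assumptions (341 PPU dots per scanline, 3 dots per CPU cycle); the function and variable names are invented for the example, and it ignores interrupt latency and any timed-wait code.

```python
from fractions import Fraction

# NTSC assumptions: 341 PPU dots per scanline, 3 PPU dots per CPU cycle.
CPU_CYCLES_PER_SCANLINE = Fraction(341, 3)


def scanlines_to_cpu_cycles(scanlines):
    """Whole CPU cycles that fit in the given number of scanlines."""
    return int(scanlines * CPU_CYCLES_PER_SCANLINE)


# Step 1 of the plan: 20 vblank lines + 1 pre-render line + 12 blanked lines.
top_blank = scanlines_to_cpu_cycles(20 + 1 + 12)
# Step 3 of the plan: 12 blanked lines before the NMI.
bottom_blank = scanlines_to_cpu_cycles(12)
# For comparison: the 20 scanlines of ordinary vblank.
plain_vblank = scanlines_to_cpu_cycles(20)

print(top_blank)     # 3751
print(bottom_blank)  # 1364
print(plain_vblank)  # 2273
```

Under these assumptions the plan roughly doubles the cycles available for PPU transfers compared with plain vblank.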
Posted: Sat Jan 13, 2007 9:52 am
by Bregalad
I have a great idea that would work with approximate IRQs and get a sprite zero hit at the bottom of the playfield. To do it, I think you can make sprite zero a tile with a horizontal black line and background priority. Just make sure no BG tile has more than 4 consecutive transparent pixels horizontally, and you're done.
Then, on NMI, start by firing a DPCM sample timed to end approximately above the place where the sprite zero hit is supposed to be. Do the second part of your variable PPU transfers here, and update the scrolling like normal. Enable the PPU (sprites), but keep the BG disabled. Wait for the sprite zero hit flag to be cleared, then wait some constant hardcoded time, and then turn the background on. If this causes sprite glitches, I think you'll have to dedicate sprites 1-9 to a fixed use (just dummy sprites) at the top of the screen to hide all the other sprites. I don't think this flag polling works if the PPU is entirely disabled.
Then exit your NMI and continue like normal. When your IRQ triggers, wait for the sprite zero hit to occur (I'd add some watchdog in case it doesn't occur within a few scanlines), then turn off the screen, do the first part of your PPU transfers for the next frame, and exit the IRQ, effectively continuing the frame. It normally shouldn't be long until the next NMI fires, but even if the frame's processing isn't completed, nothing will get screwed up as long as the IRQ does not overlap the NMI.
Posted: Sat Jan 13, 2007 10:06 am
by blargg
Here's the code, for ca65 (my NES devcart setup wasn't working, so it's only tested in a semi-accurate emulator). As Bregalad described, the NES book reader from a while back used this technique with a sprite #0 hit to make the timing precise.
mid_frame_dmc_irq.zip
I'm wondering why you need an interrupt in the first place. Does your main code sometimes take longer than a frame, preventing you from simply polling sprite #0 hit when you're done? If you are going to be doing frame processing from the IRQ, there's no need to break it between that and NMI; you'd use the NMI simply to trigger the DMC interrupt.
You can try different sample lengths and rates in the sample code. Using a sample of one byte doesn't work at all since the IRQ will occur immediately, as the DMC will load that first byte the moment you enable it. The next shortest is 17 bytes, and the next after that is 33 bytes. 17 bytes allows rates $6 to $F to give useful positions on screen. 33 bytes only allows the top few to work. You could have multiple DMC IRQs per frame, but the variability of when they occur will make that less useful.
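The rate and length figures above can be roughly reproduced with a simplified timing model. This is a sketch under stated assumptions, not a description of the posted sample code: it assumes the commonly documented NTSC DMC rate table, that the first byte is fetched as soon as the sample is enabled, that every later fetch happens once per 8-bit output cycle, and that the IRQ is raised when the last byte is fetched. It also ignores the trick of changing the rate between interrupts. All names are made up for the example.

```python
# NTSC DMC periods in CPU cycles per output bit, indexed by rate $0..$F
# (assumed from the commonly documented rate table).
NTSC_DMC_PERIODS = [428, 380, 340, 320, 286, 254, 226, 214,
                    190, 160, 142, 128, 106, 84, 72, 54]
CPU_CYCLES_PER_SCANLINE = 341 / 3
SCANLINES_PER_FRAME = 262


def irq_window_scanlines(length_bytes, rate):
    """(earliest, latest) IRQ time in scanlines after enabling the sample.

    The one-byte-period spread comes from not knowing where the free-running
    output cycle is at the moment the sample is started.
    """
    byte_period = NTSC_DMC_PERIODS[rate] * 8
    earliest = max(length_bytes - 2, 0) * byte_period
    latest = (length_bytes - 1) * byte_period
    return (earliest / CPU_CYCLES_PER_SCANLINE,
            latest / CPU_CYCLES_PER_SCANLINE)


def usable_rates(length_bytes):
    """Rates whose latest possible IRQ still lands within one frame."""
    return [rate for rate in range(16)
            if irq_window_scanlines(length_bytes, rate)[1] < SCANLINES_PER_FRAME]


for length in (17, 33):
    print(length, ["$%X" % rate for rate in usable_rates(length)])
```

With this model a 17-byte sample gives rates $6 to $F and a 33-byte sample only $C to $F, which matches the description above; the window width also shows why several scanlines of jitter remain.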
Posted: Sat Jan 13, 2007 10:08 am
by dvdmth
A DMC IRQ, as I recall, is rather imprecise and carries a margin of error too great for PPU-related timing. You can use the IRQ, but unless you follow it up with a sprite 0 hit, the bottom of the screen will show some flickering due to the large margin of error.
Is there any reason why you cannot port your code to MMC3?
Posted: Sat Jan 13, 2007 10:27 am
by tokumaru
blargg wrote: Here's the code, for ca65
Thank you very much! I'll play with this NOW! =D
blargg wrote: I'm wondering why you need an interrupt in the first place. Does your main code sometimes take longer than a frame, preventing you from simply polling sprite #0 hit when you're done?
Of course this shouldn't happen all the time, or the game would be damn slow, but I expect it to take longer than a frame sometimes. It's a pretty complex platform engine with fast scrolling, so there's lots of level map decoding (that is quite fast actually) and there may be a lot of objects on screen at times (and this is the real time killer!).
And I have to see how I'll go about polling the sprite 0 hit, as I can't make sure a hit will happen unless I place a garbage (solid) tile at the spot of the hit. Since this game scrolls in all directions, it's very possible that the sprite doesn't collide with anything solid. And I need empty (color 0) tiles on the background.
blargg wrote: If you are going to be doing frame processing from the IRQ, there's no need to break it between that and NMI; you'd use the NMI simply to trigger the DMC interrupt.
Yes, but I would still need constant-timed NMI code if I expected to turn rendering on late at the same spot every time. Let's see how that goes.
blargg wrote: You can try different sample lengths and rates in the sample code. Using a sample of one byte doesn't work at all since the IRQ will occur immediately, as the DMC will load that first byte the moment you enable it. The next shortest is 17 bytes, and the next after that is 33 bytes. 17 bytes allows rates $6 to $F to give useful positions on screen. 33 bytes only allows the top few to work. You could have multiple DMC IRQs per frame, but the variability of when they occur will make that less useful.
I'm playing with it right now! Thank you very much again! =)
dvdmth wrote: Is there any reason why you cannot port your code to MMC3?
Yup, a couple:
1. I don't have an MMC3 devcart - I plan to make one soon however, but I'm using CHR-RAM for this project, and changing one of my MMC3 carts to use that might be a little too much for me right now, since I can't even get a RAM chip without destroying another cart;
2. I make heavy use of 8x16 sprites that use both sides of the pattern table, and that breaks the scanline counter, meaning that the switch to MMC3 is useless since I wouldn't be able to use the IRQ.
I'll play with the code blargg posted, and then I'm sure I'll make up my mind on what to do next. Thanks for the help so far guys!
Posted: Sat Jan 13, 2007 10:36 am
by Bregalad
Hey, there is something that will let a sprite 0 hit occur with almost no glitches, although it sounds very hard to do. I've heard Wizards & Warriors III does this, but I haven't verified it.
Many games have a single tile at the bottom of the screen that they dynamically change to a solid tile to get a sprite zero hit. This looks ugly and I understand you want to avoid that. However, if and only if you use CHR-RAM, you can point that name table entry to a special tile in the pattern table that you rewrite constantly (since you have a lot of VBlank time, part of it will be used to get more VBlank time, heheh), and put there a slightly modified version of the tile that is supposed to be there. So, when the tile in question changes (while scrolling), you'll have to rewrite the whole tile. However, when only the vertical scroll position changes, you can just OR it with some value that matches where sprite zero is placed, so that at most 8 pixels look weird.
And if you don't want to do that, of course you can use background color 0, but just never use more than 8 consecutive pixels of it, or if you do, make sure it never reaches the bottom of the screen.
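The OR trick described above can be sketched on the data side like this. It assumes the standard 16-byte NES tile layout (8 rows of bitplane 0 followed by 8 rows of bitplane 1, with colour 0 being transparent); the function name and the choice of patching bitplane 0 are illustrative only, and uploading the patched copy to CHR-RAM is left out.

```python
TILE_BYTES = 16  # 8 rows of bitplane 0 followed by 8 rows of bitplane 1


def force_opaque_row(tile, row, mask=0xFF):
    """Return a copy of a tile with the masked pixels of one row made opaque.

    Setting a bit in bitplane 0 guarantees a non-zero colour index for that
    pixel, so a sprite zero pixel overlapping it can register a hit.
    """
    if len(tile) != TILE_BYTES:
        raise ValueError("expected a 16-byte tile")
    if not 0 <= row < 8:
        raise ValueError("row must be 0..7")
    patched = bytearray(tile)
    patched[row] |= mask  # only this one pixel row of the copy changes
    return bytes(patched)


blank_tile = bytes(TILE_BYTES)                # fully transparent tile
hit_tile = force_opaque_row(blank_tile, row=7)
print(hit_tile.hex())
```

Only the one pixel row lined up with sprite zero is altered, which is why at most 8 pixels of the original tile end up looking wrong.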
Posted: Sat Jan 13, 2007 10:53 am
by tokumaru
Bregalad wrote: Many games have a single tile at the bottom of the screen that they dynamically change to a solid tile to get a sprite zero hit. This looks ugly and I understand you want to avoid that.
Yeah...!
Bregalad wrote: However, if and only if you use CHR-RAM, you can point that name table entry to a special tile in the pattern table that you rewrite constantly
Now there is a great idea! I can just make a copy of the tile that would go there, modifying it to make sure a hit happens. The good thing is that creating this copy could be done outside of vblank, so no vblank time would be lost on it. I'll give it a try!
And blargg, thanks for the code, it's working great. I can see now how the time at which the IRQ fires can vary. I wonder why that is, since it would make sense for the same audio to take the same amount of time to play. But I know squat about sound, so maybe precise timing isn't so important.
BTW, I had set up some code with almost the same structure as yours, but since I didn't know how to work with DMC the thing never worked. I guess what I was really missing was this:
I didn't know you had to do that.

Posted: Sat Jan 13, 2007 11:03 am
by Bregalad
tokumaru wrote: I thought I could do it like that, but some writes will cause glitches if done when not needed. For example, the game scrolls vertically and horizontally at the same time. Say it scrolled down a bit, so a row of metatiles was written to the PPU. Say that after that, the player only moves horizontally, requiring that only columns of metatiles are written. After a while, the name tables would be completely filled with new data (from having scrolled horizontally so much), and if I kept writing that old row of metatiles I'd be putting old parts of the map on the screen...
As a last resort, I'm pretty sure this can be worked around. It will only give you the black area at the top of the screen, but if you write a whole row or a whole column of tiles at a time, and have the SAME buffer hold both rows and columns (only updating one at a time), then I think it should be okay.
Posted: Sat Jan 13, 2007 11:28 am
by blargg
tokumaru wrote: I can see now how the time at which the IRQ fires can vary. I wonder why that is, since it would make sense for the same audio to take the same amount of time to play. But I know squat about sound, so maybe precise timing isn't so important.
Think of it like this hypothetical example: you have an NMI routine that decrements a counter on each interrupt, then does something special if the counter has reached zero. If you reset the counter to 5 and enable NMI ($2000=$80) at various random times, the counter will reach zero anywhere from slightly over 4 NMI periods (if you enable just before VBL) to almost 5 NMI periods (if you enable after VBL ends) later, depending on where in the frame you enabled NMI. It's the same with the DMC, where the timer is always running internally. That's why the code sets the rate to maximum ($F) between interrupts, to reduce this variability to a minimum (hmmm, I tried varying this rate and some made really stable timings, but that might just be due to chance... experiment with the timings, even things that seem crazy).
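The counter analogy above can be checked with a tiny model of a free-running timer. This is only an illustration of the phase effect, with an arbitrary period of 100 time units and invented names; it is not a model of the actual DMC hardware.

```python
def delay_until_nth_tick(enable_time, period, n):
    """Time from enable_time until the n-th tick of a free-running timer.

    The timer ticks at every multiple of `period`, regardless of when we
    start counting; only ticks strictly after `enable_time` are counted.
    """
    first_tick = (enable_time // period + 1) * period
    return first_tick + (n - 1) * period - enable_time


PERIOD = 100  # arbitrary time units standing in for one NMI period
delays = [delay_until_nth_tick(t, PERIOD, 5) for t in range(PERIOD)]
print(min(delays), max(delays))  # 401 500
```

Enabling just before a tick gives a delay of barely more than 4 periods, while enabling just after one gives a full 5. The spread is always one timer period, which is why a shorter period between interrupts shrinks the uncertainty.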
Posted: Sat Jan 13, 2007 12:13 pm
by tokumaru
I guess I can understand that. Although we enable the IRQ almost at the same time every frame (when NMI fires), there is also the internal working of the DMC timer, and we never know the current state of that. I think I got it, thanks!
blargg wrote: That's why the code sets the rate to maximum ($F) between interrupts, to reduce this variability to a minimum
I was wondering just that! Now I know what that was for.
blargg wrote: experiment with the timings, even things that seem crazy.
I'll try that too, thanks.