Code not working how I'd expect, possibly going mad

Are you new to 6502, NES, or even programming in general? Post any of your questions here. Remember - the only dumb question is the question that remains unasked.
Bregalad
Posts: 8184
Joined: Fri Nov 12, 2004 2:49 pm
Location: Divonne-les-bains, France

Re: Code not working how I'd expect, possibly going mad

Post by Bregalad »

Pokun wrote: Thu Mar 13, 2025 2:05 pm OP said he is using the "all in NMI" method, which means he is basically only using the NMI service routine (like SMB does).
Backing up the registers is mainly needed when using multiple interrupt service routines that can interrupt each other partway through.

If the NMI routine takes too long and is interrupted by another NMI request before it has finished, it may benefit from backing up registers, but I'm not sure if it can return correctly after that. I think "all in NMI" basically requires that you can guarantee that no interrupt will ever happen before a routine has finished.
TBH, it's very hard to guarantee the NMI thread will never be interrupted by another NMI. Even if you only enable NMIs just before returning, it might still happen there, with unexpected consequences for the stack (it could easily fill up with NMIs interrupting each other). The only way to make sure an NMI won't interrupt another is to be absolutely sure your code never exceeds one frame in duration, and this is hard. It is much safer to push/pull registers and detect if an NMI is already running, just like we already discussed in another thread recently.
Useless, lumbering half-wits don't scare us.
Pokun
Posts: 3482
Joined: Tue May 28, 2013 5:49 am
Location: Hokkaido, Japan

Re: Code not working how I'd expect, possibly going mad

Post by Pokun »

I see, I guess it's not a problem to return from multiple NMIs after all, as the return points should be stored on stack, I was thinking too hard.
tokumaru
Posts: 12673
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Code not working how I'd expect, possibly going mad

Post by tokumaru »

Pokun wrote: Fri Mar 14, 2025 11:36 am I see, I guess it's not a problem to return from multiple NMIs after all, as the return points should be stored on stack, I was thinking too hard.
But if you have too many of them firing over each other, the stack might overflow and the program will crash.
Ben Boldt
Posts: 1512
Joined: Tue Mar 22, 2016 8:27 pm
Location: Minnesota, USA

Re: Code not working how I'd expect, possibly going mad

Post by Ben Boldt »

If you keep track that you are already in an NMI, then immediately exit if you see there is one that isn’t finished, it will create lag in the game, which MAKES time for the NMI to complete. This is desirable to prevent the stack overflowing.

If the lag is only 1 frame, the player won’t likely notice. If the lag is noticeable to the player, it probably did prevent the stack overflowing. So it is generally a good idea to use that technique.

It also greatly limits the scope, which parts of your NMI handler need to be re-entry safe. You are just paying special attention to this flag that keeps track.
tokumaru
Posts: 12673
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Code not working how I'd expect, possibly going mad

Post by tokumaru »

As I see it, the simple/straightforward solution to prevent any problems with overlapping NMIs is to have a flag (or a set of flags if you want to selectively disable individual NMI operations) to control whether the NMI handler is allowed to go forward with its operations or not.
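
A minimal ca65 sketch of the "set of flags" variant, under some assumptions: `NMIFlags`, its bit assignments, and the three task routines are hypothetical names, and how the flags get set and cleared by the rest of the program is left out. It only shows how individual tasks can be skipped while sound keeps running every frame.

```asm
; Sketch only: NMIFlags, its bit layout and the task routines are hypothetical.
.segment "ZEROPAGE"
NMIFlags: .res 1          ; bit 7 = VRAM updates allowed, bit 6 = sprite DMA allowed

.segment "CODE"
NMI:
  pha                     ; save A, X and Y, since this handler may interrupt
  txa                     ; other code (including an unfinished earlier NMI)
  pha
  tya
  pha

  bit NMIFlags            ; N <- bit 7, V <- bit 6 (A is not modified)
  bvc :+                  ; bit 6 clear: skip the sprite DMA this frame
  jsr DoSpriteDMA
:
  bit NMIFlags            ; test again, the routine above changed N and V
  bpl :+                  ; bit 7 clear: skip the VRAM updates this frame
  jsr DoVRAMUpdates
:
  jsr UpdateSound         ; never skipped, so audio keeps going at 60Hz

  pla                     ; restore Y, X and A in reverse order
  tay
  pla
  tax
  pla
  rti

DoSpriteDMA:              ; placeholder
  rts
DoVRAMUpdates:            ; placeholder
  rts
UpdateSound:              ; placeholder
  rts
```

Using bits 7 and 6 is just a convenience here, because a single BIT instruction copies both into the N and V flags without touching A.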

Early Nintendo games that used the "all in NMI" approach seem to just disable NMIs altogether while handling each NMI to completely avoid interruptions, but that not only takes away your capacity to selectively skip tasks (flags will allow you to keep audio and raster effects going at 60Hz, for example, while outright disabling NMIs will cause EVERYTHING to lag), but we now also know that mid-screen $2000 writes (required to enable/disable NMIs at the hardware level) can glitch the scroll for an entire scanline - this is very noticeable when playing SMB on real hardware.

Bregalad mentioned the possibility of NMIs firing right between NMIs being enabled (either via $2000 or flags) at the end of the NMI handler and the actual RTI instruction being executed, which indeed can happen, but I find it VERY unlikely that this will happen enough times in a row to cause a stack overflow. The probability of your game loop taking nearly exactly the same amount of time for several consecutive frames, AND that amount of time being nearly exactly the number of cycles in an NES frame is EXTREMELY low. It's probably more likely for your gameplay session to be ruined by lightning striking your NES console in the middle of your living room than by this! Can an NMI fire precisely during that small window of time every once in a while? Yes. Can it happen 80+ times in a row (enough to overflow the stack with return addresses and status flags)? Hell, no!
Ben Boldt
Posts: 1512
Joined: Tue Mar 22, 2016 8:27 pm
Location: Minnesota, USA

Re: Code not working how I'd expect, possibly going mad

Post by Ben Boldt »

tokumaru wrote: Sat Mar 15, 2025 9:50 am As I see it, the simple/straightforward solution to prevent any problems with overlapping NMIs is to have a flag (or a set of flags if you want to selectively disable individual NMI operations) to control whether the NMI handler is allowed to go forward with its operations or not.
You are basically describing a way to separate tasks and giving them different priorities. Directing lag to areas that are less distracting to the player - which is a very smart thing to do. It is not necessarily the simplest but it will give the best result.

You may also have some tasks that don’t need to be run every single frame. You could interleave these tasks, each taking their turn on separate frames.
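
As a rough ca65 sketch of that interleaving, assuming just two tasks that alternate: `FrameCount`, `TaskEven` and `TaskOdd` are made-up names, and the routine would be called once per frame from wherever the per-frame work lives.

```asm
; Sketch only: FrameCount, TaskEven and TaskOdd are hypothetical names.
.segment "ZEROPAGE"
FrameCount: .res 1        ; incremented once per frame

.segment "CODE"
RunInterleavedTasks:
  inc FrameCount
  lda FrameCount
  lsr a                   ; carry <- bit 0 of the frame counter
  bcs :+
  jmp TaskEven            ; even frames (tail call, returns to our caller)
:
  jmp TaskOdd             ; odd frames

TaskEven:                 ; placeholder for the first low-priority task
  rts
TaskOdd:                  ; placeholder for the second low-priority task
  rts
```

With more than two tasks, the same idea works by masking more bits of the counter (for example `and #%00000011` for four slots).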

If you have parts of your code that don’t need to be in the NMI, you can also put that stuff in your main loop, which is automatically the lowest priority.
Fiskbit
Site Admin
Posts: 1400
Joined: Sat Nov 18, 2017 9:15 pm

Re: Code not working how I'd expect, possibly going mad

Post by Fiskbit »

With an all-in-NMI approach, games will usually turn NMIs off at the start and back on at the end. nescoda86 doesn't do this, and so it is possible in theory for an NMI to interrupt the NMI and wreak havoc. In that case, yes, you would need to push registers, but you also need your NMI to check if the previous NMI is done and exit if not; merely saving registers isn't good enough because there is much more state that could be clobbered by the rest of NMI.
Bregalad wrote: Fri Mar 14, 2025 6:24 am TBH, it's very hard to guarantee the NMI thread will never be interrupted by another NMI. Even if you only enable NMIs just before returning, it might still happen there, with unexpected consequences for the stack (it could easily fill up with NMIs interrupting each other).
I don't believe this is correct. When you turn NMI back on, I expect a 1 instruction delay before an NMI can happen because the write starts in the second half of the last cycle of the instruction, after interrupts have already been polled for that instruction. If it's the last thing you do in NMI, that means RTI will execute and any NMI should occur immediately upon returning.

You'd also want to read $2002 before this, because otherwise you can start the next NMI handler toward the end of vblank without enough time to interact with the PPU. And because writing to $2000 mid-frame can cause glitchy scanlines, you'd also want to enable NMIs using a PPU early write glitch mitigation.
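
A sketch of the end of such a handler, in ca65 syntax, under a few assumptions: `PPUCtrlShadow` is a hypothetical RAM copy of the last value written to $2000, the interrupted code is an idle loop that doesn't care about A, and the early write glitch mitigation is not shown. Reading $2002 first clears the vblank flag, so setting the NMI enable bit cannot trigger an immediate NMI late in the current vblank.

```asm
; Sketch only: PPUCtrlShadow is a hypothetical name, and this clobbers A.
PPUCTRL   = $2000
PPUSTATUS = $2002

.segment "ZEROPAGE"
PPUCtrlShadow: .res 1     ; RAM copy of the last value written to $2000

.segment "CODE"
NMIEnd:
  bit PPUSTATUS           ; reading $2002 clears the vblank flag (A not modified)
  lda PPUCtrlShadow
  ora #%10000000          ; bit 7 of $2000 = generate NMI at the start of vblank
  sta PPUCtrlShadow
  sta PPUCTRL             ; the enabling write is the last thing before RTI
  rti
```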
tokumaru
Posts: 12673
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Code not working how I'd expect, possibly going mad

Post by tokumaru »

Fiskbit wrote: Sat Mar 15, 2025 2:48 pm With an all-in-NMI approach, games will usually turn NMIs off at the start and back on at the end.
Ever since we discovered that mid-screen $2000 writes at certain times can glitch the horizontal scroll for 1 scanline a few years ago, this probably shouldn't be the recommended solution anymore.

Even if you want to completely skip any NMI handling, it may be better to keep NMIs enabled and use a flag to indicate whether to run the NMI code or not, using logic that looks similar to this:

NMI:
  
  ;exits immediately if another instance of the NMI handler is running
  bit NMIDone
  bmi :+
  rti
  :
  
  ;indicates that the NMI handler is running
  inc NMIDone ;($FF -> $00)
  
  ;(...)
  
  ;indicates that the NMI handler finished its work
  dec NMIDone ;($00 -> $FF)
  
  ;NOTE: The next NMI could actually fire here, and the flag wouldn't indicate
  ;that we're still inside the NMI handler since we already changed it, but
  ;that's okay as long as it doesn't happen several frames in a row, which is
  ;probably less likely than lightning striking your NES console while you're
  ;playing!
  
  ;returns from the NMI handler
  rti
There's no need to push/pop registers in this case, because the flag can be tested without affecting A, X or Y.

It's true that you can use one of the mitigation techniques mentioned in the wiki, but isn't it more straightforward to use a solution that relies on easy to understand logic than exploiting obscure hardware quirks to cancel out glitches caused by other obscure hardware glitches? I certainly think so.
Fiskbit
Site Admin
Posts: 1400
Joined: Sat Nov 18, 2017 9:15 pm

Re: Code not working how I'd expect, possibly going mad

Post by Fiskbit »

I agree that turning off NMIs is not a good solution and should not be recommended, but that's because the all-in-NMI approach in general makes it hard to gracefully handle lag frames. Lag frames should not cause sound to lag because of the impact sound has on the user's perception of lag.

However, I don't think the $2000 write bug is a compelling argument against this both because the glitch is so minimal and it's so trivial now to work around it. I think the fact that you need to read $2002 before turning NMIs back on is a much bigger problem because it can result in persistent visual glitches from running out of vblank time (and even potentially a game crash if it breaks a sprite 0 hit that you're relying on; the glitchy scanline is much less likely to do this).

If you leave NMIs on and bail when the previous one hasn't finished, you're starting to have something that looks a lot more like a traditional solution of separate gameplay and NMI threads. Your thread probably starts with handling the PPU and sound; you can keep just this in the NMI, skipping the PPU stuff on lag frames using that flag check you mentioned, and now you have this two-thread structure that is resilient against lag.

However you implement it, all-in-NMI has many downsides in exchange for its apparent simplicity. Most of those are easy to mitigate if you know about them or leave NMIs on, but some are not. I prefer to steer people toward the common structure of using NMI for PPU and sound tasks because, despite having its own pitfalls, it works for most use cases and has good results (e.g. my template here that I recommend to people a lot).
Bregalad
Posts: 8184
Joined: Fri Nov 12, 2004 2:49 pm
Location: Divonne-les-bains, France

Re: Code not working how I'd expect, possibly going mad

Post by Bregalad »

We already had all the discussion about the NMI handling on that thread : viewtopic.php?t=25757

If the OP asks for advice, but doesn't take insight from us, I'm afraid it's not very useful to carry the discussion any further...
tokumaru wrote: Bregalad mentioned the possibility of NMIs firing right between NMIs being enabled (either via $2000 or flags) at the end of the NMI handler and the actual RTI instruction being executed, which indeed can happen, but I find it VERY unlikely that this will happen enough times in a row to cause a stack overflow. The probability of your game loop taking nearly exactly the same amount of time for several consecutive frames, AND that amount of time being nearly exactly the number of cycles in an NES frame is EXTREMELY low. It's probably more likely for your gameplay session to be ruined by lightning striking your NES console in the middle of your living room than by this! Can an NMI fire precisely during that small window of time every once in a while? Yes. Can it happen 80+ times in a row (enough to overflow the stack with return addresses and status flags)? Hell, no!
Lol. I just thought that if the frame was doing the same calculations again and again, the situation where it's exactly the "wrong" length might just as well repeat. In any case, if NMI re-entries are handled properly with flags, this is a non-issue.
Fiskbit wrote: I agree that turning off NMIs is not a good solution and should not be recommended, but that's because the all-in-NMI approach in general makes it hard to gracefully handle lag frames.
I disagree, it's quite easy to use flags to select what work is done in the NMI to gracefully handle lag frames. My understanding is that it's the "everything in main" paradigm that makes it tough/impossible, as by design when your calculations aren't finished you're completely ignoring an NMI, and as such you're unable to play music or handle raster effects if you don't end the frame in time.
Useless, lumbering half-wits don't scare us.
Ben Boldt
Posts: 1512
Joined: Tue Mar 22, 2016 8:27 pm
Location: Minnesota, USA

Re: Code not working how I'd expect, possibly going mad

Post by Ben Boldt »

I think there are a lot of valid ways to handle it and no perfect solution that works best everywhere. You might build a whole RTOS that can queue tasks and fancy stuff like that, but the overhead running that might (likely) set you further back than a different solution. I am guessing you could have a very rudimentary task queue and things would go well with that. A queued task might include a function pointer, a PRG page, and a priority for example. Using priorities would require searching or sorting the queue at some point, so there is some overhead that may or may not be worth it. I have seen a clever idea, using 2 queues. You only do the second queue when the first queue is empty, so the first queue is higher priority that way without any sorting. When both queues are empty, you know your CPU has become idle. You can toggle an I/O pin (i.e. OUT2) reflecting the idle state in order to measure your CPU usage % using an oscilloscope. There are lots of ways to make things efficient, and maybe that's part of the fun of it if you're a firmware-oriented person.

Also another variable, your mapper might have a hardware timer which could be used with IRQ. You can have all of your tasks associated with software timers running asynchronously to the frame. Not sure I generally like that idea, but it's there. Some people like to build everything around timers; it makes sense in some cases.

Lots of things to think about and it's healthy to keep talking about it even if we tend to repeat ourselves. I have never been known to listen very carefully so I feel like I can benefit from some rehashing every now and then.
Pokun
Posts: 3482
Joined: Tue May 28, 2013 5:49 am
Location: Hokkaido, Japan

Re: Code not working how I'd expect, possibly going mad

Post by Pokun »

The discussion as it is now isn't really to answer OP's original question, but unless OP complains and thinks the original topic needs more attention, it doesn't hurt to continue the current discussion.
Ben Boldt
Posts: 1512
Joined: Tue Mar 22, 2016 8:27 pm
Location: Minnesota, USA

Re: Code not working how I'd expect, possibly going mad

Post by Ben Boldt »

Pokun wrote: Sun Mar 16, 2025 5:18 pm The discussion as it is now isn't really to answer OP's original question, but unless OP complains and thinks the original topic needs more attention, it doesn't hurt to continue the current discussion.
Yes, that's a good point, we went off topic. This thread probably isn't the right place for it - sorry about that.
Pokun
Posts: 3482
Joined: Tue May 28, 2013 5:49 am
Location: Hokkaido, Japan

Re: Code not working how I'd expect, possibly going mad

Post by Pokun »

I mean it's natural for a discussion to derail into another which can't be helped, and that may even be the only way to talk about certain things which might never start as its own topic. But it's up to OP if he wants to keep the thread more to his original topic.
tokumaru
Posts: 12673
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Code not working how I'd expect, possibly going mad

Post by tokumaru »

Bregalad wrote: Sun Mar 16, 2025 3:07 pm My understanding is that it's the "everything in main" paradigm that makes it tough/impossible, as by design when your calculations aren't finished you're completely ignoring an NMI, and as such you're unable to play music or handle raster effects if you don't end the frame in time.
I agree with that. "Everything in NMI" can be as flexible as the split approach as long as you leave NMIs always enabled. Both of these solutions allow you to pick which tasks to perform and which to skip when an NMI fires, so they should be equally flexible.

"Everything in main" is objectively the worst solution unless your game is guaranteed to never lag. Unfortunately, all but the simplest kinds of games are bound to lag.

Personally, I still prefer to have 2 separate threads, main and NMI. IMO that helps with keeping the game logic neatly separated from hardware details, and the main thread can more freely navigate between different game loops using simple JMPs, instead of relying on jump tables or "game state" variables to select which game loop to run each frame.
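
To illustrate the last point, here is a rough ca65 sketch of a main thread that moves between game loops with plain JMPs. Everything named here is hypothetical: `NMICount` is assumed to be incremented by the NMI handler, and `Buttons` is assumed to be filled in elsewhere with Start in bit 4.

```asm
; Sketch only: NMICount, Buttons and the loop names are hypothetical.
.segment "ZEROPAGE"
NMICount: .res 1          ; incremented once per NMI
Buttons:  .res 1          ; controller state, assumed to have Start in bit 4

.segment "CODE"
NMI:
  ; (PPU and sound work would go here, with A/X/Y saved and restored around it)
  inc NMICount            ; INC on memory leaves A, X and Y untouched
  rti

WaitForNMI:
  lda NMICount
: cmp NMICount            ; spin until the NMI handler changes the counter
  beq :-
  rts

TitleLoop:
  jsr WaitForNMI
  ; (title screen logic)
  lda Buttons
  and #%00010000
  beq TitleLoop           ; Start not pressed: stay on the title screen
  jmp PlayLoop            ; switching game loops is just a jump

PlayLoop:
  jsr WaitForNMI
  ; (gameplay logic; a "jmp TitleLoop" here would go back to the title screen)
  jmp PlayLoop
```

Each loop owns its own frame wait, so no "game state" variable or jump table is needed to pick which loop runs on a given frame.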