HBlank, IRQ and Sprite 0 hit


Wave
Posts: 110
Joined: Mon Nov 16, 2009 5:59 am

Post by Wave »

Dwedit wrote:You could also read PPUSTAT instead of a second PPUSCROLL write.
Oh, I'll do that, thanks :)
tepples
Posts: 22345
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

tokumaru wrote:
Dwedit wrote:You could also read PPUSTAT instead of a second PPUSCROLL write.
True. If you have a free register (the one you just used to write the X scroll being a good candidate) that would be the best choice.
And with a BIT of thought, even that's not strictly necessary ;-)
tokumaru
Posts: 12106
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru »

True again! I guess I'm a BIT slow today... I haven't eaten all day, so I'll go eat something now and see if I get better.
Wave
Posts: 110
Joined: Mon Nov 16, 2009 5:59 am

Post by Wave »

Oooh, that seems a BIT better :)

How do commercial games do the 2006-2005-2005-2006 trick?
I can't find a way to do it in 26 cycles, in MMC3 how many cycles do you have in an IRQ?
Dwedit
Posts: 4470
Joined: Fri Nov 19, 2004 7:35 pm

Post by Dwedit »

Quoting one of my older posts...

To do a scroll to an arbitrary position during rendering...
* Write anything with the correct nametable bits to 2006. All other bits are overwritten later. (The nametable bits are ....xx.., i.e. bits 3-2 of the written value.)
* Write the Y coordinate to 2005. The three lowest bits of coarse Y are overwritten later.
* Write anything with the correct fine X to 2005. All bits of coarse X are overwritten later.
* Write ((Y AND #$38) << 2) OR (X >> 3) to 2006, i.e. the three lowest bits of coarse Y combined with coarse X.

So there are four 4-cycle writes and four 3-cycle loads from zero page, unless you can put the code in RAM, where you could use immediate loads instead.
Only the final 2 writes affect scrolling, so make sure those happen in rapid succession: load the two values into different registers first (don't use the same register for both ST_ instructions) so the stores can run back to back.

Unless you need to do it every scanline, I would just burn cycles until the end of the scanline before doing the rest of the IRQ response code, such as getting ready for the next IRQ.
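
As a rough illustration of the four values listed above, here is a small Python sketch. The function name `split_scroll_writes` and its parameters are made up for this example: `x` and `y` are pixel coordinates inside a nametable and `nt` is the 2-bit nametable index. It only computes the bytes; it says nothing about timing.

```python
def split_scroll_writes(x, y, nt):
    """Return (register, value) pairs for the 2006-2005-2005-2006 sequence.

    x: 0..255, y: 0..239, nt: 0..3 (hypothetical parameter names).
    """
    if not (0 <= x <= 255 and 0 <= y <= 239 and 0 <= nt <= 3):
        raise ValueError("scroll position out of range")
    return [
        (0x2006, nt << 2),                       # nametable bits ....NN..
        (0x2005, y),                             # fine Y and coarse Y
        (0x2005, x),                             # fine X (coarse X is overwritten next)
        (0x2006, ((y & 0x38) << 2) | (x >> 3)),  # low 3 bits of coarse Y + coarse X
    ]


writes = split_scroll_writes(x=123, y=77, nt=2)
for reg, value in writes:
    print(f"${reg:04X} <- ${value:02X}")
```

For x=123, y=77, nt=2 this gives the values $08, $4D, $7B and $2F. In a real handler these would be precalculated outside the IRQ so that only the loads and stores remain.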

Code:

"N" = nametable, "X" = coarse X, "Y" = coarse Y, "y" = fine y, "d" = written data
           Fine Y
              Nametable Y,X
                Coarse Y
                     Coarse X
         
          .yyyNNYYYYYXXXXX

2000 write:
        t:....NN..........=d:......xx
2005 first write:
        t:...........XXXXX=d:xxxxx...
        x=d:.....xxx  (fine X)
2005 second write:
        t:......YYYYY.....=d:xxxxx...
        t:.yyy............=d:.....xxx
2006 first write:
        t:..yyNNYY........=d:..xxxxxx
        t:.y..............=0
2006 second write:
        t:........YYYXXXXX=d:xxxxxxxx
        v=t
scanline start (if background and sprites are enabled):
        v:.....N.....XXXXX=t:.....N.....XXXXX
frame start (line 0, ppu clock 304) (if background and sprites are enabled):
        v=t 
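
To check the write order against the table above, here is a toy Python model of the t/v/x state. The class and method names are invented for this sketch, and it assumes $2005 and $2006 share a single first/second write toggle (the table doesn't say so explicitly, but the 2006-2005-2005-2006 order relies on it). The per-scanline and per-frame copies from t to v are left out.

```python
class ScrollRegs:
    """Toy model of t, v, fine X and the write toggle (names are hypothetical)."""

    def __init__(self):
        self.t = 0  # 15-bit temporary address: yyy NN YYYYY XXXXX
        self.v = 0  # 15-bit current address
        self.x = 0  # 3-bit fine X
        self.w = 0  # write toggle, assumed shared by $2005 and $2006

    def write_2000(self, d):
        self.t = (self.t & ~0x0C00) | ((d & 0x03) << 10)

    def write_2005(self, d):
        if self.w == 0:  # first write: coarse X and fine X
            self.t = (self.t & ~0x001F) | (d >> 3)
            self.x = d & 0x07
        else:            # second write: coarse Y and fine Y
            self.t = (self.t & ~0x73E0) | ((d & 0xF8) << 2) | ((d & 0x07) << 12)
        self.w ^= 1

    def write_2006(self, d):
        if self.w == 0:  # first write: upper bits of t, bit 14 cleared
            self.t = (self.t & 0x00FF) | ((d & 0x3F) << 8)
        else:            # second write: low byte of t, then v = t
            self.t = (self.t & 0x7F00) | d
            self.v = self.t
        self.w ^= 1


regs = ScrollRegs()
x, y, nt = 123, 77, 2                           # example scroll position
regs.write_2006(nt << 2)                        # nametable bits
regs.write_2005(y)                              # toggle is set, so this is the Y write
regs.write_2005(x)                              # toggle is clear again: X write
regs.write_2006(((y & 0x38) << 2) | (x >> 3))   # low byte, copies t to v
print(f"v = ${regs.v:04X}, fine X = {regs.x}")
```

After the last write, v holds fine Y, the nametable, coarse Y and coarse X for the requested position, which is why the sequence can set an arbitrary scroll mid-frame.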
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!
Wave
Posts: 110
Joined: Mon Nov 16, 2009 5:59 am

Post by Wave »

Yeah, I know what I have to do to make it work; the problem is making it work in time.
4*4 + 4*3 = 28
And you have to save the registers first. Has anyone implemented it to work on an MMC3 IRQ? Or does it produce glitches?
tokumaru
Posts: 12106
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru »

Wave wrote:How do commercial games do the 2006-2005-2005-2006 trick?
I can't think of a game that needed fine Y scrolling when changing the scroll mid-frame, as most did it just for status bars.
Wave wrote:I can't find a way to do it in 26 cycles
The way I do it takes more time than that, and I honestly believe it doesn't get much faster than that unless you use a huge table. You can of course pre-calculate the values and buffer them, so that when the IRQ fires they are all ready to be used.
Wave wrote:in MMC3 how many cycles do you have in an IRQ?
According to Disch, MMC3 IRQs will fire either at cycle 260 (normal settings) or cycle 324 (alternate settings). Both are inside HBlank already, and considering the overhead of entering the IRQ routine itself there isn't much time left to do anything. You might just want to set up the IRQ for the previous scanline, and kill the time until the next HBlank by calculating the scroll values.
Wave wrote:Has anyone implemented it to work on an MMC3 IRQ? Or does it produce glitches?
I'm sure it will work fine if you have the IRQ fire 1 scanline earlier. When modifying the scroll, only the last 2 writes (to $2005 and $2006) need to be inside HBlank, that's 8 cycles out of 28, so there's plenty of room to position those writes so that there are no glitches at all.
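
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes NTSC timing (3 PPU cycles per CPU cycle, 341 PPU cycles per scanline) and ignores the overhead of entering the IRQ handler; the constant names are only for illustration.

```python
PPU_PER_CPU = 3      # NTSC assumption
DOTS_PER_LINE = 341  # PPU cycles per scanline

LOAD_ZP, STORE_ABS = 3, 4          # CPU cycles for LDA zp / STA abs
total = 4 * (LOAD_ZP + STORE_ABS)  # the whole 2006-2005-2005-2006 sequence
critical = 2 * STORE_ABS           # only the last two stores are time-critical

line_cycles = DOTS_PER_LINE / PPU_PER_CPU  # CPU cycles in one scanline
slack = line_cycles - total                # left over if the IRQ fires a line early

print(f"sequence: {total} cycles, critical part: {critical} cycles "
      f"({critical * PPU_PER_CPU} PPU cycles)")
print(f"one scanline is about {line_cycles:.1f} CPU cycles, "
      f"leaving about {slack:.1f} for other work")
```

So with the IRQ one line early, most of a scanline is free for preparation, and only the final 24 PPU cycles' worth of stores have to be placed carefully.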
Wave
Posts: 110
Joined: Mon Nov 16, 2009 5:59 am

Post by Wave »

Cool, I'll do it that way then: the IRQ's effect is delayed by 1 scanline. It wastes a few cycles, but you can prepare everything.
Bregalad
Posts: 8036
Joined: Fri Nov 12, 2004 2:49 pm
Location: Caen, France

Post by Bregalad »

Hi, the list of all effects that can "technically" be done is here: html version and txt version.
I know this is far too complex; I expand way too much on some aspects and make some things sound overcomplicated when the original goal was the exact opposite, but oh well.

It doesn't say much about what those effects can be useful for, though. I'm pretty sure I mention a few effects that no existing game or demo ever did.

I already thought about making a "general purpose" library for modifying PPU registers in real time, but in the end there are too many different applications for this to be useful; it's better to write a specific routine every time you need one. Especially because different registers should be written at different times to avoid glitches.

I don't know if this answers the original post (I didn't read the others) :oops:
Useless, lumbering half-wits don't scare us.
Dwedit
Posts: 4470
Joined: Fri Nov 19, 2004 7:35 pm
Contact:

Post by Dwedit »

If you want to scroll faster using MMC3 interrupts, you can do the first two writes (2006,2005) early, as the first 2006 write only affects X nametable selection, and has no other effect on scrolling.
Then you can do the 2005,2006 writes later, which affect visible X and Y scrolling.

So your interrupt fires at ppu pixel #260, and it takes 7 CPU cycles to respond to an interrupt. Maybe the CPU was executing a long instruction which takes up to 7 CPU cycles. Maybe there's the 5 CPU cycle penalty from a DMC sample fetch. So the worst case is 57 PPU pixels after pixel #260, bringing you to pixel #317, but DMC is unlikely, so it's more likely a worst case of #302.
Then you need a PHA, so that's 3 CPU cycles gone there. But then you can pull off the writes quickly with LDA #xx, STA $200x.
You could also use 3 writes so you can reset $2000.
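
The latency estimate above can be reproduced with a few lines of Python. This is just the arithmetic from this post, assuming 3 PPU pixels per CPU cycle (NTSC); the function and parameter names are made up for the sketch.

```python
PPU_PER_CPU = 3  # NTSC assumption: 3 PPU pixels per CPU cycle


def worst_case_entry_dot(irq_dot=260, irq_entry=7, longest_instr=7, dmc_stall=0):
    """PPU pixel at which the first handler instruction can start, worst case."""
    return irq_dot + PPU_PER_CPU * (irq_entry + longest_instr + dmc_stall)


with_dmc = worst_case_entry_dot(dmc_stall=5)  # DMC sample fetch gets in the way
without_dmc = worst_case_entry_dot()          # the more likely worst case
print(with_dmc, without_dmc)
```

This prints 317 and 302, matching the figures above.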
tokumaru
Posts: 12106
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru »

Dwedit wrote:So your interrupt fires at ppu pixel #260, and it takes 7 CPU cycles to respond to an interrupt. Maybe the CPU was executing a long instruction which takes up to 7 CPU cycles. Maybe there's the 5 CPU cycle penalty from a DMC sample fetch. So the worst case is 57 PPU pixels after pixel #260, bringing you to pixel #317, but DMC is unlikely, so it's more likely a worst case of #302.
Then you need a PHA, so that's 3 CPU cycles gone there. But then you can pull off the writes quickly with LDA #xx, STA $200x.
There's not enough time. 302 + 9 (PHA) + 9 (LDA) + 12 (STA) + 9 (LDA) + 12 (STA) = 353, which is more than the 341 PPU cycles of a scanline. And that's assuming the IRQ vector points directly to the code that makes the PPU writes, which is not the case.

Also, aren't the last cycles of HBlank used to fetch the first patterns of the next scanline? So, ideally you'd want to have the scroll already configured by then, or else the next scanline might look glitched. I'm pretty sure that for the MMC3 the best thing is to have the IRQ fire earlier, so that you have a whole scanline to prepare whatever you need.

EDIT: In case anyone thinks that having the IRQ fire earlier is a waste of time, I just wanted to say that it isn't, because you can just do calculations that you would do elsewhere during this time. In addition to calculating the scroll values, you could set up the next raster effect (if any) by configuring the next IRQ and updating the IRQ address for example. You know, basic maintenance related to the raster effects.
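
For anyone who wants to play with the numbers, here is the same timeline as a small Python sketch. It assumes 3 PPU cycles per CPU cycle and reads the 9-cycle LDA in the sum above as a 3-cycle zero page load; the function name is invented for the example.

```python
PPU_PER_CPU = 3
DOTS_PER_LINE = 341


def finish_dot(start_dot, instructions):
    """Walk a list of (mnemonic, cpu_cycles) and return the PPU cycle at the end."""
    dot = start_dot
    for name, cycles in instructions:
        dot += cycles * PPU_PER_CPU
        print(f"{name:<8} ends at PPU cycle {dot}")
    return dot


handler = [("PHA", 3), ("LDA zp", 3), ("STA abs", 4), ("LDA zp", 3), ("STA abs", 4)]
end = finish_dot(302, handler)  # 302 = worst-case entry without DMC
print("overshoot:", end - DOTS_PER_LINE)
```

The last store ends at PPU cycle 353, 12 past the end of the scanline. Changing the start value or the instruction list makes it easy to try other handler layouts.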
Dwedit
Posts: 4470
Joined: Fri Nov 19, 2004 7:35 pm

Post by Dwedit »

What if the loads are 2-cycle immediate rather than 3-cycle from zero page? (I.e., the interrupt routine is in RAM and consists of just 14 bytes.)
pha
lda #xx
sta xxxx
lda #xx
sta xxxx
jmp xxxx
Then worst case is 347, just 6 pixels too far.
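
Here is a hedged Python sketch of what such a RAM stub could look like as raw bytes, mostly to double-check the 14-byte size and the 347 figure. The store targets ($2005 and $2006), the example operand values and the jump address are assumptions made for illustration; the listing above leaves them as xxxx. The opcodes and cycle counts are the standard 6502 ones.

```python
# opcode: (mnemonic, size in bytes, CPU cycles), only the instructions used here
OPCODES = {
    0x48: ("PHA", 1, 3),
    0xA9: ("LDA #imm", 2, 2),
    0x8D: ("STA abs", 3, 4),
    0x4C: ("JMP abs", 3, 3),
}


def build_ram_handler(val_2005, val_2006, continue_addr):
    """Assemble the 6-instruction stub as bytes (addresses are little-endian)."""
    lo, hi = continue_addr & 0xFF, continue_addr >> 8
    return bytes([
        0x48,              # pha
        0xA9, val_2005,    # lda #xx   (patched before the IRQ fires)
        0x8D, 0x05, 0x20,  # sta $2005 (assumed target)
        0xA9, val_2006,    # lda #xx   (patched before the IRQ fires)
        0x8D, 0x06, 0x20,  # sta $2006 (assumed target)
        0x4C, lo, hi,      # jmp to the rest of the handler
    ])


def cycles_until_last_store(code):
    """Sum CPU cycles up to and including the second STA."""
    pc, cycles, stores = 0, 0, 0
    while stores < 2:
        _, size, cost = OPCODES[code[pc]]
        cycles += cost
        if code[pc] == 0x8D:
            stores += 1
        pc += size
    return cycles


stub = build_ram_handler(0x7B, 0x2F, 0x8123)  # example values only
cpu = cycles_until_last_store(stub)
print(len(stub), "bytes,", cpu, "CPU cycles, ends at PPU cycle", 302 + 3 * cpu)
```

The two immediate operands are the bytes the main code would overwrite each frame, which is the reason for putting the routine in RAM in the first place.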
tokumaru
Posts: 12106
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru »

Dwedit wrote:Then worst case is 347, just 6 pixels too far.
If you are masking the leftmost 8 pixels (something not everyone does, I for example never do it) then 6 pixels is not a problem. I'm no guru on the internal workings of the PPU, but like I said in my last post, I believe that the first couple background patterns are fetched sometime between cycles 320 and 341 of the previous scanline, so finishing the scroll change after that could result in up to 16 pixels of the scanline being glitched, which would be pretty bad.

EDIT: If you look at this document, under the "Memory fetch phase 161 thru 168" text, it says that the PPU reads data for drawing 2 tiles on the next scanline. Now, "fetch phase 161" is just after PPU cycle 320, so if the scroll isn't updated by then, the beginning of the next scanline will probably show the wrong tiles.
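
A quick Python sketch of how the fetch phases map to PPU cycles, under the assumption that phases are numbered from 1 and each takes 2 PPU cycles (which is what puts phase 161 at cycle 320). The function names are made up, and 353 and 347 are the finishing points worked out earlier in the thread.

```python
def phase_start(phase):
    """First PPU cycle of a 1-based memory fetch phase, assuming 2 cycles each."""
    return 2 * (phase - 1)


prefetch_begin = phase_start(161)  # first fetch for the next scanline
prefetch_end = phase_start(169)    # one past the last of phases 161..168


def late_by(finish_dot):
    """PPU cycles by which a scroll change misses the start of the prefetch."""
    return max(0, finish_dot - prefetch_begin)


print("prefetch window:", prefetch_begin, "to", prefetch_end - 1)
for label, dot in [("zero page loads", 353), ("immediate loads", 347)]:
    print(f"{label}: finishes at {dot}, {late_by(dot)} PPU cycles after prefetch starts")
```

Under these assumptions both variants finish well after the prefetch has begun, which is the argument for firing the IRQ a scanline early.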