Dwedit wrote: You could also read PPUSTAT instead of a second PPUSCROLL write.
Oh, I'll do that, thanks.
HBlank, IRQ and Sprite 0 hit
Quoting one of my older posts...
To do a scroll to an arbitrary position during rendering...
* Write anything with the correct name table bits to 2006. All other bits are overwritten later. (name table bits are ....xx..)
* Write the Y coordinate to 2005. Three lowest bits of Coarse Y are overwritten later.
* Write anything with the correct fine X to 2005. All bits of Coarse X are overwritten later.
* Write (X >> 3) | ((Y & $38) << 2) to 2006. This sets Coarse X and the three lowest bits of Coarse Y.
So that's four 4-cycle writes and four 3-cycle loads from zero page (28 cycles in total), unless you can put the code in RAM, where you could use 2-cycle immediate loads instead.
Only the final two writes affect scrolling, so make sure they happen in rapid succession: load the two values into different registers first, so that no load sits between the two ST_ instructions.
Unless you need to do it every scanline, I would just burn cycles until the end of the scanline before doing the IRQ response code, which would include things like getting ready for the next IRQ.
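To make the four steps above concrete, here is a small sketch that computes the byte for each write. It is Python rather than 6502 code, purely to show the bit arithmetic; the function name and the argument ranges (X 0-255, Y 0-239, name table 0-3) are my own assumptions, not something from the original post.

```python
def mid_frame_scroll_writes(x, y, nametable):
    """Return (register, value) pairs for the 2006, 2005, 2005, 2006 sequence.

    Hypothetical helper: x is the pixel X scroll (0-255), y the pixel Y
    scroll (0-239) and nametable the name table index (0-3).
    """
    first_2006 = (nametable & 0x03) << 2           # name table bits ....xx..
    first_2005 = y & 0xFF                          # Y coordinate
    second_2005 = x & 0xFF                         # only fine X (low 3 bits) survives
    second_2006 = (x >> 3) | ((y & 0x38) << 2)     # coarse X | low 3 bits of coarse Y
    return [
        (0x2006, first_2006),
        (0x2005, first_2005),
        (0x2005, second_2005),
        (0x2006, second_2006),
    ]


if __name__ == "__main__":
    for register, value in mid_frame_scroll_writes(100, 77, 2):
        print(f"${register:04X} <- ${value:02X}")
```

The values can be calculated ahead of time (for example during VBlank) and buffered, so the IRQ handler only has to load and store them.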
"N" = nametable, "X" = coarse X, "Y" = coarse Y, "y" = fine y, "d" = written data
Bit layout of t and v, high to low: fine Y (yyy), nametable Y,X (NN), coarse Y (YYYYY), coarse X (XXXXX):
.yyyNNYYYYYXXXXX
2000 write:
t:....NN..........=d:......xx
2005 first write:
t:...........XXXXX=d:xxxxx...
x=d:.....xxx (fine X)
2005 second write:
t:......YYYYY.....=d:xxxxx...
t:.yyy............=d:.....xxx
2006 first write:
t:..yyNNYY........=d:..xxxxxx
t:.y..............=0
2006 second write:
t:........YYYXXXXX=d:xxxxxxxx
v=t
scanline start (if background and sprites are enabled):
v:.....N.....XXXXX=t:.....N.....XXXXX
frame start (line 0, ppu clock 304) (if background and sprites are enabled):
v=t
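The register effects listed above can be tried out with a toy model. This is only a sketch in Python, not emulator-accurate code: the class and function names are made up for illustration, and it assumes that 2005 and 2006 share a single first/second write toggle that starts out cleared (for example after a read of 2002).

```python
class ScrollRegs:
    """Toy model of t, v, fine X and the write toggle from the diagram."""

    def __init__(self):
        self.t = 0       # 15-bit temporary address
        self.v = 0       # 15-bit current address
        self.fine_x = 0  # 3-bit fine X
        self.w = 0       # 0 = next write is "first", 1 = "second"

    def write_2005(self, d):
        if self.w == 0:
            self.t = (self.t & ~0x001F) | (d >> 3)            # coarse X
            self.fine_x = d & 0x07                            # fine X
        else:
            self.t = (self.t & ~0x03E0) | ((d >> 3) << 5)     # coarse Y
            self.t = (self.t & ~0x7000) | ((d & 0x07) << 12)  # fine Y
        self.w ^= 1

    def write_2006(self, d):
        if self.w == 0:
            # High byte: bits 8-13 come from d, bit 14 is cleared.
            self.t = (self.t & 0x00FF) | ((d & 0x3F) << 8)
        else:
            # Low byte, then t is copied to v.
            self.t = (self.t & 0x7F00) | (d & 0xFF)
            self.v = self.t
        self.w ^= 1


def decode(v):
    """Split a 15-bit address into (fine Y, nametable, coarse Y, coarse X)."""
    return ((v >> 12) & 0x07, (v >> 10) & 0x03, (v >> 5) & 0x1F, v & 0x1F)


regs = ScrollRegs()
# 2006-2005-2005-2006 sequence for X = 100, Y = 77, name table 2.
for register, value in [(0x2006, 0x08), (0x2005, 77), (0x2005, 100), (0x2006, 0x2C)]:
    if register == 0x2005:
        regs.write_2005(value)
    else:
        regs.write_2006(value)
print(decode(regs.v), regs.fine_x)
```

For this sequence the model ends up with fine Y 5, name table 2, coarse Y 9, coarse X 12 and fine X 4, which is X = 100, Y = 77 again.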
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!
Wave wrote: How do commercial games do the 2006-2005-2005-2006 trick?
I can't think of a game that needed fine Y scrolling when changing the scroll mid-frame, as most did it just for status bars.
Quote: I can't find a way to do it in 26 cycles
The way I do it takes more time than that, and I honestly believe it doesn't get much faster than that unless you use a huge table. You can of course pre-calculate the values and buffer them, so that when the IRQ fires they are all ready to be used.
Quote: in MMC3 how many cycles do you have in an IRQ?
According to Disch, MMC3 IRQs will fire either at cycle 260 (normal settings) or cycle 324 (alternate settings). Both are inside HBlank already, and considering the overhead of entering the IRQ routine itself there isn't much time left to do anything. You might just want to set up the IRQ for the previous scanline, and kill the time until the next HBlank by calculating the scroll values.
Wave wrote: Has anyone implemented it to work on an MMC3 IRQ? Or does it produce glitches?
I'm sure it will work fine if you have the IRQ fire 1 scanline earlier. When modifying the scroll, only the last 2 writes (to $2005 and $2006) need to be inside HBlank, that's 8 cycles out of 28, so there's plenty of room to position those writes so that there are no glitches at all.
Hi, the list of all effects that can "technically" be done is here: html version and txt version.
I know this is far too complex; I expand way too much on some aspects and make some things sound overcomplicated when the original goal was the exact opposite, but oh well.
It doesn't say much about what those effects can be useful for, though. I'm pretty sure I mention a few effects that no existing game or demo ever did.
I already thought about making a "general purpose" library for modifying PPU registers in real time, but in the end there are too many different applications for this to be useful; it's better to write a specific routine every time you need it, especially because different registers should be written at different times to avoid glitches.
I don't know if this answers the original post (I didn't read the others).
Useless, lumbering half-wits don't scare us.
If you want to scroll faster using MMC3 interrupts, you can do the first two writes (2006,2005) early, as the first 2006 write only affects X nametable selection, and has no other effect on scrolling.
Then you can do the 2005,2006 writes later, which affect visible X and Y scrolling.
So your interrupt fires at ppu pixel #260, and it takes 7 CPU cycles to respond to an interrupt. Maybe the CPU was executing a long instruction which takes up to 7 CPU cycles. Maybe there's the 5 CPU cycle penalty from a DMC sample fetch. So the worst case is (7 + 7 + 5) x 3 = 57 PPU pixels after pixel #260, bringing you to pixel #317, but DMC is unlikely, so it's more likely a worst case of 260 + (7 + 7) x 3 = #302.
Then you need a PHA, so that's 3 CPU cycles gone there. But then you can pull off the writes quickly with LDA #xx, STA $200x.
You could also use 3 writes so you can reset $2000.
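The latency estimate above is easy to re-run with other numbers. A minimal sketch in Python (just for the arithmetic), assuming NTSC timing of 3 PPU pixels per CPU cycle; the function and parameter names are made up, and the default cycle counts are the ones quoted in this post.

```python
PIXELS_PER_CPU_CYCLE = 3  # assumption: NTSC, 3 PPU pixels per CPU cycle


def worst_case_entry_pixel(irq_pixel=260, irq_entry_cycles=7,
                           longest_instruction_cycles=7, dmc_stall_cycles=0):
    """PPU pixel at which the first instruction of the IRQ handler can start."""
    cpu_cycles = irq_entry_cycles + longest_instruction_cycles + dmc_stall_cycles
    return irq_pixel + cpu_cycles * PIXELS_PER_CPU_CYCLE


if __name__ == "__main__":
    print(worst_case_entry_pixel())                    # no DMC fetch
    print(worst_case_entry_pixel(dmc_stall_cycles=5))  # with a DMC fetch
```

Without a DMC fetch this gives pixel 302, and with one it gives 317, matching the figures above.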
Dwedit wrote: So your interrupt fires at ppu pixel #260, and it takes 7 CPU cycles to respond to an interrupt. Maybe the CPU was executing a long instruction which takes up to 7 CPU cycles. Maybe there's the 5 CPU cycle penalty from a DMC sample fetch. So the worst case is 57 PPU pixels after pixel #260, bringing you to pixel #317, but DMC is unlikely, so it's more likely a worst case of #302.
Then you need a PHA, so that's 3 CPU cycles gone there. But then you can pull off the writes quickly with LDA #xx, STA $200x.
There's not enough time. 302 + 9 (PHA) + 9 (LDA) + 12 (STA) + 9 (LDA) + 12 (STA) = 353, which is more than the 341 PPU cycles of a scanline. And that's considering that the IRQ vector points directly to the code that will make the PPU writes, which is not the case.
Also, aren't the last cycles of HBlank used to fetch the first patterns of the next scanline? So, ideally you'd want to have the scroll already configured by then, or else the next scanline might look glitched. I'm pretty sure that for the MMC3 the best thing is to have the IRQ fire earlier, so that you have a whole scanline to prepare whatever you need.
EDIT: In case anyone thinks that having the IRQ fire earlier is a waste of time, I just wanted to say that it isn't, because you can just do calculations that you would do elsewhere during this time. In addition to calculating the scroll values, you could set up the next raster effect (if any) by configuring the next IRQ and updating the IRQ address for example. You know, basic maintenance related to the raster effects.
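What if the loads are 2-cycle immediate rather than 3-cycle from zero page? (I.e., the interrupt routine sits in RAM and consists of just 14 bytes.)
pha        ; 3 cycles: save A
lda #xx    ; 2 cycles: immediate value patched into RAM beforehand
sta xxxx   ; 4 cycles: first PPU register write
lda #xx    ; 2 cycles
sta xxxx   ; 4 cycles: second PPU register write
jmp xxxx   ; 3 cycles: jump to the remaining handler code
Then worst case is 347, just 6 pixels too far.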
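For anyone who wants to check the numbers, here is the same budget as a short Python sketch (again only for the arithmetic). It assumes 3 PPU pixels per CPU cycle, 341 pixels per scanline, and the worst-case handler entry at pixel 302 from the earlier post; all the names are hypothetical.

```python
SCANLINE_PIXELS = 341
PIXELS_PER_CPU_CYCLE = 3  # assumption: NTSC timing


def finish_pixel(start_pixel, cpu_cycles):
    """PPU pixel reached after running instructions with the given cycle counts."""
    return start_pixel + PIXELS_PER_CPU_CYCLE * sum(cpu_cycles)


# Cycle counts for PHA, LDA, STA abs, LDA, STA abs (the JMP comes after the writes).
ZERO_PAGE_LOADS = [3, 3, 4, 3, 4]
IMMEDIATE_LOADS = [3, 2, 4, 2, 4]

# Byte sizes of PHA, LDA #imm, STA abs, LDA #imm, STA abs, JMP abs.
ROUTINE_BYTES = [1, 2, 3, 2, 3, 3]

if __name__ == "__main__":
    for name, cycles in [("zero page", ZERO_PAGE_LOADS), ("immediate", IMMEDIATE_LOADS)]:
        end = finish_pixel(302, cycles)
        print(name, end, "overshoot:", end - SCANLINE_PIXELS)
    print("routine size:", sum(ROUTINE_BYTES), "bytes")
```

With zero page loads the last write ends at pixel 353 (12 past the end of the scanline); with immediate loads it ends at 347 (6 past), and the RAM routine adds up to 14 bytes.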
Dwedit wrote: Then worst case is 347, just 6 pixels too far.
If you are masking the leftmost 8 pixels (something not everyone does, I for example never do it) then 6 pixels is not a problem. I'm no guru on the internal workings of the PPU, but like I said in my last post, I believe that the first couple background patterns are fetched sometime between cycles 320 and 341 of the previous scanline, so finishing the scroll change after that could result in up to 16 pixels of the scanline being glitched, which would be pretty bad.
EDIT: If you look at this document, under the "Memory fetch phase 161 thru 168" text, it says that the PPU reads data for drawing 2 tiles on the next scanline. Now, "fetch phase 161" is just after PPU cycle 320, so if the scroll isn't updated by then, the beginning of the next scanline will probably show the wrong tiles.
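If the IRQ fires a scanline earlier, the question becomes where to place the last two writes. A rough Python sketch of that check, under assumptions that are mine and not from the linked document: the visible part of the scanline ends at pixel 256, the fetches for the next scanline start at pixel 320, the two final stores take 4 CPU cycles each, and a CPU cycle is 3 PPU pixels. The function names are made up.

```python
PIXELS_PER_CPU_CYCLE = 3   # assumption: NTSC timing
VISIBLE_END_PIXEL = 256    # assumption: last visible pixel is 255
PREFETCH_PIXEL = 320       # fetches for the next scanline start here


def final_writes_fit(first_store_pixel, store_cycles=(4, 4)):
    """True if both final stores start after the visible area and end before the prefetch."""
    end_pixel = first_store_pixel + PIXELS_PER_CPU_CYCLE * sum(store_cycles)
    return first_store_pixel >= VISIBLE_END_PIXEL and end_pixel <= PREFETCH_PIXEL


def latest_start_pixel(store_cycles=(4, 4)):
    """Latest pixel at which the first of the final stores may begin."""
    return PREFETCH_PIXEL - PIXELS_PER_CPU_CYCLE * sum(store_cycles)


if __name__ == "__main__":
    print(latest_start_pixel())
    for pixel in (250, 260, 302):
        print(pixel, final_writes_fit(pixel))
```

Under these assumptions the first of the two stores has to begin somewhere between pixel 256 and pixel 296, which is why a handler that only gets going at pixel 302 is already too late for a clean scanline.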