Changing the scroll every 2 scanlines
In another thread I considered drawing a raycaster's viewport using tiles that contain only 1 software pixel in the Y axis, and vertically compressing 2 name tables (60 tiles) into a 120-pixel tall area by changing the scroll every 2 scanlines. Changing the scroll isn't much of a problem, since the fine Y scroll isn't used at all and 2 $2006 writes can take care of it, but doing it without wasting massive amounts of CPU time is looking like a challenge.
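As a rough illustration of the two $2006 writes mentioned above, here is a Python sketch that computes the high and low bytes for each of the 60 tile rows. It assumes the two name tables are stacked vertically at $2000 and $2800 (4-screen layout), that coarse X is 0, and that fine Y is left at 0; the function name is made up for this example.

```python
def scroll_writes_for_row(row):
    """Return (high, low) bytes for two $2006 writes that point rendering
    at tile row `row` (0..59) with fine Y = 0 and coarse X = 0."""
    if not 0 <= row < 60:
        raise ValueError("row must be in 0..59")
    # Assumption: rows 0..29 live in name table 0 ($2000), rows 30..59 in name table 2 ($2800).
    nametable = 0 if row < 30 else 2
    coarse_y = row % 30
    # PPU scroll/address layout: 0yyy NNYY YYYX XXXX (y = fine Y, N = name table).
    addr = (nametable << 10) | (coarse_y << 5)
    return addr >> 8, addr & 0xFF

# One scroll change per pair of scanlines across the 120-line viewport.
schedule = [(2 * row,) + scroll_writes_for_row(row) for row in range(60)]
```

Note that the high byte is the name table number shifted left by 2, not $20 or $28: bits 12 and 13 of the written address end up in the fine Y field.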
Without a mapper, 100% of the time while the viewport is rendered would be spent on this, since I can't think of any constant-timed task that could be done between scroll changes. MMC3 IRQs fire kinda late in the scanline, enough that the scroll can't be reliably changed right away. Waiting for the next scanline would mean a little over 50% of the time spent squeezing the viewport, which is much better but still fairly expensive.
What would the alternative be? A cycle-based IRQ counter? With that I could time the IRQs so there'd be no waiting at all, and all the stolen time would go towards actually changing the scroll, which I imagine will be around 30% of the time, which sounds more reasonable. Problem is these mappers aren't as easy to come by as the MMC3, and I don't know if they'll play nice with 4-screen mirroring across different emulators and flash carts.
Can anyone think of other ways to vertically squeeze the screen without sacrificing too much CPU time?
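To put rough numbers on the percentages in this post, the following Python sketch converts between CPU cycles spent per scroll change and the share of CPU time lost. It only assumes NTSC timing (341 PPU dots per scanline, 3 dots per CPU cycle); the function names are illustrative.

```python
from fractions import Fraction

# NTSC: 341 PPU dots per scanline, 3 dots per CPU cycle (about 113.67 cycles per line).
CPU_CYCLES_PER_SCANLINE = Fraction(341, 3)

def cpu_share(cycles_per_change, scanlines_per_change=2):
    """Fraction of CPU time consumed if each scroll change costs `cycles_per_change`."""
    return Fraction(cycles_per_change) / (CPU_CYCLES_PER_SCANLINE * scanlines_per_change)

def cycles_for_share(share, scanlines_per_change=2):
    """CPU cycles available per scroll change for a given share of CPU time."""
    return float(share * CPU_CYCLES_PER_SCANLINE * scanlines_per_change)

# Waiting out one full scanline in every two costs exactly half the CPU.
half = cpu_share(CPU_CYCLES_PER_SCANLINE)
# A 30% budget leaves roughly 68 cycles per scroll change, IRQ entry and exit included.
budget = cycles_for_share(Fraction(3, 10))
```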
Re: Changing the scroll every 2 scanlines
It'd be fairly easy to trigger an IRQ when the PPU fetches a specific tile or location from the nametable... although the exact phase you'd need for this application might require some experimentation.
- rainwarrior
Re: Changing the scroll every 2 scanlines
I was thinking you could use looping DPCM IRQ that fires at a fixed rate. It can recur at ~4 scanlines, I think? Might be more realistic to do 3 scanlines per scroll instead of 2.
1. Wait a special number of cycles (you'll need to create a table mapping the IRQ timings during the frame to their scanline positions, since each IRQ is going to fire in a different but specific horizontal position).
2. Set a scroll for this scanline.
3. CPU wait for 2 (or 3) scanlines, set scroll a second time.
4. Return from interrupt to resume executing arbitrary code.
You'd have to do a lot of CPU waiting, but at least you could have maybe 30% free time outside the IRQ response.
Actually... does IRQ still work on looping DPCM? If not, accumulated jitter would probably kill this idea.
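The table from step 1 could be generated offline with something like the Python sketch below. It assumes NTSC timing, the fastest DMC rate ($F, 54 CPU cycles per bit) and a 1-byte sample, which gives one IRQ every 432 CPU cycles, or about 3.8 scanlines. It does not model DMA cycle stealing, the skipped dot on odd frames, or the open question above of whether a looping sample raises IRQs at all; it only shows how the horizontal position drifts from one IRQ to the next.

```python
DMC_FASTEST_PERIOD = 54                    # CPU cycles per output bit at rate $F (NTSC)
CYCLES_PER_IRQ = DMC_FASTEST_PERIOD * 8    # 1-byte sample = 8 bits = 432 CPU cycles
DOTS_PER_CPU_CYCLE = 3                     # NTSC
DOTS_PER_SCANLINE = 341

def irq_positions(first_irq_cycle, count):
    """Return [(scanline, dot)] for `count` IRQs.

    `first_irq_cycle` is measured in CPU cycles from dot 0 of scanline 0.
    """
    positions = []
    for k in range(count):
        dot = (first_irq_cycle + k * CYCLES_PER_IRQ) * DOTS_PER_CPU_CYCLE
        positions.append(divmod(dot, DOTS_PER_SCANLINE))
    return positions

table = irq_positions(0, 4)
```

Under these assumptions each IRQ lands 68 dots earlier in its scanline than the previous one, which is why the initial wait has to come from a per-IRQ table.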
Re: Changing the scroll every 2 scanlines
Quote: Without a mapper, 100% of the time while the viewport is rendered would be spent on this, since I can't think of any constant-timed task that could be done between scroll changes.
You could execute a virtual machine and execute a main thread, so that each VM instruction takes a constant amount of cycles, and after every 2 or 3 instructions you do a scroll change (or any other timing-sensitive operation such as a $4011 write).
Not that it'd allow your main thread to execute very fast, but at least it's better than nothing.
Quote: MMC3 IRQs fire kinda late in the scanline, enough that the scroll can't be reliably changed right away.
I'm no specialist, but $2006 writes, as well as the lower bits of the $2005 write, take effect immediately. Of course, because of the jitter it will still be a major problem, as visual glitches will appear.
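The constant-time virtual machine idea from this post can be modelled in a few lines of Python. This is only a timing model, not 6502 code: the cycle costs are hypothetical placeholders, and on real hardware every VM instruction would have to be padded to the same length by hand.

```python
VM_OP_CYCLES = 60   # hypothetical fixed cost of every VM instruction, padding included
HOOK_CYCLES = 20    # hypothetical cost of the timing-sensitive write (e.g. the scroll change)

def run_vm(program, ops_per_hook):
    """Run `program` and record the cycle at which each timed hook would start."""
    state = {"acc": 0}
    cycle = 0
    hook_cycles = []
    for index, op in enumerate(program, start=1):
        op(state)
        cycle += VM_OP_CYCLES            # every instruction costs the same
        if index % ops_per_hook == 0:
            hook_cycles.append(cycle)    # the hook always lands on a fixed grid
            cycle += HOOK_CYCLES
    return state, hook_cycles

def add(n):
    """Build a toy VM instruction that adds `n` to the accumulator."""
    def op(state):
        state["acc"] += n
    return op

state, hooks = run_vm([add(1), add(2), add(3), add(4), add(5), add(6)], ops_per_hook=3)
```

Because no instruction can take longer or shorter than the others, the hooks are evenly spaced no matter what the main thread is doing, which is the whole point of the approach.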
Re: Changing the scroll every 2 scanlines
lidnariq wrote: It'd be fairly easy to trigger an IRQ when the PPU fetches a specific tile or location from the nametable...
Custom hardware is out of the question for me. Too much work before the first line of code can be written, and the whole chicken/egg thing sucks.
rainwarrior wrote: I was thinking you could use looping DPCM IRQ that fires at a fixed rate.
While this is an interesting suggestion, my experience with DPCM IRQs has been nothing but painful. I could never get a steady effect from that thing.
Bregalad wrote: You could execute a virtual machine and execute a main thread, so that each VM instruction takes a constant amount of cycles, and after every 2 or 3 instructions you do a scroll change (or any other timing sensitive operation such as a $4011 write).
That's a very different approach, I like it. It would be a hell of a slow VM, but at least you'd be doing something instead of just waiting! I wouldn't do this in this particular case though, because delayed scroll changes with the MMC3 still sound faster.
Quote: I'm no specialist, but $2006 writes, as well as the lower bits of the $2005 write, take effect immediately. Of course, because of the jitter it will still be a major problem, as visual glitches will appear.
I think you're right, and yes, the jittering is the problem.
One thing I just realized is that I could probably do the first $2006 write at the end of the previous IRQ, which would allow me to finish setting the scroll sooner (save A, load second $2006 byte, write it). That's 10 CPU cycles, plus 7 to enter the IRQ, plus any leftover cycles from the instruction that's running when the IRQ fires, so we're looking at up to 72 PPU cycles (NTSC). Add the 260 at which MMC3 IRQs fire and that's 332, well into the first fetches for the next scanline.
I don't care if the fetched data is wrong, because the first 2 tiles are always blank, but the IRQ latency will prevent me from resetting the scroll at a constant spot every time, so there will probably be a lot of jittering.
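The arithmetic in this post can be checked with a short Python sketch. The constants come from the post itself; the one assumption is that the interrupted instruction can have anywhere from 0 to 7 cycles left, which is what makes the worst case add up to 72 PPU cycles.

```python
PPU_DOTS_PER_CPU_CYCLE = 3   # NTSC
MMC3_IRQ_DOT = 260           # dot at which the post says the MMC3 IRQ fires
IRQ_ENTRY_CYCLES = 7         # 6502 interrupt sequence
HANDLER_CYCLES = 10          # save A, load second $2006 byte, write it (from the post)

def write_dot(leftover_cycles):
    """Dot at which the second $2006 write lands for a given instruction leftover."""
    cpu_cycles = leftover_cycles + IRQ_ENTRY_CYCLES + HANDLER_CYCLES
    return MMC3_IRQ_DOT + cpu_cycles * PPU_DOTS_PER_CPU_CYCLE

earliest = write_dot(0)      # best case: the IRQ is taken immediately
latest = write_dot(7)        # assumed worst case: 7 cycles left in the current instruction
jitter = latest - earliest   # width of the window in which the write can land
```

With these numbers the write lands somewhere between dot 311 and dot 332, so the jitter window is 21 dots wide and crosses the point (dot 321) where the PPU starts fetching tiles for the next scanline.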
Re: Changing the scroll every 2 scanlines
Here's another crazy thought: while the viewport is rendering, only run logic that doesn't use the X register, so it can contain the second $2006 byte always ready to be written as soon as the IRQ fires. That should guarantee that the scroll is always changed before PPU cycle 320, even on PAL, right?
Losing X for a while every frame sucks, but I think I can cast rays using only A and Y, for example.
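A quick way to sanity-check the "before PPU cycle 320" estimate is the Python sketch below. It assumes the handler starts with a 4-cycle absolute store of X to $2006, a 7-cycle interrupt entry, up to 7 leftover cycles from the interrupted instruction, and an IRQ at dot 260 on both systems. The PAL ratio of 3.2 dots per CPU cycle is standard; the other figures are assumptions made for this example.

```python
from fractions import Fraction

DOTS_PER_CPU_CYCLE = {"NTSC": Fraction(3), "PAL": Fraction(16, 5)}

def latest_write_dot(region, irq_dot=260, leftover=7, irq_entry=7, store=4):
    """Worst-case dot at which the store to $2006 completes."""
    cpu_cycles = leftover + irq_entry + store
    return irq_dot + cpu_cycles * DOTS_PER_CPU_CYCLE[region]

ntsc = latest_write_dot("NTSC")   # 260 + 18 * 3
pal = latest_write_dot("PAL")     # 260 + 18 * 3.2
```

Under those assumptions both worst cases stay below dot 320, which agrees with the estimate in the post, though only by a few dots on PAL.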
Re: Changing the scroll every 2 scanlines
Quote: That's a very different approach, I like it. It would be a hell of a slow VM, but at least you'd be doing something instead of just waiting! I wouldn't do this in this particular case though, because delayed scroll changes with the MMC3 still sound faster.
You are right, but my idea doesn't require the MMC3. Also, I thought you hated the MMC3's IRQ (and personally I agree it's weird/inconvenient compared to the MMC5, the FDS or the VRC series, for example).
Quote: Here's another crazy thought: while the viewport is rendering, only run logic that doesn't use the X register, so it can contain the second $2006 byte always ready to be written as soon as the IRQ fires. That should guarantee that the scroll is always changed before PPU cycle 320, even on PAL, right?
Definitely possible, but it sounds like the main thread would be extremely painful to code. The time lost because you lose a register, which implies more memory loads and stores, will probably be on par with the time lost by waiting in each IRQ.
Also, I didn't mention it, but I think the MMC3 can fire IRQs in two different positions in the scanline depending on which pattern table (left or right) is used for BG and sprites. Have you considered both possibilities?
Re: Changing the scroll every 2 scanlines
I do dislike how the MMC3's scanline counter kills the versatility of 8x16 sprites, but I won't be needing to use sprites from both pattern tables this time, so it's fine.
Not using X would only affect part of the main thread, which should probably be split into 2 threads because of this. I can think of a few tasks that will work decently using only A and Y. It's certainly faster than a VM with constant-timed instructions.
Yes, I have considered the alternate MMC3 IRQ timing, which fires later than in the normal setup, so it wouldn't help me set the scroll sooner, but would result in less wasted time in case I decided to wait for the next HBlank to change the scroll.
Re: Changing the scroll every 2 scanlines
I just had the random thought of making those lines blank, hiding any glitches if you change scroll mid-line (...in theory). The problem is that it pretty much creates a scanlines effect, which may come up as annoying.
Re: Changing the scroll every 2 scanlines
Sik wrote: I just had the random thought of making those lines blank, hiding any glitches if you change scroll mid-line (...in theory). The problem is that it pretty much creates a scanlines effect, which may come up as annoying.
I don't see how that would help save time, seeing as I'd still have to wait for the next HBlank in order to turn rendering back on... Also, in this particular case, scanlines would compromise the dithering method I plan on using to create more colors.
I'll probably try setting the scroll at the start of the next HBlank (at this point we can be sure there'll be no glitches), effectively wasting 1 scanline every 2 scanlines. Yes, it sucks, but the alternative of having 2 main threads (one of them unable to use X) seems like hell to manage.
Re: Changing the scroll every 2 scanlines
I don't quite understand what you're trying to do, perhaps post a mock screenshot?
Re: Changing the scroll every 2 scanlines
Someone is using the top 2 rows of each tile as a constant CHR table and rendering to nametables, with each map entry representing an 8x2 pixel area of the screen. This requires 4-screen mirroring (TVROM) and requires changing the scroll every 2 scanlines. See FMV on the NES for more on this technique. And if you don't want to hog the CPU for the entire picture, you need to have a mapper-generated interrupt trigger the writes to the scroll register that change the scroll.
Re: Changing the scroll every 2 scanlines
tepples is better with words than I am, but here's my explanation anyway:
I'm considering rendering the graphics for a raycaster this way. Each tile contains 2 software pixels, one on the left and one on the right, so these pixels are really tall. I don't want them to be this tall, but this is the only way I can store all possible combinations of colors in under 256 tiles, so that I don't have to update the pattern tables during gameplay. In order to have more acceptable pixels I want to display only 2 rows of each tile and skip the other 6.
The ultimate goal is to resize a 256x480-pixel area to 256x120, and for that I need to change the scroll every 2 scanlines. I can't spend a lot of CPU time on this though, because raycasting is already a very CPU-intensive task.
And here's why explaining your ideas to other people is good: I JUST HAD AN IDEA: Instead of changing the scroll every 2 scanlines, I can do it every 4 scanlines! I just have to display the bottom of a row of tiles followed by the top of the one below it. Losing 1 scanline out of 4 is much more acceptable than what I was considering before.
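The 4-scanline idea can be spelled out as a schedule. The Python sketch below lists, for each of the 120 viewport scanlines, which tile row and which row of pixels inside the tile gets displayed, and whether a scroll change is needed before that line. It assumes the first two lines of each group show pixel rows 6 and 7 of an even tile row, after which the PPU moves on to pixel rows 0 and 1 of the next tile row by itself; the function name is made up.

```python
def four_line_schedule(viewport_lines=120):
    """Return (scanline, tile_row, fine_y, needs_scroll_change) for each line."""
    schedule = []
    for line in range(viewport_lines):
        group, phase = divmod(line, 4)
        if phase < 2:
            # Bottom two pixel rows of an even tile row.
            tile_row, fine_y = 2 * group, 6 + phase
        else:
            # The PPU wraps into the top two pixel rows of the next tile row on its own.
            tile_row, fine_y = 2 * group + 1, phase - 2
        schedule.append((line, tile_row, fine_y, phase == 0))
    return schedule

schedule = four_line_schedule()
changes = sum(1 for entry in schedule if entry[3])
```

All 60 tile rows are still visited, but only 30 scroll changes are needed instead of 60.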
- Celius
Re: Changing the scroll every 2 scanlines
Dwedit wrote: I don't quite understand what you're trying to do, perhaps post a mock screenshot?
Another explanation: he's basically trying to scale a really tall image vertically by changing the Y scroll every 2 scanlines. Unaltered, the image would span 2 nametables and would appear to be stretched really tall. Scrunching the image makes it appear normal.
tokumaru wrote: And here's why explaining your ideas to other people is good: I JUST HAD AN IDEA: Instead of changing the scroll every 2 scanlines, I can do it every 4 scanlines! I just have to display the bottom of a row of tiles followed by the top of the one below it. Losing 1 scanline out of 4 is much more acceptable than what I was considering before.
Genius!
Re: Changing the scroll every 2 scanlines
Quote: And here's why explaining your ideas to other people is good: I JUST HAD AN IDEA: Instead of changing the scroll every 2 scanlines, I can do it every 4 scanlines! I just have to display the bottom of a row of tiles followed by the top of the one below it. Losing 1 scanline out of 4 is much more acceptable than what I was considering before.
Very clever indeed. I can just see how you got excited.
The only issue is that scrolling to line #6 of fine scroll is more annoying than scrolling to line #0, #1, #2 or #3 because it can't be done with $2006 alone, but I guess this is a minor issue in your case.
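To illustrate why fine Y 6 is out of reach for $2006 alone, here is a Python sketch with a tiny model of the PPU's internal t, v and w scroll registers, plus the commonly documented four-write sequence ($2006, $2005, $2005, $2006) that sets a complete scroll position mid-frame. The register model is a simplification that ignores fine X, and the function names are illustrative.

```python
def split_scroll_writes(x, y, nametable):
    """Register writes for a full mid-frame scroll to pixel (x, y) of `nametable`."""
    return [
        (0x2006, (nametable << 2) & 0xFF),
        (0x2005, y & 0xFF),
        (0x2005, x & 0xFF),
        (0x2006, (((y & 0xF8) << 2) | (x >> 3)) & 0xFF),
    ]

def apply_writes(writes):
    """Simplified model of the t/v/w registers; returns v after the writes."""
    t = v = w = 0
    for reg, d in writes:
        if reg == 0x2005 and w == 0:      # first $2005 write: coarse X
            t = (t & 0x7FE0) | (d >> 3)
        elif reg == 0x2005:               # second $2005 write: fine Y and coarse Y
            t = (t & 0x0C1F) | ((d & 0x07) << 12) | ((d & 0xF8) << 2)
        elif w == 0:                      # first $2006 write: bit 14 is always cleared
            t = (t & 0x00FF) | ((d & 0x3F) << 8)
        else:                             # second $2006 write: low byte, then v = t
            t = (t & 0x7F00) | d
            v = t
        w ^= 1
    return v

def fine_y(v):
    return (v >> 12) & 7

# Pixel row 6 of tile row 5 in name table 2, via the four-write sequence.
v = apply_writes(split_scroll_writes(0, 5 * 8 + 6, 2))
# Highest fine Y reachable with two $2006 writes, whatever the high byte is.
best_2006_only = max(fine_y(apply_writes([(0x2006, h), (0x2006, 0)])) for h in range(256))
```

In this model the four-write sequence reaches fine Y 6, while two $2006 writes top out at fine Y 3 because the first write clears the top bit of the fine Y field.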
Sounds like a raycaster with decent graphics and framerate is on the way.
Quote: See FMV on the NES for more on this technique.
I can't believe I completely missed this topic back then. The problem is that while the technique is impressive, the FMV demos themselves are very unimpressive. It would probably take real handcrafted artistic work to make this meaningful, and that requires, well, a good artist who has lots of time.