Native NES Tracker

Discuss NSF files, FamiTracker, MML tools, or anything else related to NES music.

Moderator: Moderators

tepples
Posts: 22345
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

Post by tepples »

Kasumi wrote: The much smaller SRAM space NES has becomes a problem though
On both the Game Boy and the NES, PRG RAM is an 8 KiB memory area, but Game Boy mappers are more likely to support bankswitching this. The only boards I know of with more than 8 KiB of PRG RAM are SXROM and EWROM. (SOROM and ETROM back up only half of their 16 KiB PRG RAM.) Neither of these is common: SXROM is Japan-only, and EWROM was used for one game: Romance of the 3 Kingdoms 2. More importantly, I'm not sure how widely supported they are in emulators and copiers originating from 72-pin territories. If it doesn't work on PowerPak and PocketNES, there really isn't much of an advantage of an NES-native app; one might as well use an app for a netbook that compiles NSF to play through Blargg's Game_Music_Emu.
User avatar
neilbaldwin
Posts: 481
Joined: Tue Apr 28, 2009 4:12 am
Contact:

Post by neilbaldwin »

Thanks Tepples, Kasumi - some interesting and useful information.

I have to say I'm a little less optimistic. It's a real shame - I think the 8K RAM is the Achilles Heel of the whole concept. Dividing up the RAM for the necessary data tables, it doesn't leave you much room to play with in terms of storing music data. And because the normal RAM is so small there doesn't seem much scope for using that as a work area then compressing the data to SRAM for saving.

My original though was to utilise the entire SRAM for one song. On PowerPak, could this be workable? Would it be easy to manage several different save files?
User avatar
neilbaldwin
Posts: 481
Joined: Tue Apr 28, 2009 4:12 am
Contact:

Post by neilbaldwin »

Progress Update

First off I totally redesigned the layout and have come up with what seems (through consulting a few tracker users) an acceptable layout and a (make-the-best-of-limited-resources) SRAM memory map.

I figured one of the most important aspects of how the editor will feel is to have a quick as possible screen data refreshing. I've coded a scalable refresh system for each "panel" of information - you specify how many lines and how many lines-per frame need updating. What this means is that you can make sure you specify a as-fast-as-possible refresh speed for the editing "panel" you are currently in and any other panels that don't currently have focus can be instructed to refresh at a slower speed. It works really well so far.

However, to get the maximum possible speed, I'm doing absolute read/writes from buffers to screen i.e.

lda buffer
sta $2007
lda buffer+1
sta $2007
lda buffer+2
sta $2007

and so on. Thus all my "screen writes" go to the buffers and the buffers are instructed when/how much to throw at the screen per frame.

While this is fast, the downside is that it's taking up a massive amount of code space (there's a total of over 600 lda/sta pairs to handle the various buffers).

Just wondering if anyone has any clever ideas for a fast-but-ROM-space-efficient DMA methods?

I might end up optimising the speed downwards once the editor is fully working but I can't tell yet how it's going to pan out.
User avatar
tokumaru
Posts: 12106
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru »

neilbaldwin wrote:Just wondering if anyone has any clever ideas for a fast-but-ROM-space-efficient DMA methods?
I wrote about the trick I use for fast Name Table updates in this post. With careful manipulation of the index register you can have an unrolled transfer loop just as fast as the one you currently have but more versatile because the source address is not so hardcoded, it can be manipulated to some extent.

The current implementation of this technique that I'm using has only 32 pairs of LDA STA (instead of the 128 like in the sample code) because that's the longest a contiguous NT update will be, so it doesn't take much ROM. That limits me to copying up to 32 bytes at a time from anywhere in a buffer of 128 bytes.

I don't know what exactly you are updating (Name Tables? Pattern Tables?) and how the data is divided, so I can't give more specific suggestions, but with some indexing and by breaking your long runs of LDA STA into smaller ones (partially unrolled loops) it should be possible to reduce ROM use without much loss of speed.
User avatar
neilbaldwin
Posts: 481
Joined: Tue Apr 28, 2009 4:12 am
Contact:

Post by neilbaldwin »

tokumaru wrote:
neilbaldwin wrote:Just wondering if anyone has any clever ideas for a fast-but-ROM-space-efficient DMA methods?
I wrote about the trick I use for fast Name Table updates in this post. With careful manipulation of the index register you can have an unrolled transfer loop just as fast as the one you currently have but more versatile because the source address is not so hardcoded, it can be manipulated to some extent.

The current implementation of this technique that I'm using has only 32 pairs of LDA STA (instead of the 128 like in the sample code) because that's the longest a contiguous NT update will be, so it doesn't take much ROM. That limits me to copying up to 32 bytes at a time from anywhere in a buffer of 128 bytes.

I don't know what exactly you are updating (Name Tables? Pattern Tables?) and how the data is divided, so I can't give more specific suggestions, but with some indexing and by breaking your long runs of LDA STA into smaller ones (partially unrolled loops) it should be possible to reduce ROM use without much loss of speed.
It's all NT transfers. I have 5 areas of the screen (22 x 6, 27 x 12, 5 x 6, 5 x 6, 5 x 6, 5 x 14) that need updating so I have a set of LDA/STA sequences for each area. I've split them into lines so, for example, the first area has 6 subroutines that writes 22 characters from the buffer to the screen. If I want to update one line of the area I can. If I need to update the whole area I call each of the six subroutines in turn.

Ironically it's the 27 x 12 one that needs updating the fastest :)
User avatar
tokumaru
Posts: 12106
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru »

Yeah, that's a lot of stuff to update. The 27 x 12 area alone is already too much to do in a single (NTSC) VBlank, even with the fastest unrolled code possible.

If you don't need to update the screen every frame (which is probably the case, since you're showing information for humans to read and they just can't read stuff that changes as fast as 60Hz), a full Name Table can comfortably be written in 4 VBlanks, so you could take your time and draw all the new info to the hidden Name Table and display it when it's ready, so screen updates would happen at 15Hz, maybe this is enough for your purposes.

What exactly is the 27 x 12 area? If by any chance it's a window with scrolling stuff (I read something like this about your tracker), instead of doing software scrolling you could do hardware scrolling by drawing the scrolling text/numbers to the alternate NT and use a split screen effect to display any portion of it that you wish. This way you'd only have to update a single line at a time, much like games that scroll (they only draw the new tiles that enter the screen, the parts that were already visible are just shifted by hardware scrolling).

EDIT: BTW, in this thread I saw that you were having problems with memory constraints. I don't know if you have solved that or not, but I figured I'd suggest this anyway: If there are free tiles at the sprite or background side of the pattern tables, you could use CHR-RAM (if you aren't already), group all the unused space together and use it as an extra block of memory. Of course this memory is much more difficult to use than regular RAM or WRAM (since you can only access it through $2006/7 and while not rendering), but it's still usable if you're short on memory. I'm sure there is some data that doesn't need constant access that could be stored there.
User avatar
neilbaldwin
Posts: 481
Joined: Tue Apr 28, 2009 4:12 am
Contact:

Post by neilbaldwin »

tokumaru wrote:Yeah, that's a lot of stuff to update. The 27 x 12 area alone is already too much to do in a single (NTSC) VBlank, even with the fastest unrolled code possible.

If you don't need to update the screen every frame (which is probably the case, since you're showing information for humans to read and they just can't read stuff that changes as fast as 60Hz), a full Name Table can comfortably be written in 4 VBlanks, so you could take your time and draw all the new info to the hidden Name Table and display it when it's ready, so screen updates would happen at 15Hz, maybe this is enough for your purposes.

What exactly is the 27 x 12 area? If by any chance it's a window with scrolling stuff (I read something like this about your tracker), instead of doing software scrolling you could do hardware scrolling by drawing the scrolling text/numbers to the alternate NT and use a split screen effect to display any portion of it that you wish. This way you'd only have to update a single line at a time, much like games that scroll (they only draw the new tiles that enter the screen, the parts that were already visible are just shifted by hardware scrolling).

EDIT: BTW, in this thread I saw that you were having problems with memory constraints. I don't know if you have solved that or not, but I figured I'd suggest this anyway: If there are free tiles at the sprite or background side of the pattern tables, you could use CHR-RAM (if you aren't already), group all the unused space together and use it as an extra block of memory. Of course this memory is much more difficult to use than regular RAM or WRAM (since you can only access it through $2006/7 and while not rendering), but it's still usable if you're short on memory. I'm sure there is some data that doesn't need constant access that could be stored there.
Yeah, far too much :)

As I said in the previous post, I have a dynamic drawing system so I can tell the drawing routine to only DMA 1 line per-frame or as many as there is time in the VBLANK to process. That's not really the problem. Thanks for all the information and suggestions though, it's really appreciated.

The big 27 x 12 area is the note data editing window. Yes, essentially I'm software scrolling it (or will be once it's written). It's a window onto data that can be a maximum of 48 lines deep (48 steps in a sequence of note data). The "12" in the dimensions is because I'm only showing 12 lines at any time. I don't really get how hardware scrolling could help me there. I know how hardware scrolling works (in principle) but surely the problem is similar - I'd still have to be able to shift 11 x 27 lines of text up or down somehow once I've scrolled the other screen 8 pixels?

That CHAR-RAM idea is nice, didn't think of that :)
User avatar
tokumaru
Posts: 12106
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru »

neilbaldwin wrote:The big 27 x 12 area is the note data editing window. Yes, essentially I'm software scrolling it (or will be once it's written). It's a window onto data that can be a maximum of 48 lines deep (48 steps in a sequence of note data). The "12" in the dimensions is because I'm only showing 12 lines at any time. I don't really get how hardware scrolling could help me there.
Excuse my crappy drawing, but look a this:

Image

Here are the 2 Name Tables. The first one has the basic screen layout you want your interface to have, and the second one contains nothing but the list of note data. To display the list in the red area of the first NT, you'll have to use a sprite hit, IRQ, whatever to detect the start of the red area. At that point you modify $2005/$2006 to point to the first line of the note list you want to display. After 96 scanlines (12 lines of data) you change the scroll back to the first NT.

You have to use vertical mirroring, so that the list wraps around, and then you'll only need to draw a single row at the bottom or the top of the list as new information needs to be displayed (no need to update the whole 12 lines, hadware scrolling will take care of displacing the other 11), exactly like any game that scrolls vertically.

Of course the process of switching from one NT to the other varies a little depending on where you want the red box to be. It's easier if it's at the very top or the very bottom of the screen, because then you'll need to switch Name Tables only once during the frame, but if you have the means to count 96 scanlines (IRQs or timed code, for example) it isn't much harder to have it by the middle.

another advantage of this is that you can even smooth scroll the list if you want to, so that it looks more pleasing.
User avatar
neilbaldwin
Posts: 481
Joined: Tue Apr 28, 2009 4:12 am
Contact:

Post by neilbaldwin »

Maybe I'm misunderstanding but it seems that would work as long as you had no more than 30 lines to display (i.e. the whole of the 2nd NT).

If you need to scroll 64 lines, surely there is still some manual shifting of the data to do?

In other news: after reading your other thread I changed the two largest unrolled loops to be only half-unrolled. Works like a charm :)
Bananmos
Posts: 551
Joined: Wed Mar 09, 2005 9:08 am
Contact:

Post by Bananmos »

Yes, you will run out of lines eventually, but rather than shift all data manually you can adjust to the fact that the nametable will wrap around from line 29 to line 0 again, and write your lines there. Essentially, you'll need to treat the second nametable as a kind of circular buffer for pattern lines.

Of course, this makes updating a little more akward and means you cannot keep your entire PPU version in the pattern all at once, only a portion of it, but it'll still save you PPU updates with the tradeoff of more complicated code for the nametable updates.
User avatar
tokumaru
Posts: 12106
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru »

In case something wasn't clear, here are two animations that might help.

Scrolling:

Image

The red square is the new scroll point you have to set during rendering. The red box shows what part of the second NT gets displayed during rendering, as seen in the animation at the left.

Updating (when necessary):

Image

Here you can see how wrapping works. As long as you have NT mirroring set to vertical, setting the scroll to after line 18 will cause line 00 to be automatically displayed after line 29. So all you have to do is rewrite line 00 with new information. As long as you rewrite the lines before they are displayed, it will look like a continuous list. Lines that need rewriting are highlighted in red in the animation.

As you can see, you only need to rewrite 1 line at a time. Sure it adds some complexity (not much, really), but IMO it's better than reducing the frame rate or shoving too much stuff in the NMI. As a plus, like I said before, you get the chance to use smooth scrolling, which looks much better IMO.
User avatar
tokumaru
Posts: 12106
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru »

Bananmos wrote:this makes updating a little more akward
The updating is actually pretty simple. If instead of having special checks to update only when it wraps, it's much simpler start only with lines 00-11 drawn (leave anything on lines 12-29, as they won't be displayed initially anyway), and always draw a new line just below the red box (or above, if you scroll up) when scrolling happens. It's a pretty strightforward process.

What is a bit awkward, IMO, is having to switch from one NT to the other and then back, but it's perfectly possible and if done right it will look perfect, no user will be able to tell what's happening.
Bananmos
Posts: 551
Joined: Wed Mar 09, 2005 9:08 am
Contact:

Post by Bananmos »

Actually, I'm not 100% sure that smooth scrolling is such a good idea in this case... it'll sure look nicer, but it might also be a little deceiving since the lines in a tracker layout visualize how each line is a discrete entity, whose effect takes place the instant you switch to it, and having things go smoother might make this less obvious. Then again, this may be a non-issue that won't give any bother after all. Only way to know is to compare and see for yourself :)
tepples
Posts: 22345
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

Post by tepples »

tokumaru wrote:a full Name Table can comfortably be written in 4 VBlanks, so you could take your time and draw all the new info to the hidden Name Table and display it when it's ready, so screen updates would happen at 15Hz, maybe this is enough for your purposes.
Unless you're tracking at 150 BPM (tempo $96) with 32nd note resolution (speed 3).
What exactly is the 27 x 12 area? If by any chance it's a window with scrolling stuff (I read something like this about your tracker), instead of doing software scrolling you could do hardware scrolling by drawing the scrolling text/numbers to the alternate NT and use a split screen effect to display any portion of it that you wish.
What's the best way to do more than one split (one at the top of the region, one at the bottom) on a mapper without a scanline or CPU cycle counter and without wasting over one-third of the CPU time? Or what mapper with such a counter is replicable?
you could use CHR-RAM (if you aren't already), group all the unused space together and use it as an extra block of memory. Of course this memory is much more difficult to use than regular RAM or WRAM (since you can only access it through $2006/7 and while not rendering)
Given that someone's already having trouble fitting things into 2300 cycles, I'm not sure how helpful that would be.
User avatar
Memblers
Site Admin
Posts: 3901
Joined: Mon Sep 20, 2004 6:04 am
Location: Indianapolis
Contact:

Post by Memblers »

I know this is kinda going against the grain here, but there is one advantage to having the patterns not scroll at all - you can edit it while it's playing. Though this could come off as looking kind of cheap, to anyone who doesn't care about that feature.
Post Reply