za909 "helpme" - thread
I figured I should not clog up the forum by asking a question every so often, so I made this thread...
Currently I'm struggling to understand what exactly CHR-RAM is. So it's an 8 KiB RAM chip on the board... but how do I use it? The confusion began when I read about converting an NROM game to use CHR RAM, which seemingly copied all the tile data into the PPU "manually" through $2007. Then I had a look at a few UNROM games to see where that RAM is present. Emulators only show a huge wall of simultaneously changing bytes, and the same goes for my own assembled UNROM project, which so far does nothing with the PPU other than initializing it and performing the OAM transfer in NMI.
This made me even more confused...
Then I checked online what a UNROM board looks like, to see if there really is RAM there. There is, so now I have no idea how I'm supposed to use CHR RAM. Do I simply copy from PRG to $6000-$7FFF, and whatever I put at $6000 the PPU will automatically see at PPU $0000? That would make sense considering that CHR ROM is connected and the PPU "sucks in" whatever a mapper allows it to see. Or do I need to take the graphics data from PRG ROM, copy it to CHR RAM, and then copy it again to the PPU via $2007 manually? That would make the whole idea of using a RAM chip almost pointless, because you could just copy stuff directly with $2007 and leave the RAM completely unused. I know the truth is somewhere in there, but I can't see it myself.
Also, how does constructing a cart work with UNROM donors if they have the wrong nametable mirroring soldered? Can you simply remove that and solder the other one?
Re: za909 "helpme" - thread
Great, I'm not 100% on this so now someone can correct me if I'm wrong, but here it goes:
When you are using CHR-RAM, you simply store the gfx data in the PRG-ROM and then use your own routines to copy that data to the gfx RAM in the PPU. The CHR-ROM way stores the gfx on a separate ROM chip on the Game Pak.
The $6000-$7FFF area is WRAM, or Work RAM. It's just extra RAM you can add to a Game Pak for your game to use.
CHR-RAM can be good if you want to animate your background, like creating faux parallax scrolling, or compress the gfx data (don't know if any NES game does this).
Re: za909 "helpme" - thread
The CHR RAM chip sits on the PPU bus, fully under the PPU's control, so writing through $2006/$2007 during vblank or forced blank is the only way to fill it.
za909 wrote: Do I simply copy from PRG to $6000-$7FFF and whatever I put in $6000, the PPU will automatically see at PPU $0000?
Unfortunately no. Like DoNotWant said, $6000-$7FFF is a completely separate, optional RAM chip that sits on the CPU/PRG bus.
DoNotWant wrote: When you are using CHR-RAM, you are simply storing the gfx data in the PRG-ROM...
Better said as "you are reading the gfx data contained in the PRG-ROM".
za909 wrote: Also, how does constructing a cart work with UNROM donors, if they have the wrong nametable mirroring soldered? Can you simply remove that and solder the other one?
I think so.
Re: za909 "helpme" - thread
In the same way that a game copies 30 rows of nametable data, each 32 bytes in length, to $2000-$23BF, a CHR RAM game copies up to 512 tiles of pattern table data, each 16 bytes in length, to $0000-$1FFF. See Switching to CHR RAM.
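To make the data path concrete, here is a minimal Python sketch of what those writes do. It is a toy model, not an emulator: the class and function names are made up for illustration, it assumes the +1 address increment mode, and it ignores timing and the rule that rendering must be off or in vblank.

```python
class PPUPortModel:
    """Toy model of the $2006/$2007 ports (hypothetical, not cycle-accurate)."""

    def __init__(self):
        self.chr_ram = bytearray(0x2000)  # 8 KiB seen at PPU $0000-$1FFF
        self.addr = 0
        self.high_byte_next = True        # $2006 takes the high byte first

    def write_2006(self, value):
        if self.high_byte_next:
            self.addr = (self.addr & 0x00FF) | ((value & 0x3F) << 8)
        else:
            self.addr = (self.addr & 0x3F00) | (value & 0xFF)
        self.high_byte_next = not self.high_byte_next

    def write_2007(self, value):
        if self.addr < 0x2000:            # pattern table range: lands in CHR RAM
            self.chr_ram[self.addr] = value & 0xFF
        self.addr = (self.addr + 1) & 0x3FFF  # assumes +1 increment mode


def upload_tiles(ppu, tile_data, first_tile=0):
    """Copy 16-byte tiles from 'PRG ROM' (a bytes object) into CHR RAM."""
    start = first_tile * 16
    ppu.write_2006(start >> 8)    # high byte of the PPU address
    ppu.write_2006(start & 0xFF)  # low byte
    for byte in tile_data:
        ppu.write_2007(byte)      # the address advances by itself
```

The CPU never addresses the RAM chip directly: it sets the PPU address once and then streams bytes through $2007, and those bytes end up in the cart's CHR RAM.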
Re: za909 "helpme" - thread
Thank you, so now I understand how this works. Simply the name was very misleading to me because it made me assume that this RAM being mentioned is something on the cart and not the PPU pattern RAM $0000-$1FFF.
I assume it's very easy to make the faux background parallax scrolling by ROL-ing or ROR-ing twice through all 16 bytes of a tile, I just have to get the lowest bit from the last one into the carry to start off.
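As a rough Python sketch of that rotation, assuming the repeating pattern is exactly one tile (8 pixels) wide: each of the 16 bytes is one 8-pixel row of one bit plane, so the bit that wraps around comes from the same byte rather than from the last byte of the tile. The function name is hypothetical.

```python
def rotate_tile_left(tile):
    """Rotate every row of a 16-byte NES tile one pixel left, with wraparound.

    Bytes 0-7 are bit plane 0 and bytes 8-15 are bit plane 1. Each byte is
    one 8-pixel row, so both planes get the same per-byte rotation.
    """
    if len(tile) != 16:
        raise ValueError("an NES tile is 16 bytes")
    return bytes(((b << 1) | (b >> 7)) & 0xFF for b in tile)


def rotate_tile_left_by(tile, pixels):
    """Apply the one-pixel rotation several times, e.g. 2 for a faster layer."""
    for _ in range(pixels):
        tile = rotate_tile_left(tile)
    return tile
```

If the pattern spans several tiles side by side, the carry would have to chain from one tile's row into the neighbouring tile's row instead.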
Also, mirrored sprites could be easier to make this way, because instead of physically swapping the left and right sprites, I can swap them in CHR RAM instead. At least for the player I could do that, since I will never draw anything during VBlank as no scrolling will be used for the playfield, only fadeouts, drawing the next room with rendering turned off, and then reenabling it next VBlank when it's finished. This gives me a lot of VBlank time to use for tile animations and palette changes.
Now that sound is fully operational, I'm wondering if there's a better way to test run times than putting a breakpoint at the jsr to the Play routine and then Stepping over in the FceuX debugger. Some automated test would be much better because I usually run everything in 1300-1400 cycles when not much is happening, but when a sound effect plays and/or music commands are read and processed, I can get around 2300 cycles. I kind of need to know what the absolute biggest workload can be, and not just assume based on averages and high peaks I happened to catch manually.
Re: za909 "helpme" - thread
za909 wrote: Thank you, so now I understand how this works. Simply the name was very misleading to me because it made me assume that this RAM being mentioned is something on the cart and not the PPU pattern RAM $0000-$1FFF.
It is on the cart. The cart controls what memory is mapped to each PPU region within $0000-$2FFF. It can map a given address to its own memory (the CHR RAM) or to the 2K video memory in the console. In fact, a few carts (Gauntlet, Rad Racer II, and Napoleon Senki) map nametable addresses to a RAM chip in the cart, and one (Magic Floor) uses half the video memory in the console for pattern table instead of nametable. But for the vast majority of carts, $0000-$1FFF is on the cart, and $2000-$2FFF is in the console (with some form of mirroring). And the UNROM board contains a RAM that it maps to $0000-$1FFF.
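For the common case described here (pattern tables on the cart, nametables in the console's 2K), the decoding can be sketched in Python as below. The function name and return format are made up for illustration, and only the two solder-pad mirroring modes of a board like UNROM are covered.

```python
def decode_ppu_address(addr, mirroring="vertical"):
    """Say which chip answers a PPU address on a typical CHR RAM cart.

    Returns (chip_name, offset_within_chip). Only $0000-$2FFF is modelled.
    """
    if not 0x0000 <= addr < 0x3000:
        raise ValueError("only $0000-$2FFF is covered by this sketch")
    if addr < 0x2000:
        return ("cart CHR RAM", addr)          # pattern tables
    if mirroring == "vertical":
        bank = (addr >> 10) & 1                # console VRAM A10 = PPU A10
    elif mirroring == "horizontal":
        bank = (addr >> 11) & 1                # console VRAM A10 = PPU A11
    else:
        raise ValueError("mirroring must be 'vertical' or 'horizontal'")
    return ("console VRAM", bank * 0x400 + (addr & 0x3FF))
```

Changing the mirroring solder pad only changes which PPU address line reaches the console VRAM's A10 input, which is why swapping it on a donor board is plausible.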
za909 wrote: I'm wondering if there's a better way to test run times than putting a breakpoint at the jsr to the Play routine and then Stepping over in the FceuX debugger. Some automated test would be much better because I usually run everything in 1300-1400 cycles when not much is happening, but when a sound effect plays and/or music commands are read and processed, I can get around 2300 cycles. I kind of need to know what the absolute biggest workload can be, and not just assume based on averages and high peaks I happened to catch manually.
To visually see how long your code is taking, you can turn on bit 0 of PPU port $2001 (layer enable and tint control) for about 340 cycles and then turn it off. This temporarily forces the PPU to use column 0 of the palette (light grays) for a few scanlines, which should produce a gray stripe across your screen. The lower the stripe, the more CPU time you're using.
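Code:
PPUMASK   = $2001
BG_ON     = %00001010  ; show background, including the leftmost 8 pixels
OBJ_ON    = %00010100  ; show sprites, including the leftmost 8 pixels
LIGHTGRAY = %00000001  ; bit 0: grayscale

; Draw a light gray bar for 3 scanlines, which is about 341 CPU cycles
draw_timing_stripe:
  ldy #BG_ON|OBJ_ON|LIGHTGRAY
  sty PPUMASK
  ldy #67  ; this is (340/5) - 1
@wait_around:
  dey               ; 2 cycles
  bne @wait_around  ; 3 cycles while the branch is taken
  ldy #BG_ON|OBJ_ON
  sty PPUMASK
  rts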
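As a quick check of the "3 scanlines" figure, the cycle count between the two PPUMASK writes can be tallied in Python. This is only arithmetic on standard 6502 instruction timings; it assumes the branch does not cross a page boundary and NTSC timing of 3 PPU dots per CPU cycle, and the function names are made up.

```python
# Standard 6502 cycle counts for the instructions in the routine
DEY = 2
BNE_TAKEN = 3        # assumes the branch target is in the same page
BNE_NOT_TAKEN = 2
LDY_IMMEDIATE = 2
STY_ABSOLUTE = 4


def cycles_between_writes(loop_count):
    """CPU cycles from the first PPUMASK write to the second one."""
    loop = (loop_count - 1) * (DEY + BNE_TAKEN) + (DEY + BNE_NOT_TAKEN)
    return LDY_IMMEDIATE + loop + LDY_IMMEDIATE + STY_ABSOLUTE


def scanlines(cpu_cycles, dots_per_cycle=3, dots_per_line=341):
    """Convert CPU cycles to scanlines (NTSC values assumed by default)."""
    return cpu_cycles * dots_per_cycle / dots_per_line
```

With the loop count of 67 from the routine, this gives a stripe very close to three scanlines tall; changing the count scales the stripe height by 5 cycles per step.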
Re: za909 "helpme" - thread
za909 wrote: I assume it's very easy to make the faux background parallax scrolling by ROL-ing or ROR-ing twice through all 16 bytes of a tile
Simple, yes, but also slow. VBlank time is quite short (about 2273 CPU cycles), so even with a lot of tricks and code unrolling you can't send more than 20 or so tiles each frame. With forced blank you can maybe double that number before players notice that a big part of the screen is missing.
But that would be if you were only updating patterns, which is almost never the case, since you also have to update sprites, palettes, name tables, and so on.
za909 wrote: Also, mirrored sprites could be easier to make this way, because instead of physically swapping the left and right sprites, I can swap them in CHR RAM instead.
You might enjoy programming for the Master System, where there's no hardware sprite flipping (it has background flipping though, which the NES lacks), but on the NES this is not practical at all, because of the short VBlank time I already mentioned.
If you think that physically mirroring the sprite positions is too much trouble, a better solution is to write 2 metasprite definitions for each animation frame, one facing left and another facing right.
za909 wrote: At least for the player I could do that, since I will never draw anything during VBlank as no scrolling will be used for the playfield
You'll have to do the math and decide if this is worth the trouble. Even if you're doing nothing more than a sprite DMA and setting the scroll in your VBlank handler, there's only enough time left for updating 12 or so tiles, with highly optimized code.
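"Doing the math" can be sketched in Python as follows. The 2273-cycle VBlank figure is the one quoted above; the 513 to 514 cycle cost of a sprite DMA is standard. The cycles-per-byte values are assumptions about how the copy loop is written (for example 8 for an unrolled `lda abs,x` / `sta $2007` pair), not measurements, and the function name is hypothetical.

```python
VBLANK_CYCLES = 2273    # NTSC VBlank length quoted above
OAM_DMA_CYCLES = 514    # a sprite DMA through $4014 takes 513 or 514 cycles
BYTES_PER_TILE = 16


def tiles_per_vblank(cycles_per_byte, overhead=0):
    """Upper bound on whole tiles copied to CHR RAM in one VBlank.

    cycles_per_byte is an assumption about the copy code, and overhead
    covers everything else done in VBlank (DMA, scroll, palette writes).
    """
    available = max(VBLANK_CYCLES - overhead, 0)
    return available // (cycles_per_byte * BYTES_PER_TILE)
```

For instance, `tiles_per_vblank(8)` is the ceiling when nothing else runs in VBlank, and `tiles_per_vblank(8, OAM_DMA_CYCLES)` shows how much a single sprite DMA takes away; both land in the same range as the estimates above.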
Re: za909 "helpme" - thread
Thanks, I got it to work. For the time being I made a (not very tidy) sprite 0 hit loop that first waits for the end of the pre-render line and then for the sprite 0 hit, since I don't need any kind of screen split. It looks fairly consistent, but every so often a huge peak occurs. That's good to know, and hopefully I'll see the improvement as I hunt down badly optimised routines. Don't mind the periodically occurring test sound effect.
EDIT: I was relying on the FceuX OLD PPU startup state with the pattern tables, so it would not work on most emulators, I fixed that.
Attachments:
- Test fixed.nes (128.02 KiB)
- Test.nes (128.02 KiB)
Last edited by za909 on Sun Apr 12, 2015 10:40 am, edited 1 time in total.
Re: za909 "helpme" - thread
If each character is 16x32 pixels, each cel of animation is 8 tiles. This means you can easily upload one cel to VRAM per vblank without extending vblank, so long as you don't have all characters changing their cel at the same time. For example, you can animate five characters independently at the Disney-standard 12 fps.
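The arithmetic behind that example, as a small Python sketch. It assumes 8x8-pixel tiles and a 60 Hz frame rate (NTSC); the function names are made up.

```python
def tiles_per_cel(width_px, height_px, tile_px=8):
    """Number of 8x8 tiles in one animation cel of the given size."""
    return (width_px // tile_px) * (height_px // tile_px)


def characters_sharing_vblank(frame_rate=60, animation_fps=12):
    """How many characters can take turns if each uploads one cel per turn."""
    return frame_rate // animation_fps


def bytes_per_cel(width_px, height_px):
    """CHR RAM bytes written per cel (16 bytes per tile)."""
    return tiles_per_cel(width_px, height_px) * 16
```

A 16x32 cel comes out at 8 tiles, and at 12 fps each character only needs a new cel every fifth frame, so five characters can each get their own VBlank in rotation.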
Re: za909 "helpme" - thread
I've been reading around now that I've also got the controller working and can fill CHR-RAM at tile level. I asked my friend, who could put together carts for me if this actually becomes a thing... and apparently I could have 256k EPROMs available. I'm not sure, though, whether UNROM donors natively support bankswitching a 256k ROM thanks to the 4-bit latch, or whether they don't even physically have that bit, which would leave only UOROM donors to work with unless the mapper logic is replaced to support 16 banks instead of 8.
Re: za909 "helpme" - thread
Discrete logic boards are so very simple that you should seriously look into making them new instead of reworking donors.
That said, all UxROM boards always have a 74'161 and a 74'32 and could be modified into supporting 256KiB by rewiring four lines. And even if you do start with a donor, your field of options is a lot larger; you can add a 74'161 and 74'32 (since both should be quite cheap) to basically anything with CHR-RAM (and a CIC if you're in NES-land instead of Famiclone-land).
(Also, make sure you got 256 KiB of EPROM or EEPROM, not 256 Kibit)
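The bank counts and the KiB versus Kibit warning come down to simple arithmetic, sketched here in Python. It assumes the usual 16 KiB UxROM bank size; the function names are made up.

```python
BANK_SIZE = 16 * 1024  # UxROM switches PRG ROM in 16 KiB banks


def bank_info(rom_bytes):
    """Return (number of 16 KiB banks, bank-select bits needed)."""
    banks = rom_bytes // BANK_SIZE
    select_bits = max(banks - 1, 0).bit_length()
    return banks, select_bits


def kibit_to_bytes(kibit):
    """EPROM part names are often in bits: 1 Kibit = 1024 bits = 128 bytes."""
    return kibit * 1024 // 8
```

A 128 KiB ROM needs 3 select bits (8 banks), a 256 KiB ROM needs 4 (16 banks), while a "256 Kibit" chip holds only 32 KiB, which is just 2 banks.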
Re: za909 "helpme" - thread
OK, I've been making my new system functions quite confidently. But now that I'm getting around to designing my metatile system for the background, I need to ask this: what are the pros and cons of the different data formats, such as having hard-coded collision maps vs. generated ones in RAM? Also, what kinds of data sizes should I expect when planning my data budget? Currently I have one fixed bank, one sound bank, two graphics banks for bg and sprites, and another one with 4 kB of tile animations, palette data, and any common routines that don't have to take space from my fixed bank.
Re: za909 "helpme" - thread
So I've been thinking, and for a cutscene I will probably want to show a scrolling background (just with the two nametables repeating over and over again) and show text at the same time, so I'd have to split the screen with a sprite 0 hit. But when do I need to do that? I read that accessing $2000 during rendering with vertical mirroring can cause problems, or is that completely gone if I turn rendering off? Getting a few lines of blackness would be fine because the text field would be a black rectangle at the bottom anyway. Or do I need to time setting the new scroll in HBlank? If so, which PPU cycles are actually during HBlank? From 257 and onward or what?
And I've also been thinking about my own way of detecting the system region. Is it a good idea to select a one-shot NMI handler for the first NMI which then does a loop to burn more CPU cycles than the length of an NTSC VBlank, but fewer cycles than a PAL VBlank, and then check if the VBlank flag has been cleared or not?
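To see what such a burn loop could and could not tell apart, here is a Python sketch of the VBlank lengths involved. The scanline counts and dot-to-cycle ratios are the commonly documented ones for each system; the model ignores NMI entry latency and the exact dot at which the flag clears, and the names are made up.

```python
DOTS_PER_LINE = 341

# Commonly documented timing: VBlank scanlines and PPU dots per CPU cycle
SYSTEMS = {
    "NTSC":  {"vblank_lines": 20, "dots_per_cpu_cycle": 3.0},
    "PAL":   {"vblank_lines": 70, "dots_per_cpu_cycle": 3.2},
    "Dendy": {"vblank_lines": 20, "dots_per_cpu_cycle": 3.0},
}


def vblank_cpu_cycles(system):
    """Approximate length of VBlank in CPU cycles."""
    s = SYSTEMS[system]
    return s["vblank_lines"] * DOTS_PER_LINE / s["dots_per_cpu_cycle"]


def flag_still_set(system, burned_cycles):
    """True if the VBlank flag would still be set after the burn loop."""
    return burned_cycles < vblank_cpu_cycles(system)
```

Under these assumptions, a burn of roughly 4000 cycles separates NTSC from PAL, but a Dendy-style famiclone has the same VBlank length as NTSC, so this test alone would lump those two together.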
Re: za909 "helpme" - thread
za909 wrote: I read that accessing $2000 during rendering with vertical mirroring can cause problems, or is that completely gone if I turn rendering off?
I wouldn't worry about that too much. This bug was only discovered quite recently, it only happens sometimes, and in the worst case you'll just get 1 glitchy scanline before the scroll catches up again with the value you wrote. If you write to $2000 near the end of the scanline, you'll have nothing to worry about.
On the other hand, turning the rendering off and on again is more complex and more sensitive to bugs, gotchas or NTSC/PAL differences.
Writing to $2000 and $2005 to change the horizontal scrolling is (in my opinion) the simplest split-screen effect you can do.
za909 wrote: If so, which PPU cycles are actually during HBlank? From 257 and onward or what?
The naming of cycles within a scanline is arbitrary, and if I remember correctly there are different conventions. In Nintendulator, if I remember correctly, HBlank is between 256 and 341. In any case, the best approach is trial and error (adding or removing nops before the register writes) using an accurate emulator such as Nestopia or Nintendulator, and then verifying on real hardware (if you're more patient you could test directly on hardware).
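To get a feel for how tight that window is, here is a small Python conversion between PPU dots and CPU cycles. It takes the 256 to 341 range above at face value (conventions differ, as noted) and assumes NTSC's 3 dots per CPU cycle; the function names are made up.

```python
def hblank_cpu_cycles(first_dot=256, last_dot=341, dots_per_cycle=3):
    """Length of the HBlank window in CPU cycles for the given dot range."""
    return (last_dot - first_dot) / dots_per_cycle


def dots_shifted_by_nops(nop_count, dots_per_cycle=3):
    """How far a register write moves when nops (2 cycles each) are added."""
    return nop_count * 2 * dots_per_cycle
```

So the window is a little over 28 CPU cycles on NTSC, and each nop added or removed during trial and error moves the write by 6 dots.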
Re: za909 "helpme" - thread
za909 wrote: And I've also been thinking about my own way of detecting the system region. Is it a good idea to select a one-shot NMI handler for the first NMI which then does a loop to burn more CPU cycles than the length of an NTSC VBlank, but less cycles than a PAL VBlank, and then check if the VBlank flag has been cleared or not?
Here's the code I use to distinguish among NTSC, PAL NES, and Dendy systems.