[My emulator] Graphics glitches - SuperMarioBros

Discuss emulation of the Nintendo Entertainment System and Famicom.

tepples
Posts: 22345
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

Check the attributes behind "1 PLAYER GAME" and "2 PLAYER GAME". Can you dump VRAM $23C0-$23FF from your emulator and from a known good one?
Bisqwit
Posts: 248
Joined: Fri Oct 14, 2011 1:09 am

Post by Bisqwit »

NameTable 0:
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
$$$?????$$$$$$$$$$ ????$$????$$$
$$$??????$$.)??$$$$?(?$$$$$$$$$$
$$$$$DHHHHHHHHHHHHHHHHHHHHI$$$$$
$$$$$FÐÑØØÞÑÐÚÞÑ&&&&&&&&&&J$$$$$
$$$$$FÒÓÛÛÛÙÛÜÛß&&&&&&&&&&J$$$$$
$$$$$FÔÕÔÙÛâÔÚÛà&&&&&&&&&&J$$$$$
$$$$$FÖ×Ö×á&ÖÝáá&&&&&&&&&&J$$$$$
$$$$$FÐèÑÐÑÞÑØÐÑ&ÞÑÞÑÐÑÐÑ&J$$$$$
$$$$$FÛBBÛBÛBÛÛB&ÛBÛBÛBÛB&J$$$$$
$$$$$FÛÛÛÛÛÛßÛÛÛ&ÛßÛßÛÛäå&J$$$$$
$$$$$FÛÛÛÞCÛàÛÛÛ&ÛãÛàÛÛæã&J$$$$$
$$$$$FÛÛÛÛBÛÛÛÔÙ&ÛÙÛÛÔÙÔÙçJ$$$$$
$$$$$_??????????x?????????z$$$$$
$$$$$$$$$$$$$Ï????$????????$$$$$
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
$$$$$$$$$Î$?$???"??$????$$$$$$$$
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
$$$$$$$$$$$?$???"??$????$$$$$$$$
$$$$12$$$$$$$$$$$$$$$$$$$$$$$$$$
$$$0&43$$$$$$$$$$$$$$$$$$$$$$$$$
$$0&&&&3$$$$???($??????$$$$$$$$$
$0&4&&4&3$$$$$$$$$$$$$$$676767$$
0&&&&&&&&3$$$$$$$$$$$$$5%%%%%%8$
´µ´µ´µ´µ´µ´µ´µ´µ´µ´µ´µ´µ´µ´µ´µ´µ
¶·¶·¶·¶·¶·¶·¶·¶·¶·¶·¶·¶·¶·¶·¶·¶·
´µ´µ´µ´µ´µ´µ´µ´µ´µ´µ´µ´µ´µ´µ´µ´µ
¶·¶·¶·¶·¶·¶·¶·¶·¶·¶·¶·¶·¶·¶·¶·¶·
Attribute table 0:
AA AA AA AA AA AA AA AA 
00 55 55 55 55 55 55 55 
55 55 55 55 55 55 55 55 
55 55 55 55 55 55 55 00 
00 00 99 AA AA AA 00 00 
00 00 00 00 00 00 00 00 
50 50 50 50 50 50 50 50 
05 05 05 05 05 05 05 05 
I'm not sure how to get a reference dump from a known-good emulator; I will have to look into it.
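When comparing the attribute dump against a reference emulator, it can help to decode which background palette each tile ends up with. The following is a minimal sketch, not code from the emulator: the table holds the 64 bytes dumped above, and `PaletteForTile` is a hypothetical helper name. It relies only on the standard attribute layout, where each byte covers a 4x4-tile area and each 2-bit field covers one 2x2-tile quadrant.

```cpp
#include <array>
#include <cstdint>

// Attribute table 0 exactly as dumped above ($23C0-$23FF), 8 bytes per row.
static const std::array<std::uint8_t, 64> kAttr0 = {{
    0xAA, 0xAA, 0xAA, 0xAA, 0xAA, 0xAA, 0xAA, 0xAA,
    0x00, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55,
    0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55,
    0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x00,
    0x00, 0x00, 0x99, 0xAA, 0xAA, 0xAA, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x50, 0x50, 0x50, 0x50, 0x50, 0x50, 0x50, 0x50,
    0x05, 0x05, 0x05, 0x05, 0x05, 0x05, 0x05, 0x05,
}};

// Hypothetical helper: background palette number (0-3) for the tile at
// column tx (0-31) and row ty (0-29) of the nametable.
inline unsigned PaletteForTile(const std::array<std::uint8_t, 64>& attr,
                               unsigned tx, unsigned ty) {
    // One attribute byte per 4x4 tiles, 8 bytes per row of the table.
    const std::uint8_t byte = attr[(ty / 4) * 8 + (tx / 4)];
    // Bit 1 of each coordinate picks the quadrant:
    // shift 0 = top-left, 2 = top-right, 4 = bottom-left, 6 = bottom-right.
    const unsigned shift = ((ty & 2) << 1) | (tx & 2);
    return (byte >> shift) & 3u;
}
```

Running the same helper over a dump from a known-good emulator and diffing the per-tile results points directly at the tiles whose palette differs.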
tepples
Posts: 22345
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

I seem to remember that the Windows version of FCEUX has a hex viewer for ROM, RAM, and VRAM.
thefox
Posts: 3139
Joined: Mon Jan 03, 2005 10:36 am
Location: Tampere, Finland

Re: oddly green

Post by thefox »

Bisqwit wrote:That is, all reads/writes to palette indexes, whether internally or through I/O, are routed through the following map:

 00 => 0   08 => 0     10 => 0    18 => 0
 01 => 1   09 => 9     11 => 11   19 => 19
 02 => 2   0A => A     12 => 12   1A => 1A
 03 => 3   0B => B     13 => 13   1B => 1B
 04 => 0   0C => 0     14 => 0    1C => 0
 05 => 5   0D => D     15 => 15   1D => 1D
 06 => 6   0E => E     16 => 16   1E => 1E
 07 => 7   0F => F     17 => 17   1F => 1F
This is wrong. $4, $8 and $C should not be mirrored down to 0 when reading from or writing to palette memory, only when rendering (if you want to think about it that way). $10, $14, $18 and $1C are mirrors of $0, $4, $8 and $C.

See http://wiki.nesdev.com/w/index.php/PPU_palettes for details.
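The distinction above can be sketched as two separate mappings, one for CPU accesses through $2007 and one for pixel output. This is only an illustration under the rules stated in this post; `PaletteRamIndex` and `PaletteRenderIndex` are hypothetical names, not functions from any emulator discussed here.

```cpp
#include <cstdint>

// Index into the 32-byte palette RAM for a read or write through $2007.
// Only $10/$14/$18/$1C alias down to $00/$04/$08/$0C; $04/$08/$0C
// themselves stay distinct, writable entries.
inline unsigned PaletteRamIndex(unsigned ppu_addr) {
    unsigned i = ppu_addr & 0x1Fu;
    if ((i & 0x13u) == 0x10u) i &= 0x0Fu;
    return i;
}

// Index used when a pixel is drawn: any "color 0" of a palette
// shows the backdrop entry at index 0 instead of its own entry.
inline unsigned PaletteRenderIndex(unsigned pixel) {
    pixel &= 0x1Fu;
    return (pixel & 3u) ? pixel : 0u;
}
```

Keeping the two mappings apart means a value written to $3F04 can still be read back from $3F04, even though it is never displayed during normal rendering.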
Bisqwit
Posts: 248
Joined: Fri Oct 14, 2011 1:09 am

Re: oddly green

Post by Bisqwit »

Thanks for the help, though as I already mentioned on IRC, I got it working.
Turns out my PPU rendering loop was changing the nametable address mid-frame even while background rendering was disabled.
Bisqwit
Posts: 248
Joined: Fri Oct 14, 2011 1:09 am

Post by Bisqwit »

I have another problem... My emulator causes the game to crash.
This nice 187-kilobyte animated screenshot illustrates the problem. Repeated frames were removed from the first part so that the intro reaches the action sooner.
Image
When the mushroom appears, the game crashes. I tried also running a TAS on it, and while the TAS (where it synced) did not invoke the mushroom, the game still crashed around the same spot.

(The image was stitched into a form that avoids global motion with my tool called "animmerger"; this makes the GIF smaller. However, it may appear in the end as if the screen jerks forward. This did not happen; the stitcher was just confused by the HUD suddenly disappearing as a result of the game's crash.)

I thought it would be sprite-0-hit related, but my emulator passes all Blargg's sprite hit tests... Yet the game still crashes.

This is interesting, because e.g. Rockman 1 plays just fine (and syncs with the TAS exactly as long as the real console does).
tepples
Posts: 22345
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

SMB1 tends to crash if sprite 0 doesn't line up. Check all writes to $2000 and $2006 to make sure rendering starts at the correct nametable.
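For checking those writes, a small model of the internal address registers is handy, because $2000 and $2006 both touch the nametable-select bits. The sketch below assumes the commonly documented "loopy" register behaviour (a temporary address t, a current address v, and a shared write toggle w); the struct and method names are hypothetical and $2005 is left out for brevity.

```cpp
#include <cstdint>

struct ScrollRegsSketch {
    std::uint16_t t = 0;  // temporary VRAM address (holds nametable select in bits 10-11)
    std::uint16_t v = 0;  // current VRAM address
    bool w = false;       // first/second write toggle shared by $2005 and $2006

    // $2000 write: the low two bits select the nametable in t.
    void Write2000(std::uint8_t value) {
        t = static_cast<std::uint16_t>((t & 0xF3FFu) | ((value & 0x03u) << 10));
    }

    // $2006 write: high byte first (6 bits), then low byte; v is loaded on the second write.
    void Write2006(std::uint8_t value) {
        if (!w) {
            t = static_cast<std::uint16_t>((t & 0x00FFu) | ((value & 0x3Fu) << 8));
        } else {
            t = static_cast<std::uint16_t>((t & 0xFF00u) | value);
            v = t;
        }
        w = !w;
    }

    // $2002 read resets the write toggle.
    void Read2002() { w = false; }

    // Nametable currently selected in t (0-3).
    unsigned NametableSelect() const { return (t >> 10) & 3u; }
};
```

Logging `NametableSelect()` after every $2000 and $2006 write, in both the emulator under test and a reference, shows whether a $2006 write has left the wrong nametable selected when rendering resumes.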
beannaich
Posts: 207
Joined: Wed Mar 31, 2010 12:40 pm

Post by beannaich »

I'd also make sure you check the contents of your OAM and $2003; sprite 0 reordering might also be causing SMB to hang.

Do you pass the sprite overflow tests?
Bisqwit
Posts: 248
Joined: Fri Oct 14, 2011 1:09 am

Post by Bisqwit »

beannaich wrote:I'd also make sure you check the contents of your OAM and $2003; sprite 0 reordering might also be causing SMB to hang.

Do you pass the sprite overflow tests?
Yes, all OAM related tests. Thanks for the suggestions, I'll look into them.
3gengames
Formerly 65024U
Posts: 2281
Joined: Sat Mar 27, 2010 12:57 pm

Post by 3gengames »

Are you clearing the sprite 0 hit only when rendering begins? I'd expect the test ROMs to fail for that, but you never know. Good luck.
Bisqwit
Posts: 248
Joined: Fri Oct 14, 2011 1:09 am

Post by Bisqwit »

3gengames wrote:Are you clearing the sprite 0 hit only when rendering begins? I'd expect the test ROMs to fail for that, but you never know. Good luck.
I initially cleared it at the same time as I clear the vblank flag, but in order to make the "sprite_hit_tests_2005.10.05" test "09.timing_basics" test "9) Cleared at end of VBL too late" pass and not fail, I changed it into cycle 340 of the last vblank line: 2 cycles before the vblank flag is cleared, 1 cycle before the pre-render scanline begins.
The vblank clearing time was established to be 1 cycle after the end of vblank in order to make "ppu_vbl_nmi" test "03-vbl_clear_time" pass. (The CPU emulator passes all timing tests, including the cyclewise disassembly trace of nestest.nes, so it's not that the tests get wrong timing.)
Changing the clearing time did not affect the game either way.

The sprite hit flag is not cleared at any other time. It is also not cleared by a read of any port.

The OAM address (which is used for the sprite-prescan for next scanline at ppu scanline cycles 0..255) is cleared at cycle 0. In addition, at the processing of sprite 2 it is set to 8 as is done by nestopia. (Both of these only happen if sprite rendering is enabled.)

From the selection of "nestest", "instr_test-v3", "instr_misc", "branch_timing_tests", "cpu_timing_testv6", "oam_read", "oam_stress", "ppu_vbl_test", "ppu_open_bus", "sprite_hit_tests_2005.10.05", "sprite_hit_timing", my emulator currently fails only two tests:
-- "ppu_vbl_nmi" test "07-nmi_on_timing": I get two N lines rather than 5.
-- "instr_misc" test "04-dummy_reads_apu": APU is not implemented yet.
In addition, "ppu_sprite_overflow" seems to produce a number of fails, curiously, including an unexplained complaint about wrong VBL timing (despite the passing of ppu_vbl_test).

Here is how I do Vblank and NMI currently:

CPU:
  - All memory accesses are synchronous with the PPU: every memory write and memory read incurs three PPU cycles immediately before the I/O is performed, regardless of the type of memory accessed. The same goes for the extra tick() calls that certain opcodes need for proper timing.
  - At the beginning of opcode fetch (before the opcode is fetched), the NMI line is checked and saved into a variable.
  - After the opcode is fetched (and the PPU has been allowed to run for 3 cycles), the just-saved NMI variable is checked. If a rising edge was detected (i.e. it was up and it was not up the last time it was checked), the fetched opcode is discarded and replaced with BRK. NMI processing begins. (Though a BRK opcode is processed, special conditions ensure that the vector is loaded from $FFFA and that the flags pushed are ORed with #$20 rather than with #$30. The return address pushed to the stack is also calculated properly for NMI.)
PPU (the following operations are tested/performed in the listing order):
  - At the beginning of every cycle, a bitwise AND of the NMI enable flag ($2000 bit 7) and the Vblank flag ($2002 bit 7) is pushed into an internal queue of NMI states. The third element of the queue is popped, and assigned to the NMI line polled by the CPU. This ensures that the CPU always receives the NMI flag at a two (or three?) PPU cycle delay. Doing the pushing before the next step also ensures that the "06-suppression" test passes, among others.
  - At the beginning of every cycle, an internal variable called VBlankState is checked. If it was 1, the VBlank flag ($2002 bit 7) is set. If it was -1, the $2002 register is set to #$00 (which clears the VBlank flag). After these tests, VBlankState is set to 0.
  - At the beginning of the 0th cycle of the 241st scanline (the first vblank scanline, after the one idle waste scanline that follows the rendering), VBlankState is set to 1. This is the internal flag.
  - At the beginning of the 0th cycle of the -1th scanline (pre-render scanline), VBlankState is set to -1. This is the internal flag.
  - When $2000 is written to (by the CPU), no special processing happens aside from storing to the register.
  - When $2002 is read from (by the CPU), the VBlank flag is cleared. If VBlankState happened to be 1, it is also cleared.
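The PPU steps listed above can be sketched as a single per-cycle function. This is one possible reading of the description, not the real source: the queue is modelled as a three-entry shift register whose oldest entry drives the NMI line (a two-cycle delay), the odd-frame cycle skip is ignored, and all names are hypothetical.

```cpp
#include <array>
#include <cstdint>

struct PpuVblSketch {
    bool nmi_enable = false;            // $2000 bit 7
    std::uint8_t status = 0;            // $2002
    int vblank_state = 0;               // +1: set flag next cycle, -1: clear $2002 next cycle
    std::array<bool, 3> nmi_queue{{false, false, false}};
    bool nmi_line = false;              // what the CPU polls
    int scanline = 0;                   // -1 (pre-render) .. 260
    int cycle = 0;                      // 0 .. 340

    void Tick() {
        // 1) Push the current NMI level; the oldest of three entries drives the line.
        const bool level = nmi_enable && (status & 0x80) != 0;
        nmi_queue[2] = nmi_queue[1];
        nmi_queue[1] = nmi_queue[0];
        nmi_queue[0] = level;
        nmi_line = nmi_queue[2];

        // 2) Apply the change scheduled on the previous cycle.
        if (vblank_state == 1) status |= 0x80;
        else if (vblank_state == -1) status = 0x00;
        vblank_state = 0;

        // 3) and 4) Schedule the flag change at the start of the relevant scanlines.
        if (cycle == 0) {
            if (scanline == 241) vblank_state = 1;
            else if (scanline == -1) vblank_state = -1;
        }

        // Advance the beam position (262 scanlines of 341 cycles).
        if (++cycle == 341) {
            cycle = 0;
            if (++scanline == 261) scanline = -1;
        }
    }

    // $2002 read: returns the old value, clears the flag and any pending set.
    std::uint8_t ReadStatus() {
        const std::uint8_t value = status;
        status &= 0x7F;
        if (vblank_state == 1) vblank_state = 0;
        return value;
    }
};
```

In this model, a $2002 read that lands on the cycle where the set is still pending both returns a clear flag and suppresses the set, which is the behaviour the suppression test looks for.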
A particular aspect of this design is that I do not use cycle counters / deadlines (i.e. "run until N cycles / if cycles < 100, then return"). Everything happens completely synchronously. Well, as synchronously as it can happen. The cycles for the CPU and PPU are still interleaved. A read from $2000 causes 3 PPU cycles, followed by an access to the PPU register, rather than the read coinciding with the PPU code. During the processing of an LDA $2000, the PPU runs for 12 cycles in total. The PPU register access happens right after the 12th PPU cycle. (I could not find a test that tells which cycle the access should happen on.)
Can someone point out what exactly I am doing wrong that causes the two NMI and VBL related tests to fail? (In addition to possible further hints towards solving the Mario crash.)
Near
Founder of higan project
Posts: 1553
Joined: Mon Mar 27, 2006 5:23 pm

Post by Near »

Bisqwit wrote:During the processing of an LDA $2000, the PPU runs for 12 cycles in total. The PPU register access happens right after the 12th PPU cycle. (I could not find a test that tells which cycle the access should happen on.)
I've noticed the significant omission of this data from all documents. It's quite annoying.

When a device reads from or writes to another device, it requires time to pass before the read/write actually occurs. When two devices are supposed to do something at the exact same time, either a conflict occurs or one takes priority. This information is completely missing in NES documentation.

On the SNES, each CPU cycle is 6, 8, or 12 master clocks long. Reads against the PPU happen at total_clocks-4, and writes at total_clocks (i.e. after the PPU has run for the same amount of time as our opcode).

Internally, the behavior is that the data is there the entire time, but has to be sitting on the bus with /RD or /WR for the right amount of time before it is acted upon.

Right now, my best guess for NES is that, assuming all chips are at an equal time, CPU > APU > PPU. And a CPU read/write happens before PPU runs. If CPU accesses PPU $2007 during rendering, who the hell knows what happens. It's guessed that it will read/write whatever the PPU fetched last, but it's never explained.
Bisqwit
Posts: 248
Joined: Fri Oct 14, 2011 1:09 am

Post by Bisqwit »

http://bisqwit.iki.fi/src/nesemu1_vbl_test_skeleton.cc
Here is a link to my V-Blank / NMI timing test skeleton, stripped of all features not related to V-Blank / NMI timing testing (370 lines remain). It can be used to run Blargg's tests. Note that it does not include any graphical / audio output. It outputs only to the console. Lacking any mapper functions, it only supports the "rom_singles" versions.

byuu, changing the tick() to occur _after_ read() or write() requires changing the NMI delay buffer length from 3 elements to 6 elements to keep the test pass rate from getting worse. I find this unlikely to be correct...
beannaich
Posts: 207
Joined: Wed Mar 31, 2010 12:40 pm

Post by beannaich »

Just from the sounds of it, I think there is a small timing error somewhere in your PPU. I clear the entire contents of $2002 at the beginning of scanline -1 (dot 0), and I pass all the relevant PPU tests.

One thing I was doing wrong was in my $4014 handler: after I added the cycles for sprite DMA (513 CPU cycles), I didn't catch the PPU up immediately. This caused an error of a few dots that was driving me crazy trying to figure out.

How are you handling $4014?
Bisqwit
Posts: 248
Joined: Fri Oct 14, 2011 1:09 am

Post by Bisqwit »

beannaich wrote:How are you handling $4014?
In the way shown above. When a write to $4014 is encountered, 256 reads and writes will be issued, each consuming one cpu tick (three ppu ticks). The write() call will therefore last 256*2+1 = 513 cpu cycles total (instead of the normal 1 cpu cycle), plus the additional time required by the opcode (opcode and operand fetches (3 cycles), possible indexing and possible misfiring (2 cycles)). These cycles are also done synchronously with the PPU. So no, that is not the reason either.
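The cycle count described above can be illustrated with a small sketch. It is not the emulator's actual code: the struct and member names are hypothetical, only the 2 KiB of internal RAM is modelled as a DMA source, and OAMADDR is assumed to be 0 when the transfer starts.

```cpp
#include <array>
#include <cstdint>
#include <vector>

struct OamDmaSketch {
    std::vector<std::uint8_t> ram = std::vector<std::uint8_t>(0x800);
    std::array<std::uint8_t, 256> oam{};
    unsigned long cpu_cycles = 0;
    unsigned long ppu_ticks = 0;

    // One CPU cycle always advances the PPU by three cycles.
    void Tick() { ++cpu_cycles; ppu_ticks += 3; }

    std::uint8_t Read(std::uint16_t addr) { Tick(); return ram[addr & 0x7FF]; }
    void WriteOam(unsigned index, std::uint8_t value) { Tick(); oam[index & 0xFF] = value; }

    // $4014 write: 1 cycle for the write itself, then 256 read/write pairs.
    void Write4014(std::uint8_t page) {
        Tick();
        for (unsigned i = 0; i < 256; ++i) {
            const std::uint8_t value = Read(static_cast<std::uint16_t>((page << 8) | i));
            WriteOam(i, value);
        }
    }
};
```

Because every access goes through `Tick()`, the PPU has already advanced 513 * 3 = 1539 cycles by the time `Write4014` returns, so there is no separate catch-up step to forget.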

Can you look at the source code I provided (or just the algorithm description in the preceding post) and point out where the timing error is?