PPU timing problem with Donkey Kong (SOLVED)

radicant
Posts: 5
Joined: Wed Aug 03, 2022 12:49 pm

PPU timing problem with Donkey Kong (SOLVED)

Post by radicant »

My CPU code passes nestest.nes and if I do direct lookups in the pattern table then the DK loading screen and demo screen backgrounds are exactly right. However, moving on to implementing the frame timing diagram on the wiki results in stuff like this:
[Attached screenshot: Screen Shot 2022-08-09 at 10.38.05 PM.png]
I tried copying the PPU clock, register write, and register read code from another working emulator and got almost the same result (minor palette differences), so I think it's something with the CPU-PPU coordination.

I'm guessing that there could be a vblank/NMI problem where the game modifies the Loopy registers during visible scanlines, but I don't think I've found that yet.

Any other things to check? Thanks!
Last edited by radicant on Thu Aug 11, 2022 6:47 pm, edited 1 time in total.
lidnariq
Posts: 11432
Joined: Sun Apr 13, 2008 11:12 am

Re: PPU timing problem with Donkey Kong

Post by lidnariq »

That's the kind of bug I expect to see in hardware, not software...

Somehow you're not reading from the address byte you think you are (the bottom 5 bits are good, the rest aren't), or you're not writing to the address byte you think you are. You're displaying the 14th, 18th, 26th, and ??th rows of the nametable, repeated three times. Are you implementing nametables as four separate 32x30 nametables, or one huge 64x60 one? If you just display a dump of the nametable memory, what does it hold?
radicant
Posts: 5
Joined: Wed Aug 03, 2022 12:49 pm

Re: PPU timing problem with Donkey Kong

Post by radicant »

I've implemented the nametables as one big 2048-byte chunk. The data in them is correct because I can read them and build the sprites directly (no PPU emulation) and the screen is right. Things go wrong once I start emulating the PPU clock cycles.

I found one dumb issue that surprisingly didn't change things at all: I had the 3:1 PPU:CPU clock ratio going, but I didn't let the PPU tick while the CPU cycles were happening. Essentially every CPU instruction acted like "one cycle."
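For reference, the usual way to keep the 3:1 ratio is to let the PPU advance three dots for every CPU cycle an instruction actually consumed. This is only a minimal sketch: the `Ppu` struct, `ppu_tick`, and `ppu_catch_up` are hypothetical names, the NTSC frame shape (341 dots by 262 scanlines) is assumed, and the odd-frame skipped dot is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical PPU state for this sketch: only the position counters. */
typedef struct {
    int dot;          /* 0..340 within a scanline */
    int scanline;     /* 0..261 within an NTSC frame */
    uint64_t frames;  /* completed frames */
} Ppu;

/* Advance the PPU by one dot. Fetching and rendering would hook in here. */
static void ppu_tick(Ppu *ppu) {
    if (++ppu->dot > 340) {
        ppu->dot = 0;
        if (++ppu->scanline > 261) {
            ppu->scanline = 0;
            ppu->frames++;
        }
    }
}

/* Run the PPU for all cycles the last CPU instruction took (3 dots each). */
static void ppu_catch_up(Ppu *ppu, unsigned cpu_cycles) {
    assert(ppu != NULL);
    for (unsigned i = 0; i < cpu_cycles * 3u; i++) {
        ppu_tick(ppu);
    }
}
```

The CPU loop would call `ppu_catch_up(&ppu, cycles)` with the cycle count returned by each executed instruction, rather than ticking the PPU a fixed amount per instruction.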

The first scanline is starting at ~$2040 as the VRAM address which is two tiles down vertically, so that explains part of the problem. I imagine DK should start at $2000 or $2400.
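The "two tiles down" reading comes from the nametable address layout: each nametable is $400 bytes, each tile row is 32 bytes, so $2040 is row 2, column 0 of the first nametable. A small sketch of that arithmetic, with `TilePos` and `tile_pos_from_address` as made-up names and mirroring ignored:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: which nametable, and the tile row/column in it. */
typedef struct {
    unsigned table;  /* 0..3 */
    unsigned row;    /* 0..29 are tiles; 30..31 fall in the attribute table */
    unsigned col;    /* 0..31 */
} TilePos;

/* Split a PPU address in $2000-$2FFF into nametable, row, and column. */
static TilePos tile_pos_from_address(uint16_t addr) {
    assert(addr >= 0x2000 && addr <= 0x2FFF);
    uint16_t offset = addr & 0x0FFF;   /* drop the $2000 base */
    TilePos p;
    p.table = offset >> 10;            /* $400 bytes per nametable */
    p.row   = (offset >> 5) & 0x1F;    /* 32 bytes per tile row */
    p.col   = offset & 0x1F;
    return p;
}
```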
radicant
Posts: 5
Joined: Wed Aug 03, 2022 12:49 pm

Re: PPU timing problem with Donkey Kong

Post by radicant »

Wow, can you see the problem?

    typedef union {
        u16 raw;
        struct {
            u8 coarseX         : 5;
            u8 coarseY         : 5;
            u8 nametableX      : 1;
            u8 nametableY      : 1;
            u8 fineYScroll     : 3;
            u8 _               : 1;
        };
    } PPUScrollRegister;
The bitfields need to be u16...
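Spelled out, the fix is to give every bit-field a 16-bit declared type so all the fields can share one 16-bit storage unit with `raw`. This is a sketch under assumptions: `u16` is taken to be `uint16_t` (the typedef is not shown in the thread), and the layout relies on a compiler like GCC or Clang on a little-endian target, which accepts `uint16_t` bit-fields and allocates them from the low bit upward. The C standard leaves both points implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t u16;  /* assumed definition of the thread's u16 */

typedef union {
    u16 raw;
    struct {
        u16 coarseX     : 5;  /* bits 0-4 */
        u16 coarseY     : 5;  /* bits 5-9 */
        u16 nametableX  : 1;  /* bit 10 */
        u16 nametableY  : 1;  /* bit 11 */
        u16 fineYScroll : 3;  /* bits 12-14 */
        u16 _           : 1;  /* bit 15, unused */
    };
} PPUScrollRegister;

/* Fails to compile if the fields spill past the 16 bits shared with raw. */
static_assert(sizeof(PPUScrollRegister) == 2,
              "bit-fields must pack into a single 16-bit unit");
```

With the `u8` version, a field cannot straddle an 8-bit unit on these compilers, so only `coarseX` lines up with the low bits of `raw`. That matches the earlier observation that the bottom 5 bits were good and the rest were not.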
Dwedit
Posts: 4924
Joined: Fri Nov 19, 2004 7:35 pm

Re: PPU timing problem with Donkey Kong (SOLVED)

Post by Dwedit »

8 bit address bus HYPE
Ben Boldt
Posts: 1149
Joined: Tue Mar 22, 2016 8:27 pm
Location: Minnesota, USA

Re: PPU timing problem with Donkey Kong

Post by Ben Boldt »

radicant wrote: Thu Aug 11, 2022 6:47 pm

    typedef union {
        u16 raw;
        struct {
            u8 coarseX         : 5;
            u8 coarseY         : 5;
            u8 nametableX      : 1;
            u8 nametableY      : 1;
            u8 fineYScroll     : 3;
            u8 _               : 1;
        };
    } PPUScrollRegister;
The bitfields need to be u16...
Can anyone explain technically why this is? None of the groups of bits is individually wider than 8 bits, so I am not clear on the actual reason why that doesn't work.

I have also heard in some ancient coding standard that bit order (or possibly byte order) is not guaranteed the same for all compilers/processors when breaking down bitfields this way. Is that actually true?

edit:
Is the 'u8' in the example trying to control how the bits pack? For example, it puts coarseX into one u8. That uses 5 bits, leaving 3 bits, so the 5-bit coarseY has to go into the next u8. With 3 bits left again, the next two 1-bit fields do fit into that same byte, leaving 1 bit. Then fineYScroll (3 bits) again doesn't fit, so it goes to a third byte... Is that how it works?

It's amazing; I use these a lot and never ran into this. I always use "unsigned" out front, which I believe just means "unsigned int", in other words the native word size of the processor. And almost always I use 1-bit flags, so maybe they always pack the same regardless...
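The packing guessed at in the edit above can be checked directly by looking at the bytes. The sketch below assumes GCC or Clang on a common little-endian ABI, where a bit-field may not cross a storage unit of its declared type. Other compilers are free to lay this out differently. `NarrowFields` and `narrow_bytes` are illustrative names only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same field widths as the quoted union, declared with an 8-bit type. */
typedef struct {
    uint8_t coarseX     : 5;  /* byte 0, bits 0-4 (3 bits left over) */
    uint8_t coarseY     : 5;  /* does not fit, so starts byte 1 */
    uint8_t nametableX  : 1;  /* byte 1, bit 5 */
    uint8_t nametableY  : 1;  /* byte 1, bit 6 (1 bit left over) */
    uint8_t fineYScroll : 3;  /* does not fit, so starts byte 2 */
    uint8_t unused      : 1;  /* byte 2, bit 3 */
} NarrowFields;

/* Copy out the object representation so each byte can be inspected. */
static void narrow_bytes(const NarrowFields *f,
                         uint8_t out[sizeof(NarrowFields)]) {
    assert(f != NULL && out != NULL);
    memcpy(out, f, sizeof *f);
}
```

Under those assumptions the struct takes three bytes, so a `u16 raw` member in the same union only overlaps `coarseX`, `coarseY`, and the two nametable bits, and those at shifted positions. That's why `raw` reads back garbage.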
Joe
Posts: 650
Joined: Mon Apr 01, 2013 11:17 pm

Re: PPU timing problem with Donkey Kong

Post by Joe »

Ben Boldt wrote: Sun Aug 14, 2022 4:15 pm

I have also heard in some ancient coding standard that bit order (or possibly byte order) is not guaranteed the same for all compilers/processors when breaking down bitfields this way. Is that actually true?
Yes:
C standard draft N2176 wrote:

An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.
(There's similar language in the C++ standard too.)
Ben Boldt wrote: Sun Aug 14, 2022 4:15 pm

It's amazing; I use these a lot and never ran into this. I always use "unsigned" out front, which I believe just means "unsigned int", in other words the native word size of the processor. And almost always I use 1-bit flags, so maybe they always pack the same regardless...
If you're only using single-bit bitfields, you only have to worry about the order in which the bits are packed, so you probably just haven't run into any implementations that pack them the other way around. (I expect packing order usually corresponds to endianness, and all the popular computers are little-endian nowadays.)
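If the layout has to be the same on every compiler, the usual way around all of this is to skip bit-fields and use explicit shifts and masks on a plain `uint16_t`. A rough sketch, with the bit positions taken from the field widths in the union posted earlier. `Field`, `field_get`, `field_set`, and the macro names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* A field is described by its lowest bit position and its unshifted mask. */
typedef struct {
    unsigned shift;
    unsigned mask;
} Field;

/* Assumed layout: same order and widths as the union's bit-fields. */
#define COARSE_X    ((Field){ 0, 0x1F})
#define COARSE_Y    ((Field){ 5, 0x1F})
#define NAMETABLE_X ((Field){10, 0x01})
#define NAMETABLE_Y ((Field){11, 0x01})
#define FINE_Y      ((Field){12, 0x07})

/* Extract one field from the register value. */
static unsigned field_get(uint16_t reg, Field f) {
    return (reg >> f.shift) & f.mask;
}

/* Return reg with one field replaced; all other bits are preserved. */
static uint16_t field_set(uint16_t reg, Field f, unsigned value) {
    assert((value & ~f.mask) == 0);  /* value must fit in the field */
    return (uint16_t)((reg & ~(f.mask << f.shift)) | (value << f.shift));
}
```

For example, `field_set(0, COARSE_Y, 2)` gives `0x0040` regardless of compiler or endianness, since nothing here depends on how the implementation allocates bit-fields.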