$2002 is a read-only PPU register, and reading it has side effects on the PPU (it clears the vblank flag, for one). That is why Nintendulator's debugger doesn't display it correctly: a read from the debugger could trigger those side effects, and there is no separate side-effect-free read function for the debugger to use.
If the tests are looping on $2002, that would indicate that the test is done and is trying to print its results to the PPU. I'm not sure how that could happen if you don't see any output written at $6000, since that should happen first.
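To make the read-with-side-effects point concrete, here is a minimal sketch of how an emulator could offer both kinds of read. This is not Nintendulator's code: the class and the names `read_status` and `peek_status` are made up for illustration, and the model is simplified (it ignores the open-bus low bits, for example).

```python
class PPU:
    """Hypothetical, simplified model of $2002 (PPUSTATUS) reads only."""

    VBLANK = 0x80

    def __init__(self):
        self.status = 0x00
        self.write_toggle = False  # first/second-write latch shared by $2005/$2006

    def read_status(self):
        """What the CPU sees: return the value, then apply the side effects."""
        value = self.status
        self.status &= ~self.VBLANK & 0xFF  # reading clears the vblank flag
        self.write_toggle = False           # and resets the write latch
        return value

    def peek_status(self):
        """What a debugger would want: same value, no state change."""
        return self.status


ppu = PPU()
ppu.status = PPU.VBLANK
shown_in_debugger = ppu.peek_status()  # flag is still set afterwards
seen_by_cpu = ppu.read_status()        # flag is cleared afterwards
```

With only `read_status` available, a debugger refreshing its display would clear the flag before the game ever saw it, which is presumably why a debugger without a peek function can't show the register reliably.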
edit: actually I went ahead and made a log of test 03 to compare with what you are seeing, and it looks like the test is checking for the existence of a PPU by looping for a few frames and seeing if that flag ever changes. You can ignore differences with Nintendulator on that part of the test and move on to line 27,000 or so of the log. (See why I said you needed a robust diff tool to analyze these?)
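If you don't have a diff tool that copes with logs this size, a small script can do the job. The sketch below assumes both logs have already been brought into the same line format as the trace in this post; `first_divergence`, `cpu_state` and the `skip` parameter are names I made up. It compares only PC, A, X, Y, P and SP, so cycle and scanline differences are ignored.

```python
import re
from itertools import islice

# Matches the register fields of a trace line, e.g. "PC:E374 A:BF ... SP:9C".
FIELD = re.compile(r"\b(PC|A|X|Y|P|SP):([0-9A-Fa-f]+)")


def cpu_state(line):
    """Extract the CPU registers from one trace line as a dict."""
    return dict(FIELD.findall(line))


def first_divergence(mine, reference, skip=0):
    """Return (line_number, my_line, ref_line) for the first mismatch
    after the first `skip` lines, or None if the compared lines agree."""
    pairs = zip(mine, reference)
    for n, (a, b) in enumerate(islice(pairs, skip, None), start=skip + 1):
        if cpu_state(a) != cpu_state(b):
            return n, a.rstrip("\n"), b.rstrip("\n")
    return None


reference = [
    "E373 A0 LDY #$07 PC:E374 A:BF X:8C Y:00 P:27 SP:9C CYC:109 SL:244",
    "E375 84 STY $29 PC:E376 A:BF X:8C Y:07 P:25 SP:9C CYC:115 SL:244",
]
# Second line deliberately altered (Y:00) to simulate an emulator bug.
mine = [reference[0], reference[1].replace("Y:07", "Y:00")]
print(first_divergence(mine, reference))
```

On real logs you would pass two open file objects instead of lists, with `skip` set to roughly 27000 to jump past the PPU detection loop.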
The place where you were actually failing the test is around line 35,000 where this block of code is executed:
Code:
E373 A0 LDY #$07 PC:E374 A:BF X:8C Y:00 P:27 SP:9C CYC:109 SL:244
E375 84 STY $29 PC:E376 A:BF X:8C Y:07 P:25 SP:9C CYC:115 SL:244
E377 B9 LDA $E222,y PC:E378 A:BF X:8C Y:07 P:25 SP:9C CYC:124 SL:244
E37A 8D STA $03A1 PC:E37B A:FF X:8C Y:07 P:A5 SP:9C CYC:136 SL:244
E37D A6 LDX $21 PC:E37E A:FF X:8C Y:07 P:A5 SP:9C CYC:148 SL:244
E37F 9A TXS PC:E380 A:FF X:90 Y:07 P:A5 SP:9C CYC:157 SL:244
E380 B9 LDA $E222,y PC:E381 A:FF X:90 Y:07 P:A5 SP:90 CYC:163 SL:244
E383 8D STA $03A1 PC:E384 A:FF X:90 Y:07 P:A5 SP:90 CYC:175 SL:244
E386 A5 LDA $1D PC:E387 A:FF X:90 Y:07 P:A5 SP:90 CYC:187 SL:244
E388 48 PHA PC:E389 A:00 X:90 Y:07 P:27 SP:90 CYC:196 SL:244
E389 A5 LDA $1E PC:E38A A:00 X:90 Y:07 P:27 SP:8F CYC:205 SL:244
E38B A6 LDX $1F PC:E38C A:FF X:90 Y:07 P:A5 SP:8F CYC:214 SL:244
E38D A4 LDY $20 PC:E38E A:FF X:00 Y:07 P:27 SP:8F CYC:223 SL:244
E38F 28 PLP PC:E390 A:FF X:00 Y:01 P:25 SP:8F CYC:232 SL:244
E390 4C JMP $03A0 PC:E391 A:FF X:00 Y:01 P:20 SP:90 CYC:244 SL:244
03A0 A9 LDA #$FF PC:03A1 A:FF X:00 Y:01 P:20 SP:90 CYC:253 SL:244
03A2 4C JMP $E393 PC:03A3 A:FF X:00 Y:01 P:A0 SP:90 CYC:259 SL:244
E393 08 PHP PC:E394 A:FF X:00 Y:01 P:A0 SP:90 CYC:268 SL:244
Basically, the test copies some code into RAM that was generated earlier in the test, and then jumps to it. As the trace shows, the code at $03A0 is an LDA #$FF followed by a JMP $E393 back to the next portion of the test. So either your CPU can't execute code from RAM, or the wrong data is being copied to this location at some point, which could be caused by a bug thousands of lines earlier in the log.
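Both possibilities can be checked from the memory side. The sketch below is a hypothetical bus model, not code from any real emulator: opcode fetches and data reads share one `read` path, so RAM is executable by construction, and writes to $03A0-$03A4 are logged with the PC that made them so you can see where a wrong byte came from. The `Bus` class, the watch range and the `pc` argument are all my own assumptions.

```python
class Bus:
    """Sketch covering only the 2 KB internal RAM ($0000-$1FFF, mirrored)."""

    def __init__(self, watch_start=0x03A0, watch_end=0x03A4):
        self.ram = bytearray(0x800)
        self.watch_start = watch_start
        self.watch_end = watch_end
        self.write_log = []  # (pc, address, value) for every watched write

    def read(self, addr):
        # Opcode fetches and data reads should both come through here.
        if addr < 0x2000:
            return self.ram[addr & 0x07FF]
        raise NotImplementedError("only RAM is modelled in this sketch")

    def write(self, addr, value, pc=None):
        if addr >= 0x2000:
            raise NotImplementedError("only RAM is modelled in this sketch")
        addr &= 0x07FF
        if self.watch_start <= addr <= self.watch_end:
            self.write_log.append((pc, addr, value))
        self.ram[addr] = value & 0xFF


bus = Bus()
# Bytes of the routine seen in the trace: LDA #$FF ; JMP $E393
for offset, byte in enumerate([0xA9, 0xFF, 0x4C, 0x93, 0xE3]):
    bus.write(0x03A0 + offset, byte)
# The trace shows the STA at $E37A storing A=$FF into $03A1 (the operand byte).
bus.write(0x03A1, 0xFF, pc=0xE37A)

opcode = bus.read(0x03A0)  # what the CPU should fetch after JMP $03A0
```

If your emulator's log of watched writes shows different bytes landing in that range, the PC column tells you which earlier instruction to look at.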