Page 2 of 2
Posted: Mon Jul 13, 2009 12:58 am
by Gekko
For #2..., I have the following estimations/exact numbers in terms of master clock cycles. (I figured them out from the numbers
here.)
Scanline: 1024
H-blank: 340
V-blank: 586 (31958 on 256x239 and 52418 on 256x224)
Correct me if I'm wrong, of course.
Posted: Mon Jul 13, 2009 5:40 pm
by Near
All scanlines are 1364 cycles long, with one exception:
NTSC-only with interlace off on scanline 240 on an odd field ($213f.d7=1) is only 1360 clock cycles. It has something to do with the NTSC color burst, I don't know exactly, but it really is short, crazy as it sounds.
Also, 40 clocks from every scanline are spent on DRAM refresh, so you effectively can only execute 1324/1320 clocks per scanline.
DRAM refresh occurs at H=530 on CPU revision 1, and it alternates between (starting at) H=538 and H=534 on each subsequent scanline (except that scanline that's only 1360 clocks, it's the same on that scanline as the scanline after that). DRAM refresh will supercede everything on the S-CPU side, even DMA.
NTSC has 262 scanlines per field, PAL has 312. Interlace even fields have 263 or 313 scanlines, respectively.
I know that (1364*262*60)-4 != exactly 315/88*1,000,000 (eg 21.477MHz). It's apparently close enough for TVs to display the image though.
Like I said, I'd really appreciate it if someone could run an oscilloscope on the S-CPUs of both NTSC and PAL consoles.
Posted: Mon Jul 13, 2009 9:03 pm
by tepples
byuu wrote:All scanlines are 1364 cycles long, with one exception:
NTSC-only with interlace off on scanline 240 on an odd field ($213f.d7=1) is only 1360 clock cycles. It has something to do with the NTSC color burst, I don't know exactly, but it really is short, crazy as it sounds.
The NTSC NES has similar behavior. Ordinarily, the PPU runs four master clock cycles per dot and 341 dots per scanline. But on every other field, if rendering is turned on, it skips one dot near the end of the "pre-render" scanline (y=-1). This dot is
not skipped in games like Battletoads that leave rendering disabled during the pre-render scanline.
Posted: Mon Jul 13, 2009 10:19 pm
by koitsu
Re: cycles and timing: I posted some details in the below thread which might be helpful, but byuu's mostly covered them. :)
http://nesdev.com/bbs/viewtopic.php?t=5367&start=15
Posted: Mon Jul 13, 2009 10:43 pm
by Near
This dot is not skipped in games like Battletoads that leave rendering disabled during the pre-render scanline.
Since you mentioned that, I should add that I've tested to see if that was the case with the SNES as well. It is not, that dot is always skipped no matter what. And to get more complicated, the two 'long dots' (6 clock cycles each instead of 4) of the PPU counters aren't 'long' anymore, so you still see dots 0-340 when you latch the counters. IRQs are based off the S-CPU's own counter, so they of course don't have any understanding of the long dots. Isn't the SNES fun? ;)
I believe blargg mentioned that on the NES Battletoads, that it messed with the 3-scanline 'dot crawl' effect of the video output. I could be mistaken, however.
Number of clock cycles per pixel: MODEs 5,6 == 2 cycles, MODEs 1-4,7 == 4 cycles
I see what you're going for, but I'm not sure that's what's really happening.
My belief is that both normal and hires (and indeed pseudo-hires) are one and the same. The differences are just in the frequency of tile fetches, and how the main and sub screens blend together to produce the output.
I'm afraid to test, but my theory is that it's possible to toggle pseudo-hires mid-scanline. If so, that will pretty much force internal rendering to happen at 512-width for the sake of all that is sane. Unless of course you want to write an HQ2x filter that can blend a screen where every single line is potentially a different width ;)
Of course, it's largely pointless to know for sure since it's theoretical. The analog output of the chip cannot be tested or emulated, at least to the extent that there is absolutely no reason to do so, as the difference cannot be observed by an SNES program.
Posted: Tue Jul 14, 2009 1:51 am
by Gekko
I thought that the pixels outside of the visible scanline counted as H-blank and the scanlines outside of the actual viewing range were counted as V-blank. Again, correct me if I'm wrong. (I forgot about DRAM refresh. I'm assuming that it would would cut 40 cycles out of H-blank and V-blank, though.) If I'm wrong, though, I think that would leave H-blank and V-blank with very few cycles.
Also, another question:
6) What is the clock speed of the Cx4 chip? How many cycles does it take to use each instruction?
Posted: Tue Jul 14, 2009 2:51 am
by koitsu
byuu wrote:Number of clock cycles per pixel: MODEs 5,6 == 2 cycles, MODEs 1-4,7 == 4 cycles
I see what you're going for, but I'm not sure that's what's really happening.
My belief is that both normal and hires (and indeed pseudo-hires) are one and the same. The differences are just in the frequency of tile fetches, and how the main and sub screens blend together to produce the output.
I'm afraid to test, but my theory is that it's possible to toggle pseudo-hires mid-scanline. If so, that will pretty much force internal rendering to happen at 512-width for the sake of all that is sane. Unless of course you want to write an HQ2x filter that can blend a screen where every single line is potentially a different width ;)
Of course, it's largely pointless to know for sure since it's theoretical. The analog output of the chip cannot be tested or emulated, at least to the extent that there is absolutely no reason to do so, as the difference cannot be observed by an SNES program.
My numbers come directly from the official developers manual, so if they're wrong, you know where to send complaints to. :)
Posted: Wed Jul 22, 2009 9:45 pm
by Gekko
The NES's V-blank is 6810 Ricoh 5A22 master cycles. I still need to know if that's consistent with the SNES's V-blank and what it would be for H-blank. (Also, this means that overscan is not part of H/V-blank.)
And now, for another question: Would using an indirect long mode that uses $2180 as its destination (i.e. LDA [$80] if the DP were $2100), would you get what would be the equivalent of [$[2181]] or would would it include $2181 and $2182 as the intermediate and high bytes, respectively?
Posted: Thu Jul 23, 2009 6:21 pm
by Near
I already mentioned exactly how many cycles per scanline and scanlines per frame there were. This was based off the 21MHz clock (315/88*6m). To get your numbers, just divide.
If you lda [$80] with D=2100, you will end up fetching whatever byte is at the address pointed to by $2181-2183, and then that value will be repeated for the next two reads. So say $2181-3 point to $7e0123, which contains #$5a. You'll end up reading from $5a5a5a. This is because $2181-3 are write-only. Reads return open bus, or the MDR (memory data register.) That register will be set upon the $2180 port read.
Posted: Thu Jul 23, 2009 7:16 pm
by Gekko
I know how to calculate what you think I'm trying to calculate.
What I'm actually trying to calculate is the length of H-blank and V-blank (in cycles). I am not trying to calculate the length of each scanline, pixel or whatever. I know that H-blank and V-blank are not overscan, so my previous numbers were wrong. In addition, that also means that they are not considered by your formula or they are in a part of a scanline, where I do not know the length (and the H-blanking and V-blanking (moving the opposite way) themselves are not actually calculated).
Oh right, they are open bus.... Well, at least there is an easy around not being able to use it as 16-bit/24-bit. Thank you!
Posted: Fri Jul 24, 2009 2:46 pm
by Near
What I'm actually trying to calculate is the length of H-blank and V-blank (in cycles).
Have you tried looking at emu sources for your answers? Eg:
src/cpu/scpu/mmio/mmio.cpp:mmio_r4212() states that 0-3 + 1096-1363 (1359 short scanline) are hblank, and lines 225-261 (262 interlace) are vblank. If overscan is enabled, Vblank starts at V=240.
272 clocks for 99% of Hblanks, 268 clocks for NTSC non-interlace scanline 240 on odd fields.
NTSC Vblank for non-interlace even: 50468 clocks.
NTSC Vblank for non-interlace odd: 50464 clocks.
NTSC Vblank for interlace even: 51832 clocks.
NTSC Vblank for interlace odd: 50468 clocks.
PAL has 312/313 scanlines, and it's a tossup whether overscan is on or not for these games.
PAL Vblank with overscan on non-interlace even: 30008 clocks.
PAL Vblank with overscan off non-interlace even: 118668 clocks.
Should be able to calculate the rest from here.
Posted: Fri Jul 24, 2009 3:02 pm
by tepples
byuu wrote:What I'm actually trying to calculate is the length of H-blank and V-blank (in cycles).
Have you tried looking at emu sources for your answers?
I thought we all learned not to do that near the end of the Nesticle era.
Posted: Fri Jul 24, 2009 5:14 pm
by Gekko
Ah, thank you!
Also, I would have but I don't know much C/C++ and since I can't find anything in x86 I'd have to use a disassembler which would take even longer since it would not only be unlikely for it to be disorganized, due to the way compilers work but it would also have a lot of junk bytes (if you don't believe me, look
here) and a lack of comments, making it very difficult. (Not to mention that I'm not extremely well acquainted with x86, anyway. Just more so than C/C++.) Well, I guess I'll try to learn C/C++ more quickly so I can hope to understand the data in emulator source code soon. Sorry for troubling you!
I guess this thread can be closed or whatever you do to threads that have served their purpose.
Posted: Fri Jul 24, 2009 7:34 pm
by Near
No trouble, just figure the sources would be quicker than waiting on us is all.
Posted: Thu Jul 30, 2009 10:10 pm
by Gekko
Oh, joy! I can't read it at all. I even tried BSNES's source code but I still got nothing out of it. Oh well!
Anyway, here is another question:
How does offset-per-tile mode work (particularly mode 6, although I presume that mode 2 is the same)? I know that you specify locations with layer 3 registers and the offset is for columns, not tiles. However, I want to know what gets displayed when you place one column of tiles on top of another one? In other words, if the offsets were all zero aside from one of them, other than the first, which was $10 (or would it be 8, since it uses half pixels?), what would show up on the column to the right of the normal position of the previously mentioned column?