Espozo wrote:
I mean, I guess I'm not one to talk as I don't know enough about the 68000, but that seems a little exaggerated? In the case of an actual game from the time period, not when you're trying to do software 3D rendering which makes use of 32 bit operations and multiplication/division (Even the video hardware on the SNES is less suited for 3D due to the graphics format).
....
Now uneven. But as I said, it's not always needed, so in the case where you're only doing 32 bit moves, it's definitely faster. However, you could also make a program that only does rts's.
I honestly just want to know why you feel the 68000 is 2x as fast as a 65816 at half the clock frequency, as you've said you've worked with both (and I've seen you code for the 68000).
As you may notice, I said:
I roughly estimate the 7.67 MHz 68000 to be almost twice as fast as the 3.1 MHz 65816 (so with fast ROM).
So I didn't say the 68000 is twice as fast as a 65816 working at half the frequency (which would mean the 68000 is as fast as a 65816 running at the same frequency). I said *almost* twice as fast as a 3.1 MHz 65816. Again, that is pure estimation, but I think a 7.67 MHz 68000 is equivalent to a ~5.5 MHz 65816.
And now I can explain why that is my estimation. If you really want to push these CPUs to their maximum, you will always tend to unroll your code and use very simple instructions, so that the bottlenecks execute in an optimal way.
A good example is psycopathicteen's sprite rotation code, but there are tons of other possible examples (polygon fill, unpacking code, collision checks...). At that point you get closer and closer to the maximum "data processing rate" of these CPUs, where you can almost reduce everything to a sort of "read / modify / write" operation.
In which case you end up with something like this:
Code:
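move.l (a0)+,d0    ; read 32 bits from the source (12 cycles)
add.l  d1,d0       ; modify (8 cycles)
move.l d0,(a1)+    ; write 32 bits to the destination (12 cycles)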
for the 68000, which gives you 12+8+12 = 32 cycles to process 32 bits (1 cycle per bit, practical :p)
and for the 65816:
which gives 5+3+6 = 14 cycles to process 16 bits (a bit less than 1 cycle per bit).
Of course you have to consider this a very rough estimation, but it still gives an idea and I think you understand the point. The 68000 also gives you some advantages, as you don't have to deal with page crossing / boundary stuff, but the 65816 has other advantages, such as fast branching.
Given these numbers you can see where my estimation comes from. Of course it's not only numbers but also my experience working with many different CPUs.
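To make the arithmetic explicit, here is a quick Python sketch that turns the cycle counts above into a throughput comparison. It is only an illustration: the function and variable names are made up for this example, the clock values are the ones quoted in this post, and nothing here is measured on real hardware.

```python
# Rough throughput estimate from the cycle counts quoted in this post.
# All names are illustrative; clock figures come from the post, not from measurement.

def bits_per_second(clock_hz, cycles, bits):
    """Bits processed per second by one read/modify/write sequence."""
    return clock_hz * bits / cycles

M68K_CLOCK = 7.67e6     # 68000 clock in Hz
W65816_CLOCK = 3.1e6    # 65816 clock in Hz with fast ROM (figure used in this post)

m68k = bits_per_second(M68K_CLOCK, cycles=32, bits=32)      # 32 cycles per 32 bits
w65816 = bits_per_second(W65816_CLOCK, cycles=14, bits=16)  # 14 cycles per 16 bits

# How much faster the 68000 is on this one sequence
ratio = m68k / w65816

# 65816 clock that would match the 68000 on this one sequence
equivalent_clock = M68K_CLOCK * 14 / 16

print(f"68000: {m68k / 1e6:.2f} Mbit/s")
print(f"65816: {w65816 / 1e6:.2f} Mbit/s")
print(f"ratio: {ratio:.2f}")
print(f"equivalent 65816 clock: {equivalent_clock / 1e6:.2f} MHz")
```

Taken alone, this sequence puts the ratio a little above 2 and the equivalent 65816 clock near 6.7 MHz. That is higher than the ~5.5 MHz estimate, which is not based on these numbers alone.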
Even if the SNES is literally half the speed of the Genesis, if it's easy to get 80 sprites on the Genesis, then it should be pretty easy to get 40 sprites on the SNES (and it is), but for some reason tons of programmers have problems moving more than 4 or 5 sprites. It's like there's a book or something teaching people to program the SNES in a very discreet method that limits the programmer to 4 or 5 sprites.
I totally agree with that, and I think your Alesha demo is a good example. Still, you have to consider how much time you spent on optimization: you can't ask every developer to push optimization to that level at every stage of development, as it is very time consuming (and so very costly). Also, just having a sprite engine that handles metasprites and resource allocation and is flexible enough to handle every kind of sprite is definitely not that easy. It becomes really hard when you also want it to be really fast and to handle many sprites at once. In SGDK my current sprite engine is very slow and you can observe slowdown with only 10 sprites (and that is on MD)! OK, it's written in C and I'm doing resource allocation in a very lazy (and slow) way, but I really understand that some games used that kind of engine because they did not have time to write a better one. I'm currently rewriting my sprite engine (still in C) to get better performance, but it has become very complex, and it's quite difficult for me to offer a good, flexible engine and good performance at the same time :-/