93143 wrote:
byuu wrote: The SNES could have been a beast had they included a NEC uPD7725 with program and data RAM (for per-game upload of firmware) instead of ROM.
I've looked up the datasheet for the μPD77C/P25, and now I'm wondering why the DSP-1 took so long to do stuff. According to the datasheet a 16x16 signed multiply is just one of several things that can all happen in one cycle, but the SNES manual lists that same multiply as taking 26 cycles. The datasheet says 2.58 μs for a sin/cos, but the SNES manual says 7.8 μs at about the same clock speed. Is there really that much overhead involved in getting this chip to do something on demand?
Yes, yes there is. The uPD7725 has no way of implementing anything like a jump table. A significant portion of the program ROM in the DSP-1 is dedicated to command decoding, which has to be done with a tree of test-and-branches on each bit of the command in turn. That's why the DSP-1 has so many mirrored commands, and the more important commands have more mirrors than the less important ones.
ETA: Here's what command decoding in the DSP-1B looks like (comments added, obviously). You can count for yourself how many cycles it takes to decode each command.
000: 97c000 jrqm 000
001: c10007 ld 0400,sr
002: c02006 ld 0080,dr
003: c03002 ld 00c0,b
004: 97c010 jrqm 004
005: 128081 mov dr,a
and dr,b
006: 91800c jnzb 003
007: c00007 ld 0000,sr
008: 0b0000 shr1 a ; xxxxxxx*
009: 9040a0 jca 028 ; xxxxxxx1
00a: 0b0000 shr1 a ; xxxxxx*0
00b: 904080 jca 020 ; xxxxxx10
00c: 0b0000 shr1 a ; xxxxx*00
00d: 904060 jca 018 ; xxxxx100
00e: 0b0000 shr1 a ; xxxx*000
00f: 90404c jca 013 ; xxxx1000
010: 0b0000 shr1 a ; xxx*0000
011: 9006b0 jnca 1ac ; xxx00000 0x00, 0x20
012: 9046cc jca 1b3 ; xxx10000 0x10, 0x30
013: 0b0000 shr1 a ; xxx*1000
014: 904784 jca 1e1 ; xxx11000 0x18, 0x38
015: 0b0000 shr1 a ; xx*01000
016: 900740 jnca 1d0 ; xx001000 0x08
017: 9046f4 jca 1bd ; xx101000 0x28
018: 0b0000 shr1 a ; xxxx*100
019: 904074 jca 01d ; xxxx1100
01a: 0b0000 shr1 a ; xxx*0100
01b: 9007d0 jnca 1f4 ; xxx00100 0x04, 0x24
01c: 904800 jca 200 ; xxx10100 0x14, 0x34
01d: 0b0000 shr1 a ; xxx*1100
01e: 9008fc jnca 23f ; xxx01100 0x0c, 0x2c
01f: 904940 jca 250 ; xxx11100 0x1c, 0x3c
020: 0b0000 shr1 a ; xxxxx*10
021: 904094 jca 025 ; xxxxx110
022: 0b0000 shr1 a ; xxxx*010
023: 9009ec jnca 27b ; xxxx0010 0x02, 0x12, 0x22, 0x32
024: 904d80 jca 360 ; xxxx1010 0x0a, 0x1a, 0x2a, 0x3a
025: 0b0000 shr1 a ; xxxx*110
026: 900e8c jnca 3a3 ; xxxx0110 0x06, 0x16, 0x26, 0x36
027: 905068 jca 41a ; xxxx1110 0x0e, 0x1e, 0x2e, 0x3e
028: 0b0000 shr1 a ; xxxxxx*1
029: 9040b8 jca 02e ; xxxxxx11
02a: 0b0000 shr1 a ; xxxxx*01
02b: 0b0000 shr1 a ; xxxx*x01
02c: 901120 jnca 448 ; xxxx0x01
02d: 905224 jca 489 ; xxxx1x01
02e: 0b0000 shr1 a ; xxxxx*11
02f: 9040cc jca 033 ; xxxxx111
030: 0b0000 shr1 a ; xxxx*011
031: 90128c jnca 4a3 ; xxxx0011
032: 9052f4 jca 4bd ; xxxx1011
033: 0b0000 shr1 a ; xxxx*111
034: 0b0000 shr1 a ; xxx*x111
035: 9053a8 jca 4ea ; xxx1x111 0x17, 0x1f, 0x37, 0x3f
036: 0b0000 shr1 a ; xx*0x111
037: 90133c jnca 4cf ; xx00x111 0x07, 0x0f
038: 9053c0 jca 4f0 ; xx10x111 0x27, 0x2f
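If you'd rather not count by hand, the tree can be walked mechanically. The following is only a rough sketch under several assumptions of mine, not anything taken from the ROM or the datasheet: that `shr1 a` moves the bit shifted out of A into the carry flag, that `jca`/`jnca` branch when that carry is set/clear (this is what the comments in the listing imply), and that every instruction, including a taken branch, costs one cycle. It starts counting at 008, so the wait loop at 000-007 is not included, and it only tries commands 0x00-0x3f. The names `LISTING`, `PROGRAM`, `decode` and `handlers` are made up for the example.

```python
# Decode tree from the listing above (addresses 008-038), comments dropped.
LISTING = """
008 shr1
009 jca 028
00a shr1
00b jca 020
00c shr1
00d jca 018
00e shr1
00f jca 013
010 shr1
011 jnca 1ac
012 jca 1b3
013 shr1
014 jca 1e1
015 shr1
016 jnca 1d0
017 jca 1bd
018 shr1
019 jca 01d
01a shr1
01b jnca 1f4
01c jca 200
01d shr1
01e jnca 23f
01f jca 250
020 shr1
021 jca 025
022 shr1
023 jnca 27b
024 jca 360
025 shr1
026 jnca 3a3
027 jca 41a
028 shr1
029 jca 02e
02a shr1
02b shr1
02c jnca 448
02d jca 489
02e shr1
02f jca 033
030 shr1
031 jnca 4a3
032 jca 4bd
033 shr1
034 shr1
035 jca 4ea
036 shr1
037 jnca 4cf
038 jca 4f0
"""

# address -> (mnemonic, jump target or None)
PROGRAM = {}
for line in LISTING.split("\n"):
    parts = line.split()
    if parts:
        target = int(parts[2], 16) if len(parts) > 2 else None
        PROGRAM[int(parts[0], 16)] = (parts[1], target)


def decode(command):
    """Walk the tree from 008; return (handler address, instructions executed)."""
    a, carry, pc, steps = command, 0, 0x008, 0
    while pc in PROGRAM:
        op, target = PROGRAM[pc]
        steps += 1  # assumption: one cycle per instruction
        if op == "shr1":
            carry, a = a & 1, a >> 1  # assumed: shifted-out bit lands in carry
            pc += 1
        elif (op == "jca") == bool(carry):  # jca on carry set, jnca on carry clear
            pc = target
        else:
            pc += 1
    return pc, steps


def handlers(commands=range(0x40)):
    """Group commands by the handler they reach: {address: (steps, [commands])}."""
    table = {}
    for cmd in commands:
        target, steps = decode(cmd)
        table.setdefault(target, (steps, []))[1].append(cmd)
    return table


if __name__ == "__main__":
    for target, (steps, cmds) in sorted(handlers().items()):
        print(f"{target:03x}: {steps:2d} instructions, "
              + ", ".join(f"0x{c:02x}" for c in cmds))
```

Under those assumptions the shortest paths are the ones ending at 448 and 489 (seven instructions, eight mirrors each), while a command with a single mirror such as 0x08 or 0x28 takes twelve or thirteen, which is the "more mirrors means less decoding" trade-off in numbers.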