
Posted: Sat Feb 27, 2010 3:36 pm
by bunnyboy
Doing that search for free space to put loader code is a good idea. I might try that if the $FF70 section is found to be used lots.

Posted: Sun Feb 28, 2010 11:36 am
by blargg
I typed up a nice Wiki page describing SPC uploading using the code I posted yesterday. I added more explanation of the various patches.

Now I'm running the code on a directory of 1200 SPC soundtracks, having it check for any that come up silent or don't start any new notes after the beginning (running on my SPC emulator of course). Should take a while.

Posted: Sun Feb 28, 2010 11:57 am
by bunnyboy
Wiki page looks great. Couple suggestions:

-Link to .c file on your server is bad. Can it be moved to the wiki?

-Add DSP register addresses. Yes they can be looked up elsewhere but would fit well in the table.

-Explain F0/F1 a bit more. I especially found almost no mention of the test register anywhere.

-A bit confusing: "Since we write DSP registers first, this delay will be taken care of by the time it takes to load the 64K RAM." If DSP regs are loaded first, the delay isn't from loading the RAM? Again maybe it makes sense and I am just missing something.


Can you also check your spc database for how many don't have the $FF70 space free, or don't have any FF strings?

Posted: Sun Feb 28, 2010 12:20 pm
by blargg
Feel free to edit and improve the Wiki page (one reason I made it such), including uploading the .c file. I'm not sure how I can add to Anomie's documentation of $F0 and $F1. He covers both fully. I clarified the EDL description, hopefully.

Currently I have the big test log when it must use the echo buffer, and when even that isn't available and it must choose an arbitrary address. I can just write a separate scanner that goes over all the archives and checks the $FF70 space (file-extractor to the rescue), noting how many bytes are all $FF.
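
A minimal sketch of what such a scanner could check, in C. It assumes the 64K RAM image has already been pulled out of the .spc file, and that the candidate area runs from $FF70 up to just below $FFC0 (80 bytes). The function names and the region bounds are illustrative, not taken from the actual scanner.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed bounds of the candidate loader area: $FF70 up to (but not
   including) $FFC0, where the IPL ROM region starts. */
#define LOADER_START 0xFF70u
#define LOADER_END   0xFFC0u

/* Count how many bytes in the candidate area are $FF. */
unsigned count_ff_bytes(const uint8_t *ram)
{
    unsigned count = 0;
    for (size_t addr = LOADER_START; addr < LOADER_END; addr++)
        if (ram[addr] == 0xFF)
            count++;
    return count;
}

/* Treat the area as free only if every byte in it is $FF. */
int loader_area_free(const uint8_t *ram)
{
    return count_ff_bytes(ram) == LOADER_END - LOADER_START;
}
```

Reporting the count rather than a plain yes/no makes it easier to spot dumps where the area is only partly used.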

Posted: Mon Mar 01, 2010 3:12 am
by mic_
bunnyboy wrote: Can you also check your spc database for how many don't have the $FF70 space free, or don't have any FF strings?
I found one right away; Bishoujo Senshi Sailormoon. It does have some FF and 00 strings at other locations, but they're rather short so I'm not sure if they're padding or actual data.

EDIT: I found a few more:

Donkey Kong Country 3
Dracula X
Gokujou Parodius
Metal Morph (maybe)
StarFox
Star Ocean
Super Punchout

I tried these with my loader and none of them crash. None of them sound strange either, except maybe for the song I tried from Star Ocean (so-01.spc).

Posted: Mon Mar 01, 2010 6:30 am
by blargg
I just got a no-patch method working, where I execute the final code out of the four I/O registers (yes, it's as crazy as it sounds). The only modification to the SPC RAM is pushing three bytes (the final PSW and PC to jump to).
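
As a rough illustration of that three-byte push, here is a C sketch that operates on a 64K RAM image. It assumes the usual SPC-700 layout (stack in page 1, growing downward) and that RETI pops PSW first, then PC low, then PC high. The function name is hypothetical.

```c
#include <stdint.h>

/* Push the frame that RETI will later pop: PC high, PC low, then PSW.
   The stack lives at $0100-$01FF and grows downward; SP wraps within
   the page. Returns the adjusted stack pointer. */
uint8_t push_reti_frame(uint8_t *ram, uint8_t sp, uint16_t pc, uint8_t psw)
{
    ram[0x0100 + sp--] = (uint8_t)(pc >> 8);
    ram[0x0100 + sp--] = (uint8_t)(pc & 0xFF);
    ram[0x0100 + sp--] = psw;
    return sp;
}
```

The returned value is the SP the SPC should hold just before RETI runs.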

After you've loaded the DSP registers and 64K RAM, you tell the bootloader to start executing $00F5. The SPC bootloader code reaches this point, where it acknowledges to the S-CPU, then jumps to $00F5:

Code:

MOV $F4,A       ; 4 send acknowledgement to S-CPU
MOV A,Y         ; 2
MOV X,A         ; 2
BNE Trans       ; 2
JMP [$0000+X]   ; 6 jump to $00F5
You have the S-CPU sit in a tight loop watching for acknowledgement. The moment that occurs, you write $2F $FE to the last two I/O registers. The SPC will be executing the NOP at $00F5, then your new instruction, BRA *-2, a loop.

Now you have the SPC running a two-byte loop at $00F6. You then load a one or two-byte instruction into $00F4 (pad it with a NOP if only one byte), then write $FC to the fourth I/O register. This will change the loop to BRA *-4, so it'll execute the instruction you just wrote, over and over. Delay a bit so it has time to execute, then write $FE to the fourth I/O register again to stop executing the other instruction. Repeat for each instruction you want to execute. When you're done, execute RETI, which will restore PSW from the stack, then do RET. Currently, I execute 18 instructions (36 bytes) via this method to do the final restoration.
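
The per-instruction sequence above can be sketched in C as a list of port writes. This is only a model of the description, not the actual uploader: `PortWrite`, `build_exec_sequence` and `bra_target` are made-up names, and the delay between the last two writes is left out.

```c
#include <stdint.h>

#define SPC_NOP 0x00  /* SPC-700 NOP opcode, used as padding */
#define SPC_BRA 0x2F  /* SPC-700 BRA rel opcode, sits at $00F6 */

/* Target of a two-byte BRA located at `addr` with signed offset `rel`.
   The offset is relative to the address after the instruction. */
uint16_t bra_target(uint16_t addr, uint8_t rel)
{
    return (uint16_t)(addr + 2 + (int8_t)rel);
}

/* One write to an I/O port, named by its SPC-side address ($F4-$F7). */
typedef struct { uint8_t spc_addr; uint8_t value; } PortWrite;

/* Fill `out` with the writes needed to run one instruction of 1 or 2
   bytes. Returns the number of writes (always 4). The S-CPU must wait
   between out[2] and out[3] so the SPC executes the instruction. */
int build_exec_sequence(const uint8_t *instr, int len, PortWrite out[4])
{
    out[0] = (PortWrite){ 0xF4, instr[0] };
    out[1] = (PortWrite){ 0xF5, len > 1 ? instr[1] : SPC_NOP };
    out[2] = (PortWrite){ 0xF7, 0xFC };  /* BRA *-4: loop now covers $F4 */
    out[3] = (PortWrite){ 0xF7, 0xFE };  /* BRA *-2: back to the idle loop */
    return 4;
}
```

With the BRA at $00F6, an offset of $FE branches back to $00F6 and an offset of $FC branches to $00F4, which is why flipping that one byte is enough to switch the instruction slot in and out of the loop.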

I tested the timing and it's very generous. I had the DRAM refresh delay (40 master clocks) occur at each timing for the S-CPU, and it doesn't bother it. I inserted delays at critical steps, and it took quite a bit to delay too much. The only critical part is getting it into the loop; when executing instructions, you just need to delay enough for it to execute at least once.

I didn't test whether the SPC reading at the same time the S-CPU is writing to an I/O register causes problems. I've read that it might OR the two values. I reasoned that it can't XOR them, as that would break the bootloader. I analyzed the above and ORing wouldn't break it. The only place the SPC would be reading a byte while the S-CPU is writing is the branch offset at $00F7. That will be either $FE or $FC, which differ only at bit 1. ORing these two yields $FE, which is fine for it to read in that case.
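
The bit arithmetic behind that argument can be checked with a few lines of C. This only models the hypothesized OR behaviour mentioned above; whether the hardware really does this is untested, and the function names are illustrative.

```c
#include <stdint.h>

/* Value the SPC would see if a read landed during a write and the
   hardware ORed the old and new port values (hypothesized behaviour). */
uint8_t ored_read(uint8_t old_value, uint8_t new_value)
{
    return old_value | new_value;
}

/* Bits that differ between two values. */
uint8_t differing_bits(uint8_t a, uint8_t b)
{
    return a ^ b;
}
```

For $FE and $FC the differing bits come out as $02 (bit 1 only), and the OR is $FE in either direction, so a collision would at worst keep the SPC in the idle BRA *-2 loop for one more pass.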

I'm still testing, but it's working pretty well. Sometimes songs don't play, but the same occurred for the old loader, so I'm not sure it has to do with this new method. I'll post more once I've banged on it more. I'm surprised I actually got this working.

Posted: Mon Mar 01, 2010 8:02 am
by mic_
High points for the absurd programming technique, even if it ends up fixing maybe a handful of games :)

Posted: Mon Mar 01, 2010 5:36 pm
by blargg
OK, after about 14 hours, I've got this working well. I wrote a version that runs entirely on the SNES from an UNPATCHED SPC. You give it a buffer with the 64K SPC data, a buffer with the 128 DSP registers, and a buffer with the first 64 bytes of the SPC header (they can all be in ROM). Then it uploads to the SPC-700 using this execute-from-I/O method.

I've also got the I/O technique working perfectly to where it doesn't even execute an instruction more than once, so you can run PUSH A for example and not have it push more than one byte. This was just too interesting a technique to not attempt to make solid and reliable.

One reason I wrote this SNES version is I figured bunnyboy might be able to use it in his SNES player (I'm assuming it runs on the S-CPU, and not something custom). Will post something tomorrow.

Posted: Mon Mar 01, 2010 8:05 pm
by bunnyboy
Bonus points for originality :) Are you still having the problems of songs sometimes not starting? Any significant advantage to this method compared to just patching the .SPC?

Posted: Tue Mar 02, 2010 5:29 am
by mic_
The reason why one of the songs from Star Ocean sounds a bit strange when I use my loader could be that it appears to use $FFC0..$FFFF as normal RAM. It has what appears to be actual data in that area, and that data does not match the IPL machine code sequence.

Posted: Tue Mar 02, 2010 7:30 am
by mic_
I wrote a brief article about the experiences from writing my SPC loader, for all you peoples from the future (won't anybody think of the children?!)

Posted: Tue Mar 02, 2010 11:27 am
by caitsith2
mic_: You may wish to include Terranigma in your test set. You will see why skipping the writing of FFC0-FFFF, and forcing register F1 to 8x is a bad idea.

Oh, and FF70 in that game's SPC driver is USED. Better implement a free-space search algorithm, then use FF70 as a LAST resort.

Write 00 to both the KON and KOFF DSP registers, and load the correct KON value in the bootloader.
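
A small C sketch of that suggestion, assuming the 128 DSP registers are held in a plain array indexed by register address (KON at $4C, KOFF at $5C). The function name is made up for illustration.

```c
#include <stdint.h>
#include <string.h>

#define DSP_KON  0x4C  /* key-on register */
#define DSP_KOFF 0x5C  /* key-off register */

/* Copy the 128 DSP registers into `out` with KON and KOFF zeroed.
   Returns the original KON value so the bootloader can write it last. */
uint8_t prepare_dsp_regs(const uint8_t in[128], uint8_t out[128])
{
    memcpy(out, in, 128);
    out[DSP_KON]  = 0;
    out[DSP_KOFF] = 0;
    return in[DSP_KON];
}
```

The uploader would send `out` during the normal DSP register load and keep the returned byte for the final KON write.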

Posted: Tue Mar 02, 2010 1:48 pm
by mic_
Ah, TCALL is evil. Won't be too hard to fix these problems though.

Posted: Tue Mar 02, 2010 3:16 pm
by blargg
Here's source to the no-patching version, that runs on the SNES: upload_spc_nopatch.zip. It came out quite short. The main benefit is that you don't have to modify anything (except the three bytes pushed on the stack). This also means you don't have to search for space to put the final code (simple enough when running on a PC, but tedious if the patching code itself has to run on the SNES). So far I haven't found any problems with this, though I have no way of automating testing on the SNES.

Posted: Thu Mar 04, 2010 7:43 am
by mic_
I've made some changes to my code so that e.g. Terranigma now works: http://jiggawatt.org/badc0de/spcplayer-1.2.zip

The way it works now is like this:

* Turn echo off in FLG, then init the rest of the DSP regs.

* Upload a small routine at $0002 in SPC RAM which inits $00, $01, KON, KOFF, sets up $f1 so that $ffc0..$ffff is normal RAM, receives all data between $00f8..$ffff at two bytes per iteration, and finally branches to the init routine.

* The init routine is placed at some address in SPC RAM which is determined in the following way:

# If EDL is non-zero, the address is set to ESA*$100 + $200. $200 is added to avoid pages 0 and 1 in case ESA is 0.
# If EDL is zero, the entire SPC RAM dump is scanned for a free chunk of memory (one that contains a string of only FF or 00).
# If both of these fail, $ff70 is used as a last resort.

* The init routine sets up the SPC registers, FLG, receives all data between $0000..$00f1, and finally branches to the SPC entry point.
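
The address selection for the init routine could look roughly like this in C. This is a sketch under assumptions, not the code from spcplayer: the scan range ($0200 up to $FFC0), the requirement that a free chunk be a run of one repeated value (all FF or all 00), and all function names are guesses made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define SCAN_START  0x0200u  /* skip pages 0 and 1 (assumption) */
#define SCAN_END    0xFFC0u  /* stop before $ffc0..$ffff (assumption) */
#define LAST_RESORT 0xFF70u

/* Find the first run of `len` identical bytes, all $FF or all $00.
   Returns the start address, or 0 if no such run exists. */
uint16_t find_free_chunk(const uint8_t *ram, size_t len)
{
    size_t run = 0;
    for (size_t addr = SCAN_START; addr < SCAN_END; addr++) {
        uint8_t b = ram[addr];
        int filler = (b == 0xFF || b == 0x00);
        if (filler && run > 0 && ram[addr - 1] == b)
            run++;                 /* same filler value continues */
        else
            run = filler ? 1 : 0;  /* new run starts, or no run */
        if (len > 0 && run >= len)
            return (uint16_t)(addr + 1 - len);
    }
    return 0;
}

/* Pick where to place the init routine, following the three rules. */
uint16_t choose_init_addr(const uint8_t *ram, uint8_t esa, uint8_t edl,
                          size_t len)
{
    if (edl != 0)  /* echo buffer in use: put the routine inside it */
        return (uint16_t)(esa * 0x100 + 0x200);  /* wraps at 64K */
    uint16_t addr = find_free_chunk(ram, len);
    return addr != 0 ? addr : LAST_RESORT;
}
```

Here `len` is the size of the init routine in bytes. Returning 0 for "not found" is safe in this sketch because the scan never starts below $0200.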