INL-ROM custom MMC3 hybrid mapper design

Post by lidnariq »

infiniteneslives wrote: If we were to use something like this for the bi-annual homebrew compo you could just sell the new SPI chip and users could swap it out with their cart.
Which is a pretty good argument for considering using (Micro)SD, actually. More expensive, but reprogrammable by almost everyone.

Post by infiniteneslives »

lidnariq wrote:
infiniteneslives wrote: If we were to use something like this for the bi-annual homebrew compo you could just sell the new SPI chip and users could swap it out with their cart.
Which is a pretty good argument for considering using (Micro)SD, actually. More expensive, but reprogrammable by almost everyone.
True, although it's significantly more difficult to design and implement, since it effectively requires an MCU. For the SPI flash I 'merely' need to toss a shift register with proper controls into my CPLD. If I'm going through the work of adding an MCU, I'd rather give it USB connectivity to reprogram the SPI flash, especially since I've already got most of that work done with the NESDEV1; I'd just need to swap to SPI instead of parallel. A USB socket is also cheaper than a microSD socket and card. Having USB would have the added benefit of making game development less cumbersome, and not necessarily slower if the whole ROM didn't need to be reprogrammed. Plus, if you only wanted to publish a game with this setup, you wouldn't have to include the added cost of the MCU, socket, and flash card.
If you're gonna play the Game Boy, you gotta learn to play it right. -Kenny Rogers

Post by infiniteneslives »

Okay, so going with the assumption that this is a go, I'd like to work out how the data exchange might go so I can come up with the hardware design. Sorry for the HUGE post; I was basically bored this evening and decided to type this up while I thought through everything.

So we've got the issue of where ROM, RAM, and the mapper registers are all located. I figure the best way is to keep the MMC3 registers untouched, which would require all loads/writes to PRG memory to occur while that memory is mapped to $6000-7FFF only. So you'd map the 8KB bank there with some variant of MMC3 PRG bank control, then load it up with your game data.

So the SPI byte can't very easily be mapped to $6000-FFFF if we want to keep the data exchange simple. To keep from requiring too many more address inputs, we could place it at $5000-5FFF with the addition of PRG A12 as an input. The most recently read byte would sit there until the mapper was 'signaled' to read the next byte.

Now to figure out how to handle the commands and addresses sent to the SPI flash via the mapper. For anyone interested in the details, this is the data sheet I'm looking at for the Winbond 2MB SPI flash.

I was trying to figure out some way to make writes to the SPI flash non-serial, so that writing the 8-bit instruction and 24-bit address wouldn't be so slow. But after some looking into things, I don't think it's worth spending the logic. I figure we start with the bare-bones essentials; then if additional things are needed or desired, we can weigh the trade-off between CPU time and mapper logic. Additionally, this keeps things independent of what type of SPI chip you're using. Basically the mapper doesn't care how big it is, how large the pages are, whether it's EEPROM or flash, etc. So even if someone were interested in this for something small like save data alone, the mapper doesn't care. Emulator authors, you're on your own I guess... The good news is there are data sheets for this stuff, and the commands are pretty universal.

For anyone unaware or not interested in reading the data sheet, the SPI flash is pretty simple, so I'll spell out the basics. You write an 8-bit command followed by the address, if applicable. For reads you just continue to clock the chip and it spits out data bit by bit, byte after byte, on each clock until you disable it by taking /CS high. Similarly, for writes you just continue to write the data you'd like to save, assuming you set things up properly and erased the page in flash beforehand. Once you're done with the long stream of reading/writing, you take /CS high to finish the process. To start another access you take /CS low and repeat the process with the next command, address, data, etc. Trust me though, if you want to write anything to the chip from the NES you'll have to look through the data sheet. If you're just reading data, the discussion below is probably enough.

I figure the best way to signal the mapper to read the next byte is to write to a control register. Conveniently, we've also got PRG A0 as an input, so I figure we'll have two 'SPI registers' at $5000 and $5001 (more specifically: $5xxEVEN and $5xxODD in normal MMC3 style). Here are the definitions I'm thinking:

-----------------------
$5000 "SPI WRITE": All writes to this register are fed directly to the SPI flash, so this is where you write commands and data. Only PRG D7 is seen by the SPI flash, so each write sends one bit. This is where you give the read command followed by the address before data can be pumped out by the mapper. You also write save data here serially, bit by bit (like reading controllers, but with writes). Don't forget you'll have to supply the write command followed by the SPI address you want to save data to. Reading this register is ALSO how you get the full bytes that the mapper pumps out for you.

-----------------------
$5001 "SPI READ/Mapper command": This register enables and disables the SPI flash by controlling the /CS pin on the chip. I figure we'll just use D7: writing any value with D7=0 enables the SPI flash, and writing any value with D7=1 disables it. This is also the register that commands the mapper to fetch the next byte from the SPI flash so you can read it out as one full byte. For now we'll say that writing any value with D7=0 while the flash is already enabled commands the mapper to fetch the next byte. Writing D7=1 disables the flash and stops the read data stream.


So I basically need 9 CPU clock cycles to fetch each byte: one cycle per bit and one more to reset my control circuitry and set things up to pump out the next byte. I *COULD* cut this down to 8 cycles and ABSOLUTELY REQUIRE 8 cycles, no more, no less. Basically I'd only clock in 7 bits, and the 8th bit wouldn't be clocked into the shift register; it'd just be placed on PRG D0 for the required READ on the 8th cycle. I don't think this is very user friendly though, and it could easily cause data to be read improperly. In my loop I insert a NOP so that the STA/LDA pair following each fetch command stretches from 9 cycles (5 for the indexed STA, 4 for the LDA) to 11, which leaves some margin.



TL;DR:

Here would be the sequence of operations to read data from the SPI flash:

1) Write to $5001 with D7=0 to enable the SPI flash. (While this does reset my pump/shift register control circuitry, this initial command WILL NOT count as the command to fetch the FIRST byte.)

2) Write serially to the SPI flash via $5000 bit PRG D7. Things are written MSB first, so you'll have to write 03h for the read command followed by the 24-bit address.

3) Now write to $5001 with PRG D7=0 to give the 'read full byte' command to the mapper.

4) Wait 9 clock cycles. You can do anything during this time except read/write $5xxx. In a loop, this is where I store the data from the previous read.

5) Read the first/next byte from $5000.

6) Read the next byte by looping back to step 3.

7) When DONE reading, write to $5001 with PRG D7=1.

Now, you can save yourself some CPU time by skipping step 7. If you know the next stream of data you'd like to read is sequential from your current read, you can just let the mapper and flash sit there idle. You could then come back 5 minutes later and read the next byte in the stream. Maybe the best way is to just leave the flash enabled after the read; then, before you start your next read/write cycle, you decide whether you need to disable, enable, and issue another command.
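
For step 2, here is a minimal sketch of how the command and address might be shifted out through bit 7 of $5000. The names SPI_DATA, SPI_CTRL, spi_addr, spi_begin_read, and spi_send_byte are made up for illustration, the zero-page location is an arbitrary assumption, and the syntax is generic asm6/ca65 style.

```asm
SPI_DATA = $5000        ; (name assumed) writes: D7 is shifted into the flash
SPI_CTRL = $5001        ; (name assumed) D7=0 enable/fetch, D7=1 disable
spi_addr = $10          ; (assumed) 3 zero-page bytes, most significant first

; Begin a sequential read at the 24-bit address in spi_addr.
; Clobbers A and Y.
spi_begin_read:
	LDA #$80
	STA SPI_CTRL        ; D7=1: /CS high, ends any stream still in progress
	LDA #$00
	STA SPI_CTRL        ; D7=0: /CS low (step 1; does not count as a fetch)
	LDA #$03            ; read command
	JSR spi_send_byte
	LDA spi_addr        ; address bits 23-16
	JSR spi_send_byte
	LDA spi_addr+1      ; address bits 15-8
	JSR spi_send_byte
	LDA spi_addr+2      ; address bits 7-0
	; fall through to send the last byte and return

; Shift the byte in A out to the flash, MSB first, one bit per write.
; Clobbers A and Y.
spi_send_byte:
	LDY #$08            ; bit counter
spi_send_bit:
	STA SPI_DATA        ; only D7 reaches the flash
	ASL A               ; move the next bit up into D7
	DEY
	BNE spi_send_bit
	RTS
```

After spi_begin_read returns, the caller would issue the first fetch (step 3) and wait out the 9 cycles before reading $5000. Taking /CS high first is only a precaution of this sketch, so the routine behaves the same whether or not an earlier stream was left open.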

Here is the code I wrote up as an example. Obviously there may be better ways to do this, but this should explain how it all works.

Code:

;;;;;;copy SPI to $6000-$7FFF routine;;;;;;;;
;first you must place the desired PRG RAM bank at $6000-7FFF via the MMC3 style control registers. (details later)

LDY #00		
STY $5001	; Writing to $5001 with D7=0 enables the SPI flash for access. (takes /CS low)

;Now you must serially write to the SPI via $5000 bit 7: the read command (03h) followed by the 24bit address, MSB first.

;Start unloading data now that everything is set up!
LDY #00		
STY $5001	;command to read the FIRST byte (with D7=0 still)
LDX #$00	;2cyc; set up loop counter and provide 2 cycle delay for SPI data pump
NOP			;2cyc;
NOP			;2cyc; need total of 9 cycles to setup pump timing for entry to loop
NOP			;2cyc;
NOP			;2cyc; okay it's been 10 cycles since STY $5001, enter loop
load_spi_to_wram:   ;copies 8KB from SPI flash into page at $6000-7FFF
	LDA $5000		;mapper places most recent flash read at $5000 (decoded by PRG A0,12-15)
	STY $5001		;command to mapper to fetch next byte
	STA $6000, x	;store first byte that was read
	NOP				;provides at least 9 cycle delay from STY $5001
	LDA $5000		;read byte
	STY $5001		;fetch command
	NOP				;delay
	STA $6100, x	;store byte
	LDA $5000		
	STY $5001		
	NOP				
	STA $6200, x	  
	LDA $5000		;4cyc
	STY $5001		;4cyc
	NOP			;2cyc
	STA $6300, x	;5cyc (indexed stores always take 5)
	...
	LDA $5000
	STY $5001		;fetch the byte that the top of the loop reads next
	STA $7F00, x	;store byte; the INX/BNE below provide the delay
	INX				
	BNE load_spi_to_wram	;the last pass leaves one extra fetched byte waiting at $5000
	
;;end the read stream if you know your next SPI access isn't going to be a sequential read.
LDY #$80
STY $5001	;writing to $5001 with D7=1 disables the SPI flash (takes /CS high)

	;;~15cyc per byte * 8192bytes = ~123K cycles / 29780 = ~4.1 NTSC frames
Obviously you wouldn't have to do an entire 8KB loop, but assuming I haven't made too many mistakes, that should work, I'd think. Additionally, it'd require your data to be arranged in the matching order on the SPI flash to support this non-sequential copy loop. Maybe you guys can come up with a better solution/loop; I just did this to sort things out for myself. Copying data to the pattern tables is even easier, with just a repetitive read $5000, delay, write $2007 loop.
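
The $2007 copy mentioned above might look something like this sketch. It assumes rendering is off, the destination address has already been written to $2006, the read command and address have been sent, and the first fetch was commanded at least 9 cycles before entry. The label name and the 4KB size are assumptions for the example.

```asm
; Copy 4KB (16 pages of 256 bytes) from the flash stream to PPU memory.
	LDX #16             ; page counter; stays in 1..16, so D7 is always 0
	LDY #$00            ; byte counter within the page
copy_spi_to_ppu:
	LDA $5000           ;4cyc; read the byte the mapper fetched
	STX $5001           ;4cyc; any value with D7=0 commands the next fetch
	STA $2007           ;4cyc; write to the PPU, address auto-increments
	INY                 ;2cyc
	BNE copy_spi_to_ppu ;3cyc; STA/INY/BNE take 9 cycles before the next LDA starts
	DEX
	BNE copy_spi_to_ppu
```

Since the data lands sequentially in PPU memory, nothing has to be reordered on the flash for this one. At 17 cycles per byte, 4KB works out to roughly 70K cycles, or about 2.3 NTSC frames. As with the PRG RAM loop, the final pass commands one fetch too many, so the next byte of the stream is left waiting at $5000.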
Last edited by infiniteneslives on Sun Sep 16, 2012 10:58 am, edited 2 times in total.

Post by lidnariq »

infiniteneslives wrote:So I basically need 9 CPU clock cycles to fetch each byte. one cycle per bit and one more to reset my control circuitry and set things up to pump out the next byte.
Could you use the NES's 21/26MHz clock source? I guess the downside is that it's not Famicom/famiclone compatible. A crystal/resonator (digikey:HWZT-12.00MD, 12MHz, 28¢/1)? Or use both edges of M2 somehow? Winbond's large SPI EEPROMs can be clocked at up to 104MHz, so there doesn't seem to be a relevant upper bound. Or can you use the Winbond quad/dual SPI modes?

On the other hand, you can't really beat 224kB/s and you're still talking about aggregate read speeds of 200kB/s so whatever.

Post by infiniteneslives »

lidnariq wrote:
infiniteneslives wrote:So I basically need 9 CPU clock cycles to fetch each byte. one cycle per bit and one more to reset my control circuitry and set things up to pump out the next byte.
Could you use the NES's 21/26MHz clock source? I guess the downside is that it's not Famicom/famiclone compatible. A crystal/resonator (digikey:HWZT-12.00MD, 12MHz, 28¢/1)? Or use both edges of M2 somehow? Winbond's large SPI EEPROMs can be clocked at up to 104MHz, so there doesn't seem to be a relevant upper bound. Or can you use the Winbond quad/dual SPI modes?

On the other hand, you can't really beat 224kB/s and you're still talking about aggregate read speeds of 200kB/s so whatever.
Yeah, I considered most of those things actually. I also thought about doing something like using a RMW instruction with direct reads from the SPI and writes to $6000. So a LARGE unrolled loop could conceivably do it in 6 cycles per byte (~290KB/s), with a lot of trickery, complexity, logic expense, I/O, components, etc. Like you, I realized it was plenty fast anyway, so none of it's really justified.

Super simple, super cheap, plenty fast, tons of ROM so I'm happy ;).

Post by tepples »

Writing the saved game serially might be a little slow, but I guess players expect that.

Post by infiniteneslives »

tepples wrote:Writing the saved game serially might be a little slow, but I guess players expect that.
Slow for the CPU, but pretty quick for the player. I'm used to saves taking a few seconds on everything but the NES, where it takes no time with battery backing. So even if you somehow managed to come up with 256 bytes of save data (a full page of flash), it'll still be under a frame's worth of time.

I wrote up the routine quickly; keep in mind it could be even faster if you unrolled each byte. Here I figured worst case, looping on each bit.

Additionally, I think I'm going to change my mind about D0 being connected to the SPI for the $5000 register. The SPI handles everything MSB first, so instead of hassling with rolling the MSB around to the LSB, it just makes more sense to connect D7 to the SPI flash data input.

Code:

;;;;save data to SPI routine;;;;
;this routine writes a full page of SPI flash
;before running you must erase the page 
;and write the page program command (02h) and 24bit address
;alternatively you could load the command and address into your 'save_data' array:
		;02h, addr2, addr1, addr0, save data (252 bytes)
;then this routine would give the program page command, address, and save_data all at once

LDY #00      
STY $5001   ; Writing to $5001 with D7=0 enables the SPI flash for access. (takes /CS low)

LDX #$00
write_to_SPI: 
	LDA save_data, X	;4cyc; load byte
	LDY #$08			;2cyc; bit counter
	save_byte:
		STA $5000		;4cyc * 8; write MSB to SPI (only D7 is connected)
		ASL A			;2cyc * 8; move bit 6 to D7
		DEY			;2cyc * 8;
		BNE save_byte	;3cyc * 7 + 2cyc last;
	INX				;2cyc; increment byte counter
	BNE write_to_SPI	;3cyc

LDY #$80
STY $5001   ;writing to $5001 with D7=1 disables the SPI flash to end the write (takes /CS high)

	;TOTAL time: ~100 cycles per byte = ~25.6K cycles = ~1frame
At a glance, unrolling the byte loop would bring it down to around 55 cycles per byte, making it twice as fast, which is around 32KB/sec. I guess if you wanted to be safe and read the data back to verify every byte, then it'll take longer, obviously. You could read back and compare all in one loop with only a few instructions, so it's still not going to take more than a frame or two. And 256 bytes is a lot of save data; you don't have to program the entire page if you don't have that much data.
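
To make those two ideas concrete, here is a rough sketch of the unrolled write and of a read-back compare. It assumes save_data holds a full 256 bytes, that the flash has already been given the appropriate command and address before each routine, and, for the verify, that the first fetch was commanded at least 9 cycles earlier. The labels and the save_data location are placeholders, not part of the design.

```asm
save_data = $0300       ; (assumed) page-aligned 256-byte buffer in RAM

; Unrolled page write: 4 + 8*4 + 7*2 + 2 + 3 = 55 cycles per byte.
write_page_unrolled:
	LDX #$00
write_next_byte:
	LDA save_data, X    ;4cyc
	STA $5000           ;4cyc; bit 7
	ASL A               ;2cyc; move the next bit up into D7
	STA $5000           ; bit 6
	ASL A
	STA $5000           ; bit 5
	ASL A
	STA $5000           ; bit 4
	ASL A
	STA $5000           ; bit 3
	ASL A
	STA $5000           ; bit 2
	ASL A
	STA $5000           ; bit 1
	ASL A
	STA $5000           ; bit 0
	INX                 ;2cyc
	BNE write_next_byte ;3cyc
	LDY #$80
	STY $5001           ; D7=1: /CS high ends the write
	RTS

; Read-back compare. Returns carry clear if all 256 bytes match, carry set otherwise.
verify_page:
	LDX #$00
	LDY #$00            ; value with D7=0 for the fetch command
verify_next_byte:
	LDA $5000           ; byte fetched from the flash
	STY $5001           ; command the next fetch
	CMP save_data, X
	BNE verify_failed
	INX
	BNE verify_next_byte ; CMP/BNE/INX/BNE give 11 cycles before the next LDA
	CLC                 ; all 256 bytes matched
	BCC verify_end      ; always taken
verify_failed:
	SEC
verify_end:
	LDY #$80
	STY $5001           ; D7=1: /CS high ends the read (carry is untouched)
	RTS
```

One thing to keep in mind between the two routines: per the data sheet, the flash stays busy for a short while after /CS goes high at the end of a page program, so the verify has to wait (or poll the status register) before it starts its read.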

Post by 3gengames »

I'm sure with good game design, you can make it seamless. Palette fade? Save during it, since the game won't be playing. Screen switches? Write a page. Save point in your game? Make it save and then play a sound effect so the player knows. I'm sure you can find 1-10 frames in gameplay that you can use as a save point to seamlessly add it.

Post by tokumaru »

I really don't think this is an issue... People are used to messages like "saving the game, please don't turn the power off" being displayed for a few seconds, and I honestly don't remember anyone complaining. And apparently this will be fast enough to not even require a message. As long as it takes less than 1 second, I don't think you need a message.

Post by zzo38 »

tokumaru wrote:I really don't think this is an issue... People are used to messages like "saving the game, please don't turn the power off" being displayed for a few seconds, and I honestly don't remember anyone complaining. And apparently this will be fast enough to not even require a message. As long as it takes less than 1 second, I don't think you need a message.
If it takes longer than one frame, display a message anyways, just to make sure.

Post by thefox »

zzo38 wrote:
tokumaru wrote:I really don't think this is an issue... People are used to messages like "saving the game, please don't turn the power off" being displayed for a few seconds, and I honestly don't remember anyone complaining. And apparently this will be fast enough to not even require a message. As long as it takes less than 1 second, I don't think you need a message.
If it takes longer than one frame, display a message anyways, just to make sure.
Come on. It's actually BAD to display messages if the message won't be visible long enough for the player to see it properly.

Post by Kasumi »

Or use an icon rather than a message. I know games that save extraordinarily fast and have a little SD card/floppy disk icon that appears in a bottom corner while saving.

Post by tepples »

Yeah, sort of like the little floppy disk that would blink in the corner when Doom 1 would stall for loading. Super Smash Bros. Melee has a "saving" icon in the corner as well.

Post by Grapeshot »

If you have the SPI flash and a shift register, would it be hard to add a way to stream PCM or DPCM audio from the flash to the expansion sound pin? I guess 1-bit PCM would be trivial (just connecting the LSB of the shift register to a spare CPLD pin and then automatically retriggering the read command) but for 4 or 8 bit audio you would need another latch and more spare pins on the CPLD.

Post by infiniteneslives »

Grapeshot wrote:If you have the SPI flash and a shift register, would it be hard to add a way to stream PCM or DPCM audio from the flash to the expansion sound pin? I guess 1-bit PCM would be trivial (just connecting the LSB of the shift register to a spare CPLD pin and then automatically retriggering the read command) but for 4 or 8 bit audio you would need another latch and more spare pins on the CPLD.
That's an interesting thought... I had only considered storing DPCM samples on the SPI flash, loading them to RAM, and playing them. But I'd think something like you're imagining could be possible as well, assuming an EXP audio jumper/resistors were installed.

You'll have to forgive me, I'm not much of a sound buff, but I am interested in the possibilities, so feel free to correct me on this stuff or suggest better solutions. I made the EXP pins easily accessible by extending all the pins into the cart (you don't have to chip away at the cart shell to access them). The CPLD that's going to handle the SPI flash should have a free pin that could be assigned to the task. Or, if you were accepting of a 0-3.3V signal, you wouldn't even need a CPLD pin; the SPI output could be connected directly to the EXP pin.

So really I'd imagine doing it a little differently than using the SPI for game/save/graphics data. It could be set up to just run free: after writing the command and address to the SPI via $5000 bit 7, reads would be automatically enabled (all this really means is the SPI needs to be continually clocked after the read command/address). The SPI would just spit out the data stream until the chip was disabled by writing to $5001 with D7=1. You wouldn't even bother with the shift register; just let the flash stream bits on each clock pulse. I'm guessing 1.79MHz would be a little faster than desired for an audio stream, so a clock divider could be put in the shift register's place.

I'd guess you'd also want a low-pass filter, and could easily locate that in the perf area.

If there was logic to spare, both the shift register and clock divider could be implemented at once. I'd just have to add another definition to $5001. Perhaps something like D6=0 for the divided-clock bit stream to the EXP pin, and D6=1 for byte feeding as discussed previously. D7 would still enable/disable the SPI, which would stop either the bit stream or byte feed reads.

EDIT: It wouldn't be required, but it might be nice: the SPI's hold pin/function would basically act like a 'pause' for the bit stream, so you could stop the stream and pick up where you left off if the mapper was given control of that pin, perhaps via D5 on $5001. We'll see how much logic and pins are available, but if desired this could be considered as well.
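
If $5001 grew the extra bit floated above, starting a free-running audio stream might look like the sketch below. All of this is hypothetical: the bit assignments are only the 'perhaps' from this post, the constant and label names are invented, and the sample address is arbitrary.

```asm
SPI_CTRL_DISABLE = $80  ; D7=1: /CS high, stops either kind of read
SPI_CTRL_STREAM  = $00  ; D7=0, D6=0: divided-clock bit stream to the EXP pin
SPI_CTRL_BYTE    = $40  ; D7=0, D6=1: byte feeding through $5000

; Start streaming audio bits from flash address $100000 (arbitrary example).
; Clobbers A and Y.
start_audio_stream:
	LDA #SPI_CTRL_DISABLE
	STA $5001           ; close any stream left open earlier
	LDA #SPI_CTRL_STREAM
	STA $5001           ; /CS low with stream mode selected
	LDA #$03            ; read command
	JSR send_spi_byte
	LDA #$10            ; address bits 23-16
	JSR send_spi_byte
	LDA #$00            ; address bits 15-8
	JSR send_spi_byte
	LDA #$00            ; address bits 7-0
	; fall through: once the last address bit is written, the mapper would
	; keep clocking the flash on its own until $5001 is written with D7=1

; Shift the byte in A out through $5000 bit 7, MSB first.
send_spi_byte:
	LDY #$08
send_spi_bit:
	STA $5000
	ASL A
	DEY
	BNE send_spi_bit
	RTS
```

Note that under this bit layout the byte-fetch loops from earlier would have to write $40 to $5001 instead of $00, since D6=0 would now select the bit stream.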