Best emulator approach for the long-run: ROM metadata database? Heuristics? Both?

Discuss emulation of the Nintendo Entertainment System and Famicom.


merehap
Posts: 9
Joined: Mon Jan 17, 2022 8:35 am

Post by merehap »

I'm getting tired of writing hacks and heuristics in my emulator to determine the submapper and other NES2.0 metadata that isn't present in iNES ROMs. It decreases code clarity and increases bugginess. So I'm feeling drawn to using a ROM database so I can untangle the code for these conflated code paths. But using just a NES2.0 header database has its own drawbacks.

So what's the best way to handle the limitations of the old iNES format and the fact that most ROMs are in this format without NES2.0 metadata?

My understanding so far is that these are the advantages and disadvantages of each approach:

Heuristics/hacks only
+ No large upfront coding cost
+ Works on ROMs that aren't in any database
- Greatly decreased code quality (lower clarity, higher bugginess from lots of ad hoc branching and flags)
- Increased amount of time to code a given mapper
- No way to determine fully correct behavior in cases where the submapper can't be determined (e.g. VRC2 copy protection vs VRC2/4 WRAM vs absent WRAM)

ROM NES2.0 header database
+ Greatly simplifies some mappers (like VRC2/4 or mapper 1) by separating the code for different submappers
+ Unambiguous submapper behavior and restrictions for ROMs in the database; mappers are never in an unknown state
+ Can identify/run ROMs even when they have inaccurate header metadata
+ Individual mappers are quicker to write / less time writing branches, flags and hacks
- Higher upfront cost for coding up the database feature.
- Some ROMs can't be run properly since they aren't in the database and so there is no way to determine what submapper they belong to
Nestopia seems to mostly use this approach.

ROM NES2.0 header database, but fallback to heuristics when a ROM isn't in the DB
+ Most of the same benefits of header DB with no fallback: Simple mapper code/no ambiguous mapper states/bad ROM metadata is ignored
+ Can use heuristics for ROMs that are not in the DB, so all ROMs can be run (the other options can't run all ROMs in practice)
- Higher upfront cost for coding up the database feature
- Same poor code clarity/bugginess as the heuristics-only option: the hacks can't be removed since they are needed for the fallback
- Same increased amount of time to code a given mapper as the heuristics-only option (due to keeping flags/branches/hacks)
Mesen seems to use this approach, and allows you to ignore the DB if you wish.

What do you all think is the solution for writing the best emulator in the long run? Have I missed some advantages and disadvantages to the different approaches?

Follow-up question: what's the best open source-licensed ROM header database to use? Nestopia's (which is presumably GPL)?
Dwedit
Posts: 4911
Joined: Fri Nov 19, 2004 7:35 pm

Post by Dwedit »

I've seen a few bad NES 2.0 headers that specify the wrong emulated system. So just because you have NES 2.0 doesn't mean the headers are necessarily good.
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!
Drag
Posts: 1609
Joined: Mon Sep 27, 2004 2:57 pm

Post by Drag »

Figuring out the correct NES 2.0 header for a given ROM image should be the responsibility of ROM validation tools, and not the emulator's responsibility.

The emulator should simply use the NES 2.0 header in the ROM file, falling back to the older iNES header (and mapper specification) when not available.
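
For illustration, here is a minimal Python sketch of the "use NES 2.0 if present, otherwise fall back to iNES" rule described above. The bit layout (NES 2.0 flagged by bits 2-3 of byte 7, submapper in the high nibble of byte 8) follows the NES 2.0 header format; the `CartridgeInfo` type and `parse_header` function are hypothetical names invented for this example, not taken from any emulator.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CartridgeInfo:
    """Hypothetical container for the fields discussed in this thread."""
    mapper: int
    submapper: Optional[int]  # None when the header cannot express it (plain iNES)
    is_nes2: bool


def parse_header(header: bytes) -> CartridgeInfo:
    """Read mapper/submapper from a 16-byte iNES or NES 2.0 header."""
    if len(header) < 16 or header[0:4] != b"NES\x1a":
        raise ValueError("not an iNES/NES 2.0 file")

    # NES 2.0 is identified by bits 2-3 of byte 7 being 0b10.
    is_nes2 = (header[7] & 0x0C) == 0x08

    # Mapper bits 0-3 live in the high nibble of byte 6,
    # mapper bits 4-7 in the high nibble of byte 7.
    mapper = (header[6] >> 4) | (header[7] & 0xF0)

    if is_nes2:
        # NES 2.0 adds mapper bits 8-11 (low nibble of byte 8)
        # and the submapper (high nibble of byte 8).
        mapper |= (header[8] & 0x0F) << 8
        return CartridgeInfo(mapper, header[8] >> 4, True)

    # Plain iNES: the submapper is simply unknown.
    return CartridgeInfo(mapper, None, False)
```

Returning `None` rather than 0 for the iNES case keeps "not declared" distinct from "declared as submapper 0", which is the distinction the rest of this thread turns on.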
NewRisingSun
Posts: 1506
Joined: Thu May 19, 2005 11:30 am

Post by NewRisingSun »

merehap wrote: Tue Jan 30, 2024 8:34 pm Follow-up question: what's the best open source-licensed ROM header database to use? Nestopia's (which is presumably GPL)?
NES 2.0 XML Database
tepples
Posts: 22702
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

[This may make slightly less sense in light of a post that was deleted.]

In the case of NES 2.0, there is one correct header for each combination of cartridge board behavior and memory sizes, and one cartridge board for each set of PRG ROM and CHR ROM contents. Thus whether it's even possible to enforce copyright in a ROM database depends on the country. Some countries recognize a sui generis "database right" or apply a "sweat of the brow" standard to a work. Others, such as the United States and Brazil, have statutory or case law stating that when there's no room for creativity in selection, arrangement, or presentation of the data, there's no room for copyright in a database. See, for example, the opinion of the Supreme Court of the United States in Feist Publications, Inc., v. Rural Telephone Service Co., 499 U.S. 340 (1991).
merehap
Posts: 9
Joined: Mon Jan 17, 2022 8:35 am

Post by merehap »

Drag wrote: Thu Feb 01, 2024 11:23 am Figuring out the correct NES 2.0 header for a given ROM image should be the responsibility of ROM validation tools, and not the emulator's responsibility.

The emulator should simply use the NES 2.0 header in the ROM file, falling back to the older iNES header (and mapper specification) when not available.
Does this mean Nestopia and others that use a DB for the emulator itself are actively hurting the ecosystem? What practical harm results from an emulator referencing a header DB? Enabling sloppy ROM-creators? I assume you are referring to some ill beyond abstract uncleanliness / lack of separation of concerns.
Drag
Posts: 1609
Joined: Mon Sep 27, 2004 2:57 pm

Post by Drag »

merehap wrote: Sat Feb 03, 2024 9:31 pm Does this mean Nestopia and others that use a DB for the emulator itself are actively hurting the ecosystem? What practical harm results from an emulator referencing a header DB? Enabling sloppy ROM-creators? I assume you are referring to some ill beyond abstract uncleanliness / lack of separation of concerns.
If emulators need to outsource important data which is supposed to be in the ROM file, then ROM files no longer contain everything necessary for an emulator to configure itself to run its contents. The file is now a suggestion instead of a declaration, and I don't think that's good.

If an emulator author would like to autocorrect the files that are opened, then they're welcome to do so, but it won't solve the core problem of "there are ROM files with unsatisfactory headers", only work around it while it continues.


This is not "all databases are harmful", and this is not "emulators that validate against a database are harmful".
NewRisingSun
Posts: 1506
Joined: Thu May 19, 2005 11:30 am

Post by NewRisingSun »

One way to solve the dilemma would be that, in the event of a database-header mismatch, the emulator does not silently correct, but shows a nag window offering to let the user adjust the ROM file according to the database.
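
A rough Python sketch of how such a mismatch check might look, assuming both the file's header and the database entry have already been reduced to plain dictionaries. The field names and the `header_mismatches` function are hypothetical. Fields the file leaves undeclared (`None`) are treated as missing rather than mismatched, so they would not trigger a nag.

```python
from typing import Dict, List, Optional, Tuple


def header_mismatches(
    file_meta: Dict[str, Optional[int]],
    db_meta: Dict[str, int],
) -> List[Tuple[str, int, int]]:
    """Return (field, value_in_file, value_in_db) for every real conflict.

    A field that the file does not declare (None or absent) is not a
    conflict; it is missing data that the database can extend.
    """
    mismatches = []
    for field, expected in db_meta.items():
        actual = file_meta.get(field)
        if actual is not None and actual != expected:
            mismatches.append((field, actual, expected))
    return mismatches
```

An emulator could show the nag window only when the returned list is non-empty, and list the conflicting fields in it.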
Dwedit
Posts: 4911
Joined: Fri Nov 19, 2004 7:35 pm

Post by Dwedit »

Nagging on every NES 1.0 header that doesn't indicate WRAM size would be a non-starter.
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!
merehap
Posts: 9
Joined: Mon Jan 17, 2022 8:35 am

Post by merehap »

Drag wrote: Sun Feb 04, 2024 3:14 am
merehap wrote: Sat Feb 03, 2024 9:31 pm Does this mean Nestopia and others that use a DB for the emulator itself are actively hurting the ecosystem? What practical harm results from an emulator referencing a header DB? Enabling sloppy ROM-creators? I assume you are referring to some ill beyond abstract uncleanliness / lack of separation of concerns.
If emulators need to outsource important data which is supposed to be in the ROM file, then ROM files no longer contain everything necessary for an emulator to configure itself to run its contents. The file is now a suggestion instead of a declaration, and I don't think that's good.

If an emulator author would like to autocorrect the files that are opened, then they're welcome to do so, but it won't solve the core problem of "there are ROM files with unsatisfactory headers", only work around it while it continues.


This is not "all databases are harmful", and this is not "emulators that validate against a database are harmful".
Honestly, "ROM files no longer contain[ing] everything necessary for an emulator to configure itself to run its contents" is the reason I made this post. There appears to be no sane way to implement mapper 210, for example, when a submapper isn't provided (Mesen's crazy implementation is what I'm going off of for my own implementation, but it clearly hasn't been fully tested since its nametable mirroring values for Namco 340 are out of order). VRC2/4 mappers are also bad, just not as bad.

Submapper numbers were presumably invented because the original mapper numbers weren't good enough for emulators to configure "[themselves] to run [ROM] contents", but most ROMs haven't been updated to include them.

I still want to treat each ROM as a declaration, I don't want to override any metadata. I just want to extend what is there when NES2.0 metadata is missing.

Apologies if I'm still missing your point and thanks for your patience.
NewRisingSun wrote: Sun Feb 04, 2024 7:16 am One way to solve the dilemma would be that in the event of a database-header mismatch, the emulator does not silently correct, but shows a nag window offering the user to adjust the ROM file according to the database.
I like this idea. But my goal here wasn't really to fix header mismatches, it was to fill in missing metadata (the submapper number, mostly). Extending the info available seems less controversial than overriding existing info.

So, synthesizing everything that I've learned in this thread, is the following the best solution?
1. Do not silently override the metadata of any ROM.
2. Show a nag prompt if header values in a ROM mismatch the ROM database. Provide an option to override, or even overwrite, the metadata present in the ROM.
3a. For iNES ROMs without NES2.0 headers, silently look up the submapper number in a header DB (probably look up other info like RAM size, but still do not override any iNES headers).
4a. Refuse to run a ROM whose submapper can't be determined (i.e. there are multiple possible submappers, but there's no NES2.0 header and no entry in the header DB).

Alternatively:
3b. Same as 3a, but show a nag prompt and only look up submapper if the user agrees. If they don't agree, then fall back to using hacks and heuristics.
4b. Succeed in running a ROM that the submapper can't be determined for by falling back to hacks and heuristics.

The advantage of 3a/4a is that all hacks and heuristics can be removed. Weird edge case bugs disappear, code quality and legibility skyrockets. The disadvantage is there may be some ROMs that are unplayable because they neither have NES2.0 headers nor are in the DB. But they could presumably be added to the DB.
The advantages and disadvantages of 3b/4b are the opposite of 3a/4a: more obscure ROMs are playable, but code quality and legibility suffer, and the buggy edge cases remain.
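
To make 3a/4a concrete, here is a minimal Python sketch of the resolution order. Everything in it is an assumption for illustration: the function and exception names are made up, the database is modelled as a plain mapping from a ROM hash to a submapper number, and `AMBIGUOUS_MAPPERS` contains only mapper 210 because that is the example named in this thread (a real list would be longer).

```python
from typing import Mapping, Optional

# Hypothetical: mappers that cannot be emulated correctly without a submapper.
AMBIGUOUS_MAPPERS = frozenset({210})


class UnresolvedSubmapper(Exception):
    """Raised when a ROM is rejected under rule 4a."""


def resolve_submapper(
    mapper: int,
    header_submapper: Optional[int],
    rom_hash: str,
    db: Mapping[str, int],
) -> int:
    # Rule 1: a value declared by the file is never silently overridden.
    if header_submapper is not None:
        return header_submapper

    # Rule 3a: silently extend missing metadata from the header DB.
    if rom_hash in db:
        return db[rom_hash]

    # Rule 4a: refuse to guess for mappers that need a submapper.
    if mapper in AMBIGUOUS_MAPPERS:
        raise UnresolvedSubmapper(
            f"mapper {mapper}: no NES 2.0 header and no database entry"
        )

    # Assumption: mappers without competing variants use submapper 0.
    return 0
```

Under 3b/4b, the `raise` would instead hand off to the heuristics, which is exactly the code that 3a/4a lets you delete.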

I'm leaning towards 3a/4a at this point. But I don't know for sure, because I don't know how many ROMs are floating around that lack a submapper number and whose submapper can't be easily determined. It does feel better to me to just submit new entries to the NES 2.0 XML Database as necessary rather than complicate everything code-wise to accommodate obscure ROMs.

Drag: do you still prefer 3b/4b at this point? Or some entirely DB-less option?
Last edited by merehap on Sun Feb 04, 2024 3:19 pm, edited 1 time in total.
Drag
Posts: 1609
Joined: Mon Sep 27, 2004 2:57 pm

Post by Drag »

I'm sorry, I think I was being harsh with my opinion. :P

The ideal situation would be for outdated ROMs to be taken out of circulation and replaced with updated ROMs with more accurate headers.

I agree with you that the emulator shouldn't have to do tricks to deal with ambiguous mappers from before NES 2.0. That aligns with my belief that the emulator "should" be allowed to trust the ROM file's header; these tricks mean we can't, and that's bad.

If those tricks can be replaced with a NES 2.0 header lookup based on a hash (or something), then I agree with you, that's a better way to deal with it versus relying on tricks, especially if the iNES mapper is particularly messy.
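
As a sketch of what a hash-based lookup could look like in Python: the key is computed over everything after the 16-byte header, so two copies of the same ROM with different (or wrong) headers map to the same entry. The choice of SHA-1, the function names, and the dictionary-shaped database are assumptions made for this example. It also ignores the optional 512-byte trainer, which a real implementation would need to decide how to handle.

```python
import hashlib
from typing import Mapping, Optional

HEADER_SIZE = 16  # iNES and NES 2.0 headers are both 16 bytes


def rom_key(file_bytes: bytes) -> str:
    """Hash the ROM contents only, so header edits do not change the key."""
    return hashlib.sha1(file_bytes[HEADER_SIZE:]).hexdigest()


def lookup_submapper(file_bytes: bytes, db: Mapping[str, int]) -> Optional[int]:
    """Return the submapper recorded for this ROM, or None if it is unknown."""
    return db.get(rom_key(file_bytes))
```

For example, if `db` was built from a file carrying a good NES 2.0 header, `lookup_submapper` would return the same value for an old iNES copy of the same game, since only the header bytes differ.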

If there are iNES mappers which are adequate without any tricks needed (although maybe not as accurate), then I don't see a reason to throw those away, and I think that's where I was getting hung up.

With that being said, I think it's OK for an emulator to flatly reject a ROM file if it's one of those problematic mappers and if it cannot be reconciled, because I wouldn't expect that to affect a large amount of ROMs out there, and those ROM files should really be replaced anyway.

TL;DR: You convinced me, I don't have objections anymore. :D
merehap
Posts: 9
Joined: Mon Jan 17, 2022 8:35 am

Post by merehap »

Thanks for the discussion, Drag! And thank you Dwedit, NewRisingSun, and tepples!

Looks like I'll be going with the 3a/4a approach, and it looks like that is the ideal approach for serious emulators.