Dragon Quest Disassembly

segaloco
Posts: 239
Joined: Fri Aug 25, 2023 11:56 am

Re: Dragon Quest Disassembly

Post by segaloco »

Sorry for the curt notification; I was still grumpy about something at the time. To be a bit more thorough: as far as I can tell, this is done with regard to addresses and text. There are still a few "magic numbers" that are not tied to their sources. For instance, NPC string IDs are still raw bytes rather than constants derived from where the members of the array actually land at build time. I intend to work on that in the second pass, while working in the Dragon Warrior stuff.

The crux of it all is the awk script in the util folder. Since ca65 doesn't support UTF-8 as far as I can tell, I went ahead and shunted that to an outside tool. I had to make a few odd choices to ensure the ROM still builds 1:1. For instance, there is one string in the strings file where someone inadvertently used a handakuten as a period. In Japanese, as I'm sure you know, both are a small circle, but one is added as an accent to the ha-hi-fu-he-ho kana to change them to pa-pi-pu-pe-po, whereas the other is used much like a conventional period. I could see someone making that mistake in a hurry. As a result, that one period is written with another symbol in the Japanese codepage that also looks like an empty circle, which is admittedly a kludge.

Another inconsistency is that there are two ways that accented kana are expressed in strings:

- In one approach, a special control suffix (in this case 0xF8 or 0xF9) is provided immediately following a kana needing decoration. The text engine then interprets this as meaning to apply a dakuten (0xF8) or handakuten (0xF9) to the immediately preceding kana.

- In the other approach, all accented hiragana characters are also provided in the "common strings" library, meaning you can trim them to one byte in most strings. This is done inconsistently through the codebase, but luckily the split falls cleanly along the lines where I separated files. To handle this at build time, there is a sed script that further converts the request for the single-byte version *to* the multi-byte one above if the file has the right extension. I used .j and .j2 (for Japanese text) but am not really married to an extension. I went with different extensions so I could use make(1)'s inference rules for .j -> .s and .j2 -> .s rather than some temp file or changing other rules to handle the awk filtering case by case.
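To make the first approach concrete, here is a minimal Python sketch of suffix-style encoding. It is not the awk script from the repository. The 0xF8 and 0xF9 suffix values come from the post above, but the `CODEPAGE` byte values and the `encode_suffixed` name are hypothetical placeholders. It leans on Unicode NFD normalization, which splits an accented kana into its base kana plus a combining dakuten (U+3099) or handakuten (U+309A).

```python
import unicodedata

DAKUTEN_SUFFIX = 0xF8     # from the post: apply dakuten to the preceding kana
HANDAKUTEN_SUFFIX = 0xF9  # from the post: apply handakuten to the preceding kana

# Hypothetical codepage: placeholder byte values, NOT the game's real table.
CODEPAGE = {"か": 0x10, "は": 0x20, "ひ": 0x21}


def encode_suffixed(text):
    """Encode kana as a base byte followed by an optional 0xF8/0xF9 suffix."""
    out = bytearray()
    # NFD turns e.g. "が" into "か" + U+3099 and "ぱ" into "は" + U+309A.
    for ch in unicodedata.normalize("NFD", text):
        if ch == "\u3099":    # combining dakuten
            out.append(DAKUTEN_SUFFIX)
        elif ch == "\u309a":  # combining handakuten
            out.append(HANDAKUTEN_SUFFIX)
        else:
            out.append(CODEPAGE[ch])
    return bytes(out)


print(encode_suffixed("がぱ").hex())
```

With the placeholder table, each accented kana costs two bytes in this form, which is the overhead the single-byte entries in the common strings library avoid.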

One reason they *may* have avoided the string versions of the accented hiragana is that, unlike the global strings library, the common strings library does *not* have the offset of each string precalculated and staged in a table. In other words, global strings have their offsets stashed in a table somewhere with even alignment, so if you want string 3, you go look at index 3 in the table and that tells you the word offset from the base of the string library.
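A rough Python sketch of that table-driven lookup, under assumptions that are mine rather than confirmed from the ROM: table entries are 16-bit little-endian (the usual 6502 convention), and each string ends with an 0xFF terminator. The `lookup_indexed` name and the sample bytes are made up for illustration.

```python
import struct


def lookup_indexed(table, library, index):
    """Fetch string `index` through a table of 16-bit offsets from the library base."""
    # One fixed-cost table read, however many strings precede this one.
    (start,) = struct.unpack_from("<H", table, index * 2)
    end = library.index(0xFF, start)  # assumed 0xFF terminator
    return library[start:end]


# Made-up data: three strings starting at offsets 0, 3 and 5.
library = bytes([0x01, 0x02, 0xFF, 0x03, 0xFF, 0x04, 0x05, 0x06, 0xFF])
table = struct.pack("<3H", 0, 3, 5)
print(lookup_indexed(table, library, 2).hex())
```

The point is that finding the start of a string costs the same no matter where it sits in the library.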

The common strings, by contrast, are an 0xFF-delimited list. Every time a string index is passed, I'm pretty sure the code has to traverse the collection from the start until it reaches that entry, meaning your lookup time scales with how many strings come before it and how long they are. I'm still a bit fuzzy on this, but that's the gist I'm getting from the code now that it's all broken open. I haven't studied the text handling of the US release yet to determine how much similarity there is, but that was the area with the most differences when comparing against existing US Dragon Warrior analysis. I suspect there may be enough changes that I shunt the US and Japanese text handlers into separate conditionally included files rather than putting little ifdefs in the middle of things, depending on how different they turn out to be.
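For comparison, a minimal Python sketch of what a lookup in an 0xFF-delimited list has to do. This illustrates the traversal as described here and is not a transcription of the game's routine. The `lookup_scanned` name and the sample bytes are hypothetical.

```python
def lookup_scanned(library, index):
    """Fetch string `index` from an 0xFF-delimited list by walking from the start."""
    start = 0
    # Skip `index` delimiters; the work grows with the bytes that come before.
    for _ in range(index):
        start = library.index(0xFF, start) + 1
    end = library.index(0xFF, start)
    return library[start:end]


# Made-up data: three strings of lengths 2, 1 and 3.
common = bytes([0x01, 0x02, 0xFF, 0x03, 0xFF, 0x04, 0x05, 0x06, 0xFF])
print(lookup_scanned(common, 2).hex())
```

Strings near the end of the list are the most expensive to reach, which would be a reason to keep frequently used entries near the front.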

Either way, if you find yourself using this for anything and have any questions, I'm happy to illuminate the still shadowy parts. This has been a couple of years in the making; it's satisfying to finally have what I believe is complete labeling and, aside from CNROM bank considerations, full relocatability of code and a high degree of relocatability of data without having to manually adjust pointers and indices all over the place.

The trickiest thing to nail down to an assembler/linker-generated value rather than bytes right in the code is that common string library. Since the strings are variable length and there is no offset table, there's no consistent spacing I could use for a constant that reduces down to each string index. I'm still puzzling over the best way to make the string IDs derive from their labels in code rather than just knowing which one is the first, the second, the third, etc. in the list.
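One possible direction, offered only as a sketch and not as anything that exists in the repository: since each ID is just a position in the list, a build-time step could read the labels in file order and emit numbered constants for inclusion. The `emit_string_ids` function, the `_ID` suffix, and the label names below are all hypothetical. The output uses the plain `name = value` assignment form that ca65 accepts.

```python
def emit_string_ids(labels):
    """Return constant assignments, one per label, numbered in list order."""
    return [f"{label.upper()}_ID = {index}" for index, label in enumerate(labels)]


# Hypothetical labels, in the order their strings appear in the library.
for line in emit_string_ids(["str_yes", "str_no", "str_gold"]):
    print(line)
```

The trade-off is one more generated file in the build, but reordering or inserting strings would then renumber every ID automatically.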

Anywho, I added a section to the readme about how the text works, at least the code generation parts of it, so that should illuminate the reasoning behind anything you actually see in strings like control characters.
Pokun
Posts: 2651
Joined: Tue May 28, 2013 5:49 am
Location: Hokkaido, Japan

Re: Dragon Quest Disassembly

Post by Pokun »

Yeah, I remember the game also had two ways to display dakuten/handakuten on screen: as a separate character when entering the hero's name, and as diacritical marks placed between the rows in dialog (so that only every other row could be used for the text).

It's nice that you bothered implementing a system for handling strings in kana; it makes them much easier to read (provided you can fluently read kana) and search. Especially since, as you said, the text portions are where DQ and DW differ the most and deserve to be documented thoroughly.
It's too bad ca65 doesn't support UTF-8 though. That's another thing I like about 64tass, along with the fact that it allows defining things like control characters for use in strings, but ca65 is the more popular assembler for NES.

Definitely a fine job, as deserved for one of the most important games in gaming history. I think DQ1 has an SMB1 level of importance since it basically gave birth to the whole JRPG genre by merging Wizardry and Ultima and also popularized the genre beyond its nerdy roots.
segaloco
Posts: 239
Joined: Fri Aug 25, 2023 11:56 am

Re: Dragon Quest Disassembly

Post by segaloco »

I appreciate the comparison to SMB1, and I agree; that's largely why I chose Dragon Quest for examining an RPG engine. It's a genre-defining title that spawned not only its own legacy but innumerable other series and titles. Its importance to the proliferation of RPGs on home consoles cannot be overstated.

My main reasoning behind using the cc65 suite is that it closely resembles the standard UNIX programming environment, so the practices I use in other areas meld nicely without my having to learn a whole bunch of different stuff just for my 6502-focused things. That also makes choices like this one more exportable: I could theoretically use this same awk(1) script with GNU as out of the box, since .byte is also a directive over there. The sed script would need touchup since .dbyt is not. An alternative fix would be to change my constants so that the two-byte characters are expressed in reverse order; then I could use .word. Still, that's a lot less rewrite than if I had relied on the intrinsic UTF-8 of some other toolkit. Now I've got a home-grown solution that makes that not matter at all.
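To illustrate the .dbyt versus .word point, here is a small Python sketch. ca65's .dbyt writes a 16-bit value high byte first, while .word writes it low byte first, so swapping the two bytes of the constant makes the little-endian directive produce the same output. The helper names and the 0x10F8 sample value are hypothetical, and the sketch assumes a little-endian target for .word.

```python
import struct


def swap16(value):
    """Swap the high and low bytes of a 16-bit value."""
    return ((value & 0xFF) << 8) | (value >> 8)


def emit_dbyt(value):
    """High byte first, as ca65's .dbyt does."""
    return struct.pack(">H", value)


def emit_word(value):
    """Low byte first, as .word does on a little-endian target."""
    return struct.pack("<H", value)


pair = 0x10F8  # hypothetical kana byte followed by the 0xF8 dakuten suffix
print(emit_dbyt(pair).hex(), emit_word(swap16(pair)).hex())
```

Both calls emit the same two bytes in the same order, so the ROM contents would not change; only the way the constants are written in the source would.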