Possible Disassembler Release (seeing if anyone cares)

Celius
Posts: 2159
Joined: Sun Jun 05, 2005 2:04 pm
Location: Minneapolis, Minnesota, United States

Post by Celius »

So a couple days ago, I wanted to do a disassembly of Castlevania, and I really didn't want to dink around with Tracer and all the command line crap, so believe it or not I started disassembling it "by hand" (I looked at each instruction in a hex editor and translated it myself). Then I realized that was a stupid waste of time, so I decided I wanted to make a disassembler to fit my needs with Visual Basic.

I also realized that the best way to go about disassembling a game is not to have a program do it all for you. Really only humans can tell what is data and what is not, so I thought a sort of disassemble-in-chunks approach would be best. I think that you should go through and logically determine what is data and what is code. For example, I noticed there was a table in CV where the index was ANDed with $0F before being used. This leads me to believe that the table is 16 bytes long, and none of it is code. A good disassembler (in my opinion) can help you in this process, though not necessarily do it for you.

So here's a quick kind of buggy version of what I'm going for:

http://www.freewebs.com/the_bott/DisasmTest.rar

Hopefully that doesn't require anything special to run. I'm kind of a newb when it comes to programming things like this. It was made in VB, so I assume it runs with Windows only.

Basically here's what happens. You specify the path of the NES ROM, where in the file you want to start disassembling from, what the PC would be at that location, and you press "Next" to disassemble the next line(s). In a textbox in the bottom left corner, you specify how many lines to disassemble. The Readme explains a little more. Oh, but you have to press "Refresh" if you change the starting PC or the File Pointer (I really apologize for the really sloppy programming on my part).
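To make the file pointer / PC / line count idea concrete, here is a minimal Python sketch of one "Next" press. It is not the VB program's actual code: the function name `disassemble` and the tiny `OPCODES` table (six opcodes only) are assumptions for illustration, and unknown bytes are simply emitted with a `.db` directive.

```python
# Hypothetical subset of the 6502 opcode table: (mnemonic, operand format, size).
OPCODES = {
    0xA9: ("LDA", "#${:02X}", 2),  # immediate
    0xAD: ("LDA", "${:04X}", 3),   # absolute
    0x8D: ("STA", "${:04X}", 3),   # absolute
    0x29: ("AND", "#${:02X}", 2),  # immediate
    0x4C: ("JMP", "${:04X}", 3),   # absolute
    0x60: ("RTS", "", 1),          # implied
}

def disassemble(rom, file_pointer, pc, line_count):
    """Disassemble up to line_count lines; return (lines, next_pointer, next_pc)."""
    lines = []
    for _ in range(line_count):
        if file_pointer >= len(rom):
            break
        opcode = rom[file_pointer]
        entry = OPCODES.get(opcode)
        if entry is None or file_pointer + entry[2] > len(rom):
            # Unknown opcode or truncated operand: emit the byte as data.
            text, size = ".db ${:02X}".format(opcode), 1
        else:
            mnemonic, fmt, size = entry
            # Operands are stored little-endian after the opcode byte.
            operand = int.from_bytes(rom[file_pointer + 1:file_pointer + size], "little")
            text = (mnemonic + " " + fmt.format(operand)).strip()
        lines.append("${:04X}: {}".format(pc, text))
        file_pointer += size
        pc = (pc + size) & 0xFFFF
    return lines, file_pointer, pc
```

Returning the updated file pointer and PC lets the caller feed them straight back in for the next chunk, which roughly corresponds to pressing "Next" again.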

There are multiple disassembly windows. One outputs 6502 code in standard syntax: just plain old should-work-in-every-assembler output. The next outputs all of that with the PC placed on every line (just for information, not meant to be assembled directly). The next outputs it all as data. The readme describes a little more.
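As a rough illustration of the three windows, the sketch below renders the same instructions three ways. The record layout `(pc, raw_bytes, text)`, the function name `render_views`, and the `.db` directive are all assumptions for the example, not the program's real internals.

```python
def render_views(records):
    """records: list of (pc, raw_bytes, text) tuples (hypothetical layout)."""
    # View 1: plain assembler-ready text.
    plain = [text for _, _, text in records]
    # View 2: the same text with the PC shown on every line (informational only).
    with_pc = ["${:04X}  {}".format(pc, text) for pc, _, text in records]
    # View 3: everything dumped as data bytes.
    as_data = [".db " + ",".join("${:02X}".format(b) for b in raw)
               for _, raw, _ in records]
    return plain, with_pc, as_data
```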

So I'd appreciate it if you guys checked it out and said whether or not you think it's useful/should be continued with expectations of releasing it. If it's worthwhile, I'd add support for more assembler-specific needs. For example, I'd have it disassemble as lda [$xx],y for NESASM, stx $xxxx.w for WLA-DX, etc. if you tell it which assembler you use (assuming the program supports it). I'd also add some obvious stuff (saving features, no stupid bugs, etc.)
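A small sketch of how per-assembler operand formatting might look, using only the two examples mentioned above (square brackets for NESASM indirect addressing, a `.w` suffix for WLA-DX absolute addressing). The function name, the mode names, and the assembler keys are hypothetical.

```python
def format_operand(mode, value, assembler="generic"):
    """Format an operand for a given assembler dialect (illustrative only)."""
    if mode == "indirect_y":
        if assembler == "nesasm":
            # NESASM-style square brackets, per the example in the post.
            return "[${:02X}],y".format(value)
        return "(${:02X}),y".format(value)
    if mode == "absolute":
        text = "${:04X}".format(value)
        if assembler == "wla-dx":
            # WLA-DX-style explicit 16-bit suffix, per the example in the post.
            text += ".w"
        return text
    raise ValueError("unsupported addressing mode: " + mode)
```

For example, `format_operand("absolute", 0x00FF, "wla-dx")` gives `$00FF.w`, which keeps a low address from being shortened to zero page.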

EDIT: Also please let me know if for some reason you can't get it to work. I'd like to know so I could take care of it.
ugetab
Posts: 335
Joined: Sat Oct 29, 2005 12:03 am

Post by ugetab »

Why not mix this with FCEUXD SP logging files? I believe they differentiate between data and code on access (0x01/0x02, 0x03 for both?). If nothing else, it should benefit you with a bit of automation depending on how much logging you're willing to do.
NSFs I've ripped:
http://www.angelfire.com/nc/ugetab/

A Searchable list of NSFs from other sites. In Internet Explorer, go to Edit>Find (on This Page)...
http://www.angelfire.com/nc/ugetab/NSFList.txt
Celius
Posts: 2159
Joined: Sun Jun 05, 2005 2:04 pm
Location: Minneapolis, Minnesota, United States

Post by Celius »

Hmm, I'm not quite sure about the file formats for those yet, but I think that would be a good idea. One thing that would be a piece of cake (though it wouldn't really compare to what the data logger does) is to make a list of all of the addresses in a certain range that are read. That would only cover a small number of instructions, though, like LDA $Absolute. The reason I would consider this is if you don't want to play through the entire game to see exactly what's data and what's not (you'd still probably get a big chunk of data). But if you do, then I could definitely combine this with data logs.
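The "list of addresses that are read" idea could be sketched like this. It assumes the instructions have already been decoded into `(opcode, operand)` pairs, and it only looks at the three absolute load opcodes (LDA $AD, LDX $AE, LDY $AC); the function and table names are made up for the example.

```python
# Absolute-mode load opcodes on the 6502.
ABSOLUTE_LOADS = {0xAD: "LDA", 0xAE: "LDX", 0xAC: "LDY"}

def absolute_read_targets(instructions, low, high):
    """Collect operand addresses of absolute loads that fall in [low, high]."""
    targets = set()
    for opcode, operand in instructions:
        if opcode in ABSOLUTE_LOADS and low <= operand <= high:
            targets.add(operand)
    return sorted(targets)
```

Indexed and indirect reads are ignored here, which is exactly the limitation mentioned above: only a small share of data accesses would be found this way.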

I think (maybe I'm the only one, but who knows) that data logs + assembler-friendly disassembly with label output (that would require 1 complete regular disassembly) could make this a very useful disassembler. The main thing I want this to have is assembler friendly output. Though for some reason, WLA keeps telling me on BPL $FB that it's wrong, because it's out of 8-bit range (Stupid).
tepples
Posts: 22345
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

Assemblers treat "BPL $FB" as "BPL $00FB", an address in page $00. Unless you're doing something tricky such as generating a multiplier at runtime that uses the top of zero page and/or the bottom of stack, this is almost certainly out of range. If something is PC-relative, you might need to use the label *, which in several assemblers refers to the program counter. Does "BPL *-$05" assemble to what you expect? Otherwise, you can put a label with an arbitrary name on any instruction to which a branch refers.
Celius
Posts: 2159
Joined: Sun Jun 05, 2005 2:04 pm
Location: Minneapolis, Minnesota, United States

Post by Celius »

The assembler says that the value $FB is out of 8-bit range. That error is different from the one for a branch distance being too large, which says "fix references; branch distance is too large (173 bytes)" or something like that. This one implies that I'm giving it a number that is bigger than 8 bits, so it does see it as a literal branch operand. All assemblers should handle standard 6502 syntax. If I say:

LDA $2002
BPL $FB

That means I'm waiting for a Vblank to pass. BPL is a 2 byte instruction, where the number following is an 8-bit signed number. So there should be no problem giving it a literal definition.

Now that I've tried it, it turns out that for this to work without a label, I have to give the value in decimal instead of hexadecimal. So saying:

LDA $2002
BPL -5

Works out just fine. I'll have to apply that to my disassembler. Though I think using labels in the future will make for more understandable disassembly.
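For reference, here is a small Python sketch of the arithmetic a disassembler would need for either style. On the 6502 the branch operand is a signed byte added to the address of the instruction after the branch. The function names are hypothetical, and `star_relative` assumes `*` means the address of the branch instruction itself, which may differ between assemblers.

```python
def signed_offset(offset_byte):
    """Interpret a branch operand byte as a signed 8-bit value."""
    return offset_byte - 0x100 if offset_byte >= 0x80 else offset_byte

def branch_target(branch_pc, offset_byte):
    """Target address: the offset is relative to the instruction after the branch."""
    return (branch_pc + 2 + signed_offset(offset_byte)) & 0xFFFF

def star_relative(offset_byte):
    """Displacement from the branch's own address (assumes '*' is that address)."""
    return signed_offset(offset_byte) + 2
```

With LDA $2002 at $8000 and the BPL at $8003, operand $FB is -5, the target is $8000, and the displacement measured from the branch's own address is -3.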
Celius
Posts: 2159
Joined: Sun Jun 05, 2005 2:04 pm
Location: Minneapolis, Minnesota, United States

Post by Celius »

Okay, I have an updated version which works with FCEUXD CDL files :) .

http://www.freewebs.com/the_bott/DisasmTest2.rar


Also, I cleaned it up a bit so it's more presentable. I've included the CDL file and a header-stripped ROM that goes along with it. If you want to check it out, just change the file paths to point to the files I included, hit "refresh", and disassemble for a little bit. You'll see that it can tell a lot of the code apart from the data.

I've also added the option to disassemble unaccessed data ($00s in the CDL file) as either data or code. The next thing this needs is a little less of a duct-taped user interface, and some assembler-specific support. That will be easier to implement than labeling.
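A minimal sketch of how CDL flags might drive the code/data split, including the option for unaccessed bytes. It assumes one CDL byte per ROM byte with bit 0x01 meaning "executed as code" and bit 0x02 meaning "read as data", as suggested earlier in the thread; treating bytes flagged as both as code is a choice made only for this example, and the function names are hypothetical.

```python
def classify(cdl_byte, unaccessed_as="data"):
    """Classify one ROM byte from its CDL flag (assumed bit layout)."""
    if cdl_byte & 0x01:
        return "code"          # executed (bytes flagged as both count as code here)
    if cdl_byte & 0x02:
        return "data"          # only ever read as data
    return unaccessed_as       # $00 in the log: never touched, user decides

def split_runs(cdl, unaccessed_as="data"):
    """Group consecutive bytes of the same kind into (kind, offset, length) runs."""
    runs = []
    for offset, flag in enumerate(cdl):
        kind = classify(flag, unaccessed_as)
        if runs and runs[-1][0] == kind:
            runs[-1][2] += 1
        else:
            runs.append([kind, offset, 1])
    return [tuple(run) for run in runs]
```

Each run can then be handed to either the instruction decoder or the data formatter, so the log does the first pass and the human only has to look at the unaccessed parts.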