Assembly Refactoring Tools?

segaloco
Posts: 568
Joined: Fri Aug 25, 2023 11:56 am

Assembly Refactoring Tools?

Post by segaloco »

Is anyone aware of any refactoring tools specifically tailored towards assembly? 6502 in particular, and ca65 syntax in very particular, but I'm interested in others too, just to see what precedent there is for tools like this.

The reason I ask is that I've written a few shell scripts and other bits that I use in my disassembly process to rapidly turn raw bytes into labels and equates, for the purpose of generating memory maps from disassemblies. Basically it's a two-pass thing: first, find every binary token that represents a memory reference (zero page and absolute operands, but not immediates); then, once the list of such tokens is determined, spit out a mapping and replace all those tokens with symbolic names.
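
A rough sketch of that two-pass idea in Python, for illustration only. It assumes ca65-style text where hex operands look like `$03` or `$0400`, comments start with `;`, and directive lines start with `.`. The `zp_`/`abs_` naming scheme and all function names are made up here, not taken from the actual scripts.

```python
import re

# "$" plus 2 or 4 hex digits, not preceded by "#", "$" or a word character,
# so immediates such as "#$10" are skipped on purpose.
MEM_REF = re.compile(r"(?<![#\w$])\$([0-9A-Fa-f]{4}|[0-9A-Fa-f]{2})\b")

def split_comment(line):
    """Split a source line into (code, comment) at the first ';'."""
    code, sep, comment = line.partition(";")
    return code, sep + comment

def is_instruction(code):
    """Skip blank lines and directive lines such as .byte (data, not references)."""
    stripped = code.strip()
    return bool(stripped) and not stripped.startswith(".")

def collect_refs(lines):
    """Pass 1: gather every distinct memory-reference token."""
    refs = set()
    for line in lines:
        code, _ = split_comment(line)
        if is_instruction(code):
            refs.update(m.group(1).lower() for m in MEM_REF.finditer(code))
    return refs

def make_names(refs):
    """Invent a placeholder name per address (hypothetical naming scheme)."""
    return {r: ("zp_" if len(r) == 2 else "abs_") + r for r in sorted(refs)}

def apply_names(lines, names):
    """Pass 2: emit the equates, then the source with tokens replaced."""
    out = [f"{name} = ${addr}" for addr, name in names.items()]
    for line in lines:
        code, comment = split_comment(line)
        if is_instruction(code):
            code = MEM_REF.sub(lambda m: names[m.group(1).lower()], code)
        out.append(code + comment)
    return out
```

Calling `apply_names(src, make_names(collect_refs(src)))` gives the equate block followed by the rewritten source. Comments are left untouched, and a directive that shares a line with a label is not handled by this sketch.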

Granted, it doesn't catch everything: you can only tell from the surrounding code when a set of immediates being written to adjacent memory locations constitutes a pointer, or, heck, when a single-byte immediate is being used to put a zero page reference somewhere (e.g. lda $00,x where x was previously loaded with the address of a zero page variable).
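
One narrow heuristic for the pointer case could look like the sketch below. It only recognises the exact four-instruction shape `lda #lo / sta addr / lda #hi / sta addr+1` with hex operands, which is an assumption made for illustration. Anything it reports is a candidate for a human to review, not proof that a pointer is being built.

```python
import re

LDA_IMM = re.compile(r"^\s*lda\s+#\$([0-9a-f]{2})\s*$", re.I)
STA_MEM = re.compile(r"^\s*sta\s+\$([0-9a-f]{2,4})\s*$", re.I)

def find_pointer_candidates(lines):
    """Return [(destination, value)] for immediates stored to adjacent bytes.

    Assumes little-endian order: low byte first, high byte at destination + 1.
    """
    found = []
    for i in range(len(lines) - 3):
        lo = LDA_IMM.match(lines[i])
        st_lo = STA_MEM.match(lines[i + 1])
        hi = LDA_IMM.match(lines[i + 2])
        st_hi = STA_MEM.match(lines[i + 3])
        if not (lo and st_lo and hi and st_hi):
            continue
        dest = int(st_lo.group(1), 16)
        if int(st_hi.group(1), 16) != dest + 1:
            continue  # the two stores are not adjacent
        value = (int(hi.group(1), 16) << 8) | int(lo.group(1), 16)
        found.append((dest, value))
    return found
```

Real code often interleaves other instructions or uses `ldx`/`ldy` for one half, so a sketch like this misses plenty and will also flag pairs that are just two unrelated bytes.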

Either way, thoughts? Is anyone aware of such tools? This is one of the few areas of my analytical process where I still think back fondly on some of the functionality in graphical tools like IDA Pro and Ghidra. Filling this gap effectively would, I think, round out my understanding of how to pull off pretty much everything I did in those tools without once touching a graphical desktop environment.
t1lt
Posts: 50
Joined: Sun Oct 01, 2023 7:40 am

Re: Assembly Refactoring Tools?

Post by t1lt »

Dang, I would love some better tools for this myself. I was just battling with dasmx this very day! :oops:
Oziphantom
Posts: 1918
Joined: Tue Feb 07, 2017 2:03 am

Re: Assembly Refactoring Tools?

Post by Oziphantom »

The only refactoring tools I know of for 6502 are the "move code" commands in monitors. Basically, asm is not really refactorable; it's more a case of "rewrite it". There is no "just rename this variable in all the code" other than find and replace, because there is no scope in asm, so every label will be unique. The exception is something like 64tass, where you start to use scope, but historically that is a very modern problem and only about 3 assemblers in the world support it. Even then, it will mostly be contained to a single file, and there won't really be enough usage of said scope for stepping through replace/next by hand to be too much of a burden.

I have thought that "pull this code into a function" might be nice, i.e. make a selection and it cuts the code, invents a label, puts jsr <said label> in its place, and then moves up or down the file to find a spot after an RTS to put said code. But even then, it will be wrong 25% of the time and doesn't really save me much.
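
As a minimal sketch of what such a "pull into a function" step might look like on a list of source lines (Python, hypothetical function name, searching downwards only): it does no analysis at all, which is exactly where the wrong cases come from. Branches that leave the selection, stack use, and flags expected by the code after the call are all ignored.

```python
def extract_subroutine(lines, start, end, label):
    """Move lines[start:end] into a new routine called `label`.

    The selection is replaced by `jsr label`. The routine is placed after
    the first bare `rts` that follows the call, or at the end of the file
    if there is none.
    """
    body = lines[start:end]
    rest = lines[:start] + ["    jsr " + label] + lines[end:]
    insert_at = len(rest)
    for i in range(start + 1, len(rest)):
        # Ignore any trailing comment when looking for the rts.
        if rest[i].split(";")[0].strip().lower() == "rts":
            insert_at = i + 1
            break
    routine = [label + ":"] + body + ["    rts"]
    return rest[:insert_at] + routine + rest[insert_at:]
```

For example, extracting the two lines after `main:` below puts the new routine between the end of `main` and `other:`.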

It seems what you want is a static Regenerator, but on the command line, so that it gives you labels of the form aXX/XX pXX/XX fXX/XXX sXXXX. Does da65 not give you this? Probably be faster to add it to da65 than make external tools. Alternatively, as you work through your code you could use a similar Hungarian notation or type-last notation, and then assemble the file and get the listing. Again, using ca65 is hurting you here, as ca65 won't give you a full listing because of the linker, while using 64tass would make life easier by giving you a full in-place listing.

Code:

;Line	;Offset	;Hex		;Monitor	;Source
6523	.2a67					buildReverseTable
6524	.2a67	a2 fe		ldx #$fe		ldx #$fe
6525	.2a69	a0 00		ldy #$00		ldy #0
6526	.2a6b	8a		txa		-	txa
6527	.2a6c	84 03		sty $03			sty ZPTemp1
6528	.2a6e	0a		asl a			asl a
6529	.2a6f	66 03		ror $03			ror ZPTemp1
6530	.2a71	0a		asl a			asl a
6531	.2a72	66 03		ror $03			ror ZPTemp1
6532	.2a74	0a		asl a			asl a
6533	.2a75	66 03		ror $03			ror ZPTemp1
6534	.2a77	0a		asl a			asl a
6535	.2a78	66 03		ror $03			ror ZPTemp1
6536	.2a7a	0a		asl a			asl a
6537	.2a7b	66 03		ror $03			ror ZPTemp1
6538	.2a7d	0a		asl a			asl a
6539	.2a7e	66 03		ror $03			ror ZPTemp1
6540	.2a80	0a		asl a			asl a
6541	.2a81	66 03		ror $03			ror ZPTemp1
6542	.2a83	0a		asl a			asl a
6543	.2a84	66 03		ror $03			ror ZPTemp1
6544	.2a86	a5 03		lda $03			lda ZPTemp1
6545	.2a88	9d 00 04	sta $0400,x		sta ByteReverseTable,x
6546	.2a8b	ca		dex			dex
6547	.2a8c	d0 dd		bne $2a6b		bne -
6548	.2a8e	a2 ff		ldx #$ff		ldx #$ff
6549	.2a90	8e ff 04	stx $04ff		stx ByteReverseTable+$ff
6550	.2a93	e8		inx			inx
6551	.2a94	8e 00 04	stx $0400		stx ByteReverseTable
6552	.2a97	60		rts			rts
This allows you to search for a label and then look up its raw address, the addressing mode/instruction, and even the source line that references each thing, in a nice simple single pass with a single source of truth. You could also annotate the source code with .section directives. Assembling it is always possible, because 64tass doesn't have the "assembler needs an explicit point to handle random data, but shouldn't set an address because that is the linker's problem" issue of an assembler/linker combo. The output will then give you a broad memory map showing the type and name of each region.
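
To illustrate the "search for a label, get address and instruction" idea, here is a small Python sketch that indexes a listing like the one above. It assumes the columns are tab-separated exactly as shown (line, offset, hex, monitor, source). The function and pattern names are invented for this example, and rows without hex bytes are simply skipped.

```python
import re

# One instruction row: line number, ".addr", hex bytes, then monitor and source.
ROW = re.compile(
    r"^(?P<line>\d+)\t+\.(?P<addr>[0-9a-f]{4})\t+"
    r"(?P<hex>[0-9a-f]{2}(?: [0-9a-f]{2})*)\t+(?P<rest>.*)$"
)
# An identifier that is not part of a "$xx" hex literal.
SYMBOL = re.compile(r"(?<![$\w])[A-Za-z_]\w*")
REGISTERS = {"a", "x", "y"}

def index_listing(text):
    """Return {symbol: [(source_line, address, monitor_text), ...]}."""
    index = {}
    for raw in text.splitlines():
        m = ROW.match(raw)
        if not m:
            continue  # header, label-only or other rows without hex bytes
        cols = re.split(r"\t+", m.group("rest").strip())
        if len(cols) < 2:
            continue  # no separate source column, e.g. a data row
        monitor = cols[0]
        tokens = " ".join(cols[1:]).split(";")[0].split()
        lowered = [t.lower() for t in tokens]
        mnemonic = monitor.split()[0].lower()
        if mnemonic not in lowered:
            continue  # source does not use the plain mnemonic (e.g. a macro)
        operand = " ".join(tokens[lowered.index(mnemonic) + 1:])
        for sym in SYMBOL.findall(operand):
            if sym.lower() not in REGISTERS:
                ref = (int(m.group("line")), int(m.group("addr"), 16), monitor)
                index.setdefault(sym, []).append(ref)
    return index
```

Looking up `ZPTemp1` in the result then yields every source line and address that touches it, together with the resolved monitor form such as `sty $03`, from which the addressing mode can be read off.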

Code:

Gap:         101   $0002-$0066   $0065    zp
Data:        470   $0801-$09d6   $01d6
Data:        597   $09d7-$0c2b   $0255    sMainGame
Data:        537   $0c2c-$0e44   $0219    sBirdFSM
Data:        799   $0e45-$1163   $031f    sDogFSM
Data:        193   $1164-$1224   $00c1    sShotFlashFSM
Data:        924   $1225-$15c0   $039c    sBirdSubFuncs
Data:       1121   $15c1-$1a21   $0461    sDogSubFuncs
Section:       0   $1a22         $0000    sClay
Data:        132   $1a22-$1aa5   $0084
Data:         50   $1aa6-$1ad7   $0032    sScreenFlash
Data:        251   $1ad8-$1bd2   $00fb    sScore
Data:       1091   $1bd3-$2015   $0443    sHUD
Data:        403   $2016-$21a8   $0193
Data:        638   $21a9-$2426   $027e    sIRQSplits
Data:        500   $2427-$261a   $01f4    sMisc
Data:        110   $261b-$2688   $006e    sRLE
Data:         62   $2689-$26c6   $003e    sBitmapClear
Data:        330   $26c7-$2810   $014a    sBitmapDepack
Data:        284   $2811-$292c   $011c    sDigi
Data:        362   $292d-$2a96   $016a    sMenu
Data:        101   $2a97-$2afb   $0065    sTitleScreen
Data:        215   $2afc-$2bd2   $00d7    data
Data:        474   $2bd3-$2dac   $01da    sDuckBitmapData
Gap:          83   $2dad-$2dff   $0053
Data:       2872   $2e00-$3937   $0b38
Gap:       18120   $3938-$7fff   $46c8
Data:          9   $8000-$8008   $0009
Data:       2700   $8009-$8a94   $0a8c    data2
Gap:        1387   $8a95-$8fff   $056b
Data:       8253   $9000-$b03c   $203d
Data:       2362   $b03d-$b976   $093a    data3
Passes:            5
Sadly, it doesn't give you Code vs Data in its type, so a naming convention as mentioned before would be required. But I should talk to Soci about it; since it has an internal code type, it would be nice to be able to set the section type.
t1lt wrote: Thu May 08, 2025 3:41 pm Dang, I would love some better tools for this myself. I was just battling with dasmx this very day! :oops:
Have you seen this? https://csdb.dk/release/?id=247992
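
A memory map in this shape is also easy to consume from a script. The Python sketch below assumes the column layout shown above (kind, decimal size, `$start-$end`, hex size, optional name). The names `parse_map` and `region_for` are made up for the example.

```python
import re

MAP_LINE = re.compile(
    r"^(?P<kind>\w+):\s+(?P<size>\d+)\s+\$(?P<start>[0-9a-f]+)"
    r"(?:-\$(?P<end>[0-9a-f]+))?\s+\$[0-9a-f]+(?:\s+(?P<name>\S+))?\s*$"
)

def parse_map(text):
    """Return a list of (kind, size, start, end, name) tuples."""
    regions = []
    for line in text.splitlines():
        m = MAP_LINE.match(line.strip())
        if not m:
            continue  # e.g. the trailing "Passes:" line
        start = int(m.group("start"), 16)
        end = int(m.group("end"), 16) if m.group("end") else start
        regions.append(
            (m.group("kind"), int(m.group("size")), start, end, m.group("name") or "")
        )
    return regions

def region_for(addr, regions):
    """Find the first non-empty region containing addr, or None."""
    for kind, size, start, end, name in regions:
        if size > 0 and start <= addr <= end:
            return kind, name
    return None
```

With a naming convention on the section names, `region_for` could then be the place to derive a Code vs Data type for any address found in the listing.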
segaloco
Posts: 568
Joined: Fri Aug 25, 2023 11:56 am

Re: Assembly Refactoring Tools?

Post by segaloco »

Oziphantom wrote: Thu May 08, 2025 11:04 pm Probably be faster to add it to da65 than make external tools.
My reasoning for independent tools is that they could theoretically be adapted to any assembler. I may be invested in ca65 right now, but I may not always be, and tightly coupling a fix to a specific suite would preclude any portability to other assemblers.
Oziphantom
Posts: 1918
Joined: Tue Feb 07, 2017 2:03 am

Re: Assembly Refactoring Tools?

Post by Oziphantom »

segaloco wrote: Fri May 09, 2025 10:33 am
Oziphantom wrote: Thu May 08, 2025 11:04 pm Probably be faster to add it to da65 than make external tools.
My reasoning for independent tools is that they could theoretically be adapted to any assembler. I may be invested in ca65 right now, but I may not always be, and tightly coupling a fix to a specific suite would preclude any portability to other assemblers.
Impossible. In order to refactor the assembly, the tool must be able to parse the assembly, and each assembler has its own commands, format, supported text encoding and feature set. Then, after refactoring, it must reproduce said assembler's format. So you either have some generic tool with 20 parsers and 20 writers to handle them all, which is never going to happen, or you might get a parser that can handle a specific set of assemblers to 60% or so. I.e. "." vs "!" for commands, "*=" vs "ORG", etc.: a tool that takes "!" and "ORG" will work on a few assemblers, while one that takes "." and "*=" will mostly work on a few others. But then, in order to work things out, it will need to handle ca65's objects plus the linker cfg to try to infer where and what things are, vs WLA-DX's objects and linker setup, which are different to parse and work with again, vs 64tass's internal sections, blocks and logicals, vs ACME's none of the above.
segaloco
Posts: 568
Joined: Fri Aug 25, 2023 11:56 am

Re: Assembly Refactoring Tools?

Post by segaloco »

Lots of assemblers have alphanumeric labels ending with a colon, some sort of character after which comments are located, commas separating fields of opcodes, opcodes that represent call and return vs one-way flow control. Yeah you don't have the exact same opcodes but the abstract elements do have a finite number of syntactic variations. I still think that'd be easier to adjust than trying to forklift a bunch of stuff written into the source code of a specific suite. Doesn't have to be perfect, I'm not saving lives here.
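
To make the "finite number of syntactic variations" point concrete, here is a rough Python sketch of a line splitter driven by a small dialect description. The `Dialect` fields and the `split_line` name are hypothetical, only two knobs are modelled (comment marker and label suffix), and string literals containing the comment marker are not handled.

```python
import re
from dataclasses import dataclass

@dataclass
class Dialect:
    comment_marker: str = ";"   # text after this is a comment
    label_suffix: str = ":"     # labels end with this

def split_line(line, dialect):
    """Split one source line into label, mnemonic, operands and comment."""
    code, _, comment = line.partition(dialect.comment_marker)
    code = code.strip()
    label = ""
    head, sep, tail = code.partition(dialect.label_suffix)
    if sep and re.fullmatch(r"[A-Za-z_]\w*", head):
        label, code = head, tail.strip()
    parts = code.split(None, 1)
    mnemonic = parts[0] if parts else ""
    operand_text = parts[1] if len(parts) > 1 else ""
    operands = [o.strip() for o in operand_text.split(",") if o.strip()]
    return {
        "label": label,
        "mnemonic": mnemonic,
        "operands": operands,
        "comment": comment.strip(),
    }
```

Switching to a dialect with a different comment marker is then a one-field change, e.g. `Dialect(comment_marker="//")`, rather than a new parser.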
Oziphantom
Posts: 1918
Joined: Tue Feb 07, 2017 2:03 am

Re: Assembly Refactoring Tools?

Post by Oziphantom »

But in order to be effective, it will also need to know what the actual assembled values are. I.e. if you give it lda myLabel, it can't tell whether myLabel is a ZP or Abs address unless it knows where myLabel is. The address might be explicitly placed in the asm, or it may be up to the assembler to work out where it is. So this tool then needs to be able to parse the binary and have a line-of-code to memory-address lookup so it can go and check, or at the very least have a symbol table to look things up in, and then it can start to count the bytes.
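
The symbol-table route can be sketched in a few lines of Python. Given a label-to-address table, the operand size (and so the byte count) follows from the resolved address. This assumes plain NMOS 6502 addressing modes, simple `label`, `label+n` and `$hex` operands, and no assembler-specific overrides that force absolute addressing. All names are illustrative.

```python
BRANCHES = {"bpl", "bmi", "bvc", "bvs", "bcc", "bcs", "bne", "beq"}

def resolve(expr, symbols):
    """Evaluate a sum of labels, $hex and decimal terms."""
    total = 0
    for term in expr.split("+"):
        term = term.strip()
        if term.startswith("$"):
            total += int(term[1:], 16)
        elif term.isdigit():
            total += int(term)
        else:
            total += symbols[term]  # KeyError if the label is unknown
    return total

def instruction_size(mnemonic, operand, symbols):
    """Return the encoded length in bytes of one instruction."""
    mnemonic = mnemonic.lower()
    operand = operand.strip()
    if operand == "" or operand.lower() == "a":
        return 1                      # implied or accumulator
    if mnemonic in BRANCHES:
        return 2                      # relative
    if mnemonic in ("jmp", "jsr"):
        return 3                      # always a 16-bit address
    if operand.startswith("#") or operand.startswith("("):
        return 2                      # immediate, (zp,x) or (zp),y
    base, _, index = operand.partition(",")
    if index.strip().lower() == "y" and mnemonic not in ("ldx", "stx"):
        return 3                      # only ldx/stx have a zp,y form
    return 2 if resolve(base, symbols) < 0x100 else 3
```

With `ZPTemp1` at `$03` and `ByteReverseTable` at `$0400`, this gives 2 bytes for `sty ZPTemp1` and 3 for `sta ByteReverseTable,x`. Without the table, neither case can be decided from the source text alone.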