True, absolute addressing will take one more cycle than ZP addressing, and indexing will take one cycle more than non-indexed; but if you already have local variables and environments, recursion does not increase the requirements any. That was my point. I thought the context here was an HLL though, since C has been mentioned; and in that case you're making a big performance sacrifice anyway. Actually, I don't recall ever using recursion in my own programming; but I often use re-entrancy, particularly where an interrupt-service routine for an alarm may call a routine that got interrupted.Bregalad wrote: ↑Wed May 13, 2020 2:24 amOf course it is a problem, why would you sacrifice a register for being your stack pointer, and use longer, slower instructions, and possibly waste a lot of ROM space, when you're not going to use recursion anyway ?!Garth wrote: ↑Wed May 13, 2020 2:10 am It seems like some key characteristics of stacks are perhaps not being understood correctly. There's a treatise on 6502 stacks (plural, not just the page-1 hardware stack) at http://wilsonminesco.com/stacks/ . Later chapters do get into local variables and environments, and recursion. You may not need recursion; but once you have local variables and environments, there's nothing keeping you from doing recursion. It's simply not a problem, as long as you don't run out of stack space.
Compiled stack proposal
Moderator: Moderators
Re: Compiled stack proposal
http://WilsonMinesCo.com/ lots of 6502 resources
Re: Compiled stack proposal
Hisoft C is C compiler with 24Kb core within target machine (ZX Spectrum 48).
And it's source code for stdlib library is really cool:
Local variables were slow and you don't believe this trick:
TADAMMM!
There were no type void in this age, so:
Simple and efficient...
First generation of C language (Kernighan and Ritchie first books!) was very different to modern realities, so functions with non void and int return types requre some declarations:
var_args was implemented in non standard way:
Type casts were made in non-standard way too:
There was support of 32-bit:
Really! In this way...
And meet this true horror of programming in 1980:
Thew...
But isn't #define FAST static cool?
And it's source code for stdlib library is really cool:
Code: Select all
/*****************************/
/* Hisoft C */
/* Standard Function Library */
/* HEADER */
/* */
/* Copyright (C) 1984 Hisoft */
/* Last changed 15 Aug 1985 */
/*****************************/
Code: Select all
#define FAST static
There were no type void in this age, so:
Code: Select all
#define void int
First generation of C language (Kernighan and Ritchie first books!) was very different to modern realities, so functions with non void and int return types requre some declarations:
Code: Select all
/* Forward declarations for non-int library functions */
extern char *strcat(), *strncat(), *strcpy(), *strncpy(), *strchr(), *strrchr(),
*strpbrk(), *calloc(), *malloc(), *sbrk(), *fgets(), *gets();
Code: Select all
/* Two variadic arithmetic functions (see manual for details) */
int max(param_byte_count) auto
{
static int argc, *argv, max;
argc = param_byte_count/2 - 1;
argv = ¶m_byte_count + argc;
max = -32767;
while (argc--)
{
if (*argv > max) max = *argv;
--argv;
}
return max;
}
Code: Select all
typedef char * __char_ptr;
int peek(address)
{
return * cast(__char_ptr) address;
}
void poke(address, value)
{
* cast(__char_ptr) address = value;
}
Code: Select all
void long_add(c, a, b)
char *a, *b, *c;
{
static unsigned u, i;
u = 0;
for (i = 0; i < 4; ++i)
{
u += *a++ + *b++;
*c++ = u & 0xff;
u >>= 8;
}
}
And meet this true horror of programming in 1980:
Code: Select all
#define FALSE 0 /* for Boolean operations */
#define TRUE 1
But isn't #define FAST static cool?
Re: Compiled stack proposal
Damn son.aa-dav wrote: ↑Wed May 13, 2020 11:35 am Hisoft C is C compiler with 24Kb core within target machine (ZX Spectrum 48).
And it's source code for stdlib library is really cool:Local variables were slow and you don't believe this trick:Code: Select all
/*****************************/ /* Hisoft C */ /* Standard Function Library */ /* HEADER */ /* */ /* Copyright (C) 1984 Hisoft */ /* Last changed 15 Aug 1985 */ /*****************************/
TADAMMM!Code: Select all
#define FAST static
There were no type void in this age, so:Simple and efficient...Code: Select all
#define void int
First generation of C language (Kernighan and Ritchie first books!) was very different to modern realities, so functions with non void and int return types requre some declarations:var_args was implemented in non standard way:Code: Select all
/* Forward declarations for non-int library functions */ extern char *strcat(), *strncat(), *strcpy(), *strncpy(), *strchr(), *strrchr(), *strpbrk(), *calloc(), *malloc(), *sbrk(), *fgets(), *gets();
Type casts were made in non-standard way too:Code: Select all
/* Two variadic arithmetic functions (see manual for details) */ int max(param_byte_count) auto { static int argc, *argv, max; argc = param_byte_count/2 - 1; argv = ¶m_byte_count + argc; max = -32767; while (argc--) { if (*argv > max) max = *argv; --argv; } return max; }
There was support of 32-bit:Code: Select all
typedef char * __char_ptr; int peek(address) { return * cast(__char_ptr) address; } void poke(address, value) { * cast(__char_ptr) address = value; }
Really! In this way...Code: Select all
void long_add(c, a, b) char *a, *b, *c; { static unsigned u, i; u = 0; for (i = 0; i < 4; ++i) { u += *a++ + *b++; *c++ = u & 0xff; u >>= 8; } }
And meet this true horror of programming in 1980:Thew...Code: Select all
#define FALSE 0 /* for Boolean operations */ #define TRUE 1
But isn't #define FAST static cool?
-
I agree with tepples but I suspect it may be much more work than intended since the paging NES method must make relative calls within a desired scope (from the processor) and the global program scope (when reentrant calls sharing variables). Now, I hope you guys manage to do it and you've got my support!!
Oh, when done, please, add a lot of test cases when compiling different code. That will literally save a ton of wasted hours.
Re: Compiled stack proposal
I made something like this once, it's called pa65.
I have not yet publicly exposed it because it lacks a good documentation.
If you want to try, here it is https://github.com/Parisoft/pa65
I have not yet publicly exposed it because it lacks a good documentation.
If you want to try, here it is https://github.com/Parisoft/pa65
Re: Compiled stack proposal
Exactly. Ideally you would need special key instructions hidden in the instructions only at relevant places, such as start/stop the tree system (around interupt routines for example; or special routines doing trickery beyond the normall call/return system such as my boss routines).
Maybe the program would output a list of .define statements which could in turn be included in the assembly process ?
For example when using semi-auto allocated variables it would be like:
Code: Select all
Myroutine:
lda myroutine_temp1 ; this variable is auto-allocated
clc
adc player_HP ; This one is definitely not
sta myroutine_temp2 ; etc...
jsr something ; the parser understands automatically we call something without any extra info needed
bcc _next:
rts ; The parser understands automatically that's a possible end of myroutine
_next:
lda myroutine_temp3 ; but figures there's still another end ahead
...
jmp something_else ; the parser should understand this is a tail-call
-
- Posts: 1565
- Joined: Tue Feb 07, 2017 2:03 am
Re: Compiled stack proposal
Oh yeah Trampolines, didn't even think of those.. they will basically become the nexus of your project and make everything dependent on everything. A "unlink" this function would be needed.
Re: Compiled stack proposal
Self-modifying code can be handled the same way so long as variable addresses aren't modified to point at a caller's variables. I haven't put much thought into it seeing as I need as much of the 2048 bytes of internal work RAM for game state as I can get.Oziphantom wrote: ↑Wed May 13, 2020 10:46 am SelfMod, Dispatch via RTS, Double Interrupt stabilization, Main, IRQ, NMI and potentially BRK threads to consider, stack manipulation, rts chaining and just well I want to return a couple up so lets just pop a bunch of stuff off the stack and go for it.
RTS dispatch from a jump table is straightforward using tailcalls, which treats every entry in the jump table as a callee. I haven't considered RTS dispatch in the context of jsr coroutine_yield; I imagine it to be a lot trickier.
I'm not sure what "Double Interrupt stabilization" is, but main, IRQ, and NMI threads could each get their own static stack. Ideally, IRQ and NMI wouldn't be very deep anyway unless you're trying to (say) run the game in NMI and low-priority background tasks that can block (like art decompression) in main. And I hadn't considered syntax to add to one stack or the other because my IRQ and NMI routines have never been that deep.
rts chaining and just well I want to return a couple up so lets just pop a bunch of stuff off the stack Unless I'm misunderstanding something, multi-layer return wouldn't need to do anything, as the intermediate callee's local variable lifetimes are ending too. Could you explain what you mean by "RTS chaining"?
If subroutine X jumps into the middle of subroutine Y, this can be done one of two ways:Oziphantom wrote: ↑Wed May 13, 2020 10:46 am Having function jump into the middle of other functions
- Subroutine Y is actually two subroutines: Y1 and Y2, with X jumping to the start of Y2, and Y1 falling through to Y2. Fallthrough is marked as a tail call.
- Even simpler: Mark that X calls Y when it jumps to the middle of Y. This makes all of Y1's variables and all of Y2's variables live.
Code: Select all
.proc Myroutine
var temp1
var temp2
lda temp1 ; this variable is auto-allocated
clc
adc player_HP ; This one is definitely not
sta temp2 ; etc...
call something ; tell the parser that something participates in this static stack in case something is in another source code file
bcc _next
rts ; Variable lifetime ends when a routine returns
_next:
lda myroutine_temp3 ; but figures there's still another end ahead
...
tailcall something_else ; even if something_else is in another file
.endproc
-
- Posts: 1565
- Joined: Tue Feb 07, 2017 2:03 am
Re: Compiled stack proposal
making this NES 6502 rather than 6502 seems a bit of a waste. However you do also have 8K expansion RAM for the NES to consider.
Double Interrupt stabilization:
When you interupt you have a 0-8 clock window of jitter, so to defeat this you can do
Make Interupt fire on Line 10
so basically there are 2 interrupts but only 1 rti
The other trick is "executing I/O registers" so you start a timer that counts down then you set the byte before the timer, to be JMP or JSR as you need. and the set the byte after the timer to be 00.
thus as the clocks go you get
JMP $0800
JMP $0700
JMP $0600
....
in a loop, so you set the IRQ handler to the I/O registers so the IRQ fires, then it executes the JMP with the timer byte high given you a routine you enter knowing the jitter on.
rts chaining
Given
where the macros expand as btw .rta inserts the return address of an object, so basically .word thing-1.
Then you set the stack to be where this data is(or copy it onto the stack), then you RTS to which the RTS then pops off the first address and thus calls it, pops its params and when done, just RTS which calls the next, and the next until it finally pops the last exit function. This way you can change the call order, don't have to put any logic in to work out what is next, no loops and its not statically linked.
multi layer return
Yes the scopes end, but you need to detect that when you do a this ends the scope of it and 2 parents, and this function will not continue into any code after said two jsrs
Having function jump into the middle of other functions
I think a flaw with the current example is calling is not only done with a JSR, calling is also done with a JMP or Branch
so given the following structure
So the paths are
Entry -> A -> C -> E
Entry -> B -> C -> E
D -> C -> E
thus you need to treat
Entry + A + B as a " function"
C + E as a "function"
D as a "function"
but they don't all "call" each other because they are seen as one "group" i.e logically Entry + A + B + C + E make one function and the variable scope needs to be shared, i.e A sets up something to be used in E, and like wise D is part of the chain but a different branch, so it needs to set up things that are also used in C + E, so its variables must be allocated the same way as Entry + A + B
Double Interrupt stabilization:
When you interupt you have a 0-8 clock window of jitter, so to defeat this you can do
Make Interupt fire on Line 10
Code: Select all
;setup irq to fire on next line
;save the current stack pointer
;cli
; waste clocks until near the end of the line if needed
nop
nop
nop
nop ; somewhere along here the next interrupt fires
nop
;next interrupt
; restore stack to pop off the extra interrupts data
since you interrupted on a nop, a jitter is now 0-1
; if you need dead 0 jitter.
; waste almost a line then set it up so the fetch of the cmp in
; lda rasterline
; cmp rasterline
; fall either on the last clock ( so it is the same) or first clock of the next line ( so it is different )
; beq *
; * ; now if the where equal the branch is taken 3 clock, it the differ not taken 2 clocks
; you are now on clock 3 of the line 100%
The other trick is "executing I/O registers" so you start a timer that counts down then you set the byte before the timer, to be JMP or JSR as you need. and the set the byte after the timer to be 00.
thus as the clocks go you get
JMP $0800
JMP $0700
JMP $0600
....
in a loop, so you set the IRQ handler to the I/O registers so the IRQ fires, then it executes the JMP with the timer byte high given you a routine you enter knowing the jitter on.
rts chaining
Given
Code: Select all
#DrawDebugStringStackNumber kDebugStrings.playerx,0,0
#DrawDebugStringStackNumber kDebugStrings.playery,0,1
#DrawDebugStringStackNumber kDebugStrings.playerTileX,0,2
#DrawDebugStringStackNumber kDebugStrings.playerTileY,0,3
#DrawDebugStringStackNumber kDebugStrings.playerTileNum,0,4
#DrawDebugStringStackNumber kDebugStrings.playerColFlags,0,5
#DrawDebugStringStackNumber kDebugStrings.Map,35,0
#DrawDebugStringStackNumber kDebugStrings.MapTCP,40,0
#DrawDebugStringStackNumber kDebugStrings.MapTCP,51,1
.rta TrampolineCallStackDrawExit
Code: Select all
DrawDebugStringStackNumber .macro
.rta jDrawStringAtPrecalc
.byte \1
.word $3000+(80*\3)+\2
.endm
Then you set the stack to be where this data is(or copy it onto the stack), then you RTS to which the RTS then pops off the first address and thus calls it, pops its params and when done, just RTS which calls the next, and the next until it finally pops the last exit function. This way you can change the call order, don't have to put any logic in to work out what is next, no loops and its not statically linked.
multi layer return
Yes the scopes end, but you need to detect that when you do a
Code: Select all
pla
pla
pla
pla
rts
Having function jump into the middle of other functions
I think a flaw with the current example is calling is not only done with a JSR, calling is also done with a JMP or Branch
so given the following structure
Code: Select all
entry
set temp2 to something
some code
branch A
some code
branch B
rts
A
some code
set temp1 to something
jmp C
B some code
set temp1 t something
jmp C
D some code
set temp1 to something
set temp2 to something
branch C
C
some code
use temp1
jmp E
E
use temp2
some code rts
Entry -> A -> C -> E
Entry -> B -> C -> E
D -> C -> E
thus you need to treat
Entry + A + B as a " function"
C + E as a "function"
D as a "function"
but they don't all "call" each other because they are seen as one "group" i.e logically Entry + A + B + C + E make one function and the variable scope needs to be shared, i.e A sets up something to be used in E, and like wise D is part of the chain but a different branch, so it needs to set up things that are also used in C + E, so its variables must be allocated the same way as Entry + A + B
Re: Compiled stack proposal
2022 EuroLLVM Dev Mtg “LLVM-MOS 6502 Backend: Having a Blast in the Past” by Daniel Thornburgh (10 minute video) shows how LLVM is being ported to target the 6502 CPU. This includes static stack allocation and reservation of 32 bytes of zero page as virtual registers.
-
- Posts: 1565
- Joined: Tue Feb 07, 2017 2:03 am
Re: Compiled stack proposal
sure but that is a C compiler which has the benefit of full control over the 6502 it generates, and hence knows all the cases it needs before hand. Not something trying to pass 6502 code and work out what the overlap is.
-
- Posts: 6
- Joined: Sat Apr 23, 2022 12:34 pm
Re: Compiled stack proposal
I'm actually working on extending this to:tepples wrote: ↑Wed Jun 15, 2022 12:54 pm 2022 EuroLLVM Dev Mtg “LLVM-MOS 6502 Backend: Having a Blast in the Past” by Daniel Thornburgh (10 minute video) shows how LLVM is being ported to target the 6502 CPU. This includes static stack allocation and reservation of 32 bytes of zero page as virtual registers.
1) Automatically promote static stack slots to the zero page, based on whole-program estimates of hotness, and
2) Allow ZP locations/static stack slots to overlap if the compiler can prove that their two containing functions can't be active at the same time.
You'd have to tell the compiler how many contiguous zero page locations it can expect; our SDK targets would provide maximal defaults. If you wanted to reserve zero page for application use (i.e., outside the compiler), you'd need to pass something like --mreserve-zp=64, which would cut out 64 bytes of zero page from the chunk allocated to the compiler.
-
- Posts: 1565
- Joined: Tue Feb 07, 2017 2:03 am
Re: Compiled stack proposal
side note for Commodore machines you will need an option to set the lowest zp address it uses, as the 00 and 01 are "registers" internal to the CPU. Even in the 8502 when you invoke the MMU's ZP relocation the lower 2 ZP remain internal register.