Page 1 of 2

A Runtime for high-level languages on the 6502

Posted: Thu Aug 04, 2016 11:08 am
by russellsprouts
I've been working a bit on a compiler that targets the 6502 (yes, I know there are plenty of things that are already good enough, but its mostly just a fun project to work on for me). It supports operations on 1, 2, 3, and 4 byte values, including the usual 6502 operations, plus variable shift operations (including arithmetic shift right), multiply, divide, modulo, logical operations, and others.

David Wheeler's ideas were a big help in figuring out the runtime for a high level language. I decided to build a split stack machine.

There are 4 data stacks -- STACK0, STACK1, STACK2, and STACK3. STACK0 could be on the zero page to save code space and allow some ldy ZP, x instructions, and STACK1, STACK2, STACK3 can be elsewhere. There is only one stack pointer, stored in the X register. This allows stack access quite quickly using indexed absolute addressing. Multi-byte values are stored across multiple stacks. For example, the bytes of a 24-bit integer would be stored at STACK0+x, STACK1+x, and STACK2+x. Manipulating the stack pointer is also very cheap, it only takes a dex or inx.

As an optimization, the current top-of-stack value is stored contiguously in the zero page at TOS. This means that most operations, which conceptually pop 2 values from the stack, then push the result back on the stack, only have to modify TOS in-place and then increment x in order to calculate their values.

Here's an example, the 2-byte addition operation:

Code: Select all

add2:                  ; 30 cycles total
        clc            ; 2 cycles
        lda STACK0, x  ; 4 cycles
        adc TOS        ; 3 cycles
        sta TOS        ; 3 cycles
        lda STACK1, x  ; 4 cycles
        adc TOS+1      ; 3 cycles
        sta TOS+1      ; 3 cycles
        inx            ; 2 cycles
        rts            ; 6 cycles
A generated program would look something like this:

Code: Select all

main: ; calculates $1234 + $1337
        ; put $1234 on the stack
        lda #<$1234
        sta STACK0, x
        lda #>$1234
        sta STACK1, x

        ; put $1337 on the top of the stack
        lda #<$1337
        sta TOS
        lda #>$1337
        sta TOS+1

        jmp add2
There are a few things that I am considering to improve the performance. Currently, the stack usage is very wasteful. If you use 1 byte operands, then STACK1, STACK2, and STACK3 are unused. Because the 1 and 2 byte operations take less code space than the 3 and 4 byte operations, I could create, for example, 1 byte operations the operate on STACK1 instead of STACK0, and 2 byte operations that operate on STACK2 and 3 instead of 0 and 1. Packing multiple values into one stack slot could be powerful, but it would also be very difficult to write a compiler that can take full advantage of it. A good compiler would generate normal 6502 code for 1 byte instructions anyway, rather than using the stack.

Re: A Runtime for high-level languages on the 6502

Posted: Thu Aug 04, 2016 12:28 pm
by Bregalad
Why would you even need those LDA/STAs in your main code? Just encode them as byte codes to push values on your stack.

I doubt 24-bit and 32-bit calculations are very useful, ever, in the context of Nesdev. 16-bit however is extremely useful so using 2 separate stack for LSB and MSB and avoid the overhead of separating 16-bit and 8-bit operations and just do everything on 16-bit and ignore the MSB when appropritate is actually a good idea, in my opinion.

Re: A Runtime for high-level languages on the 6502

Posted: Thu Aug 04, 2016 1:11 pm
by russellsprouts
It's true that 24 and especially 32 bit instructions aren't often useful, but I've sometimes wanted to use them. For example, storing a position on a large map. Super Mario Bros 3 uses a 24 bit number for each coordinate of a player's position. (Actually, more like 20 bits, because it only goes to 1/16 of a pixel precision).

I don't embed the constants because there is a trade-off on speed and code size. I could use a byte code interpreter, like Sweet16, and it could support all of the operations in one byte easily since a stack machine doesn't need to encode the operands in an instruction. This would be slower. However, they aren't mutually exclusive. I could sometime use an interpreter for code compactness, and direct subroutine calls for speed.

Re: A Runtime for high-level languages on the 6502

Posted: Thu Aug 04, 2016 1:25 pm
by tepples
Bregalad wrote:I doubt 24-bit and 32-bit calculations are very useful, ever, in the context of Nesdev.
Horizontal coordinate of a character in a long scrolling level: 0-7 for subpixel, 8-15 for pixel, 16-23 for screen.

Re: A Runtime for high-level languages on the 6502

Posted: Thu Aug 04, 2016 1:36 pm
by Bregalad
Ok, but it's the ONLY purpose of more than 16-bit I can think of, and you probably don't want to do the whole collision detection in 24-bit, especially if it's not hand-optimized assembly, as this code will run every frame.

Re: A Runtime for high-level languages on the 6502

Posted: Thu Aug 04, 2016 2:16 pm
by tepples
Yeah, collision detection with 24-bit coordinates typically disregards subpixels.

Re: A Runtime for high-level languages on the 6502

Posted: Fri Aug 05, 2016 8:40 pm
by psycopathicteen
Bregalad wrote:Why would you even need those LDA/STAs in your main code? Just encode them as byte codes to push values on your stack.
How does that work?

Re: A Runtime for high-level languages on the 6502

Posted: Sun Aug 07, 2016 12:49 pm
by russellsprouts
psycopathicteen wrote:
Bregalad wrote:Why would you even need those LDA/STAs in your main code? Just encode them as byte codes to push values on your stack.
How does that work?
I imagine it would work like in Sweet16. Sweet16 is a virtual assembly language. Most instructions are a single byte, but the byte code $1X followed by a 2 byte number represents LOAD, with loads the two byte number into the register X. Similarly, this could have a byte code for pushing 1, 2, 3, or 4 byte values onto the stack directly.

Re: A Runtime for high-level languages on the 6502

Posted: Sun Aug 07, 2016 12:59 pm
by tepples
russellsprouts wrote:Similarly, this could have a byte code for pushing 1, 2, 3, or 4 byte values onto the stack directly.
Like in "The Princess and the pea"?

Re: A Runtime for high-level languages on the 6502

Posted: Sun Aug 07, 2016 3:53 pm
by psycopathicteen
How is that different from loading and storing?

Re: A Runtime for high-level languages on the 6502

Posted: Sun Aug 07, 2016 8:09 pm
by koitsu
The main difference would be that a 16-bit lda #imm + sta $abs takes 8 cycles and 6 bytes total, while pea $val would only take 5 cycles + 3 bytes total.

Two complications (in my experience with pea): 1) the endian of the bytes pushed might not match what your existing code expects, 2) the values pushed are statically-defined at assemble time (i.e. instruction pushes a predetermined value, not what's in the accumulator); the "workaround" for the latter is to use self-modifying code so that the pea+operand bytes are in RAM and the latter can be changed dynamically.

Whether or not this provides you any cycle-count or space benefits over, stay, a classic lda/sta or lda/pha etc. is, obviously, indeterminable at this time.

P.S. -- Man, I haven't heard of Sweet16 in decades. Blast from the past hitting me hard. :-)

Re: A Runtime for high-level languages on the 6502

Posted: Mon Aug 08, 2016 1:18 pm
by psycopathicteen
I thought only the 65816 had PEA.

Re: A Runtime for high-level languages on the 6502

Posted: Mon Aug 08, 2016 1:39 pm
by tepples
Among the 6502 family, only the 65816 has pea as a hardware instruction. Perhaps I failed to make my point clear: an interpreted bytecode implemented on any processor could have an instruction with the same semantics (push 16 bits immediate), and it would be recognizable to those with 65816 experience as being pea.

Re: A Runtime for high-level languages on the 6502

Posted: Mon Aug 08, 2016 2:11 pm
by psycopathicteen
I'm still confused on what Bregalad said.

Re: A Runtime for high-level languages on the 6502

Posted: Tue Aug 09, 2016 5:56 pm
by psycopathicteen
I had an idea with implementing math with long integers on the 65816.

If you have the C code:

Code: Select all

long a;
long b;
long c;

a = a + b + c;
It could be compiled to:

Code: Select all

lda a_lo
clc
adc b_lo
tax
lda a_hi
adc b_hi
tay
txa
clc
adc c_lo
sta a_lo
tya
adc c_hi
sta a_hi
Basically using X and Y together as a 32-bit accumulator, and switching back and forth.