Looping over large chunk of memory

Posted: Thu Sep 13, 2018 2:04 pm
by battagline
Hi everyone... wasn't sure if this should go here or in NESDev, but it's kind of a noob question so I figured I'd start here.


Let's say I have a label and a chunk of 500 bytes of data I want to loop over

Code: Select all

datachunk:
.byte $20,$21,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00, ... etc out to 500 bytes
So I want to start loading at datachunk and loop until I have done some sort of processing on all 500 bytes

Code: Select all

ldx #0
loop:
lda datachunk, x
inx
(do something with the data here)
cpx #500
bne loop
So obviously that won't work because x can only go up to 255, so the question is, what's the best way to loop over all 500 entries?

Thanks,
Rick

Re: Looping over large chunk of memory

Posted: Thu Sep 13, 2018 2:10 pm
by lidnariq
Use indirect indexed addressing (e.g. LDA (zp),Y)

Re: Looping over large chunk of memory

Posted: Thu Sep 13, 2018 2:33 pm
by pubby
An easy way is to use the .repeat directive (CA65 syntax)

Code: Select all

.repeat 2, i
  ldx #0
:
  lda datachunk+250*i, x
  ; do stuff
  inx
  cpx #250
  bne :-
.endrepeat
If 'do stuff' is large, put it in a subroutine to save bytes.

Re: Looping over large chunk of memory

Posted: Thu Sep 13, 2018 2:39 pm
by battagline
pubby wrote: An easy way is to use the .repeat directive (CA65 syntax)
Interesting. Out of curiosity, what would you do if you had a prime number of items?

Thanks

Re: Looping over large chunk of memory

Posted: Thu Sep 13, 2018 3:33 pm
by nesrocks
Repeat forever until you find an item that starts with $FF or something.

Re: Looping over large chunk of memory

Posted: Thu Sep 13, 2018 3:35 pm
by rainwarrior
Here's an excerpt from a recent project of mine:

Code: Select all

; copy RAM code from ramp.s
.import __RAMP_CODE_LOAD__
.import __RAMP_CODE_RUN__
.import __RAMP_CODE_SIZE__
__RAMP_CODE_END__ = __RAMP_CODE_RUN__ + __RAMP_CODE_SIZE__
@src = nmt_addr ; temporarily alias these pointer variables
@dst = ptr
lda #<__RAMP_CODE_LOAD__
sta @src+0
lda #>__RAMP_CODE_LOAD__
sta @src+1
lda #<__RAMP_CODE_RUN__
sta @dst+0
lda #>__RAMP_CODE_RUN__
sta @dst+1
ldy #0
@ramp_loop:
	lda (@src), Y
	sta (@dst), Y
	inc @src+0
	bne :+
		inc @src+1
	:
	inc @dst+0
	bne :+
		inc @dst+1
	:
	lda @dst+0
	cmp #<__RAMP_CODE_END__
	lda @dst+1
	sbc #>__RAMP_CODE_END__
	bcc @ramp_loop
1. The 16-bit address __RAMP_CODE_LOAD__ is loaded into "@src". (Source data to copy.)
2. The 16-bit address __RAMP_CODE_RUN__ is loaded into "@dst". (Destination.)
3. Use indirect addressing to copy from @src to @dst.
4. 16-bit increment of @src.
5. 16-bit increment of @dst.
6. 16-bit compare @dst against __RAMP_CODE_END__. If less than, repeat from 3.

__RAMP_CODE_END__ here is the address of the first byte past the end of the region to be copied to. Alternatively we could have had another 16-bit variable for the number of bytes to be copied and just decremented it each time and checked for zero. There are a bunch of other manipulations that could be done for optimization or other reasons, but this is one basic expression of what it means to "use indirect indexed addressing" for this.
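
For what it's worth, the counter-based alternative mentioned above might look roughly like the sketch below. This is only an illustration, not code from the project: src, dst, and count are hypothetical names, the pointers are assumed to be set up already (the same way as @src and @dst above), and count is assumed to hold the number of bytes to copy.

```asm
; Sketch only: src, dst and count are hypothetical names, not from the project above.
.zeropage
src:   .res 2            ; 16-bit source pointer (assumed already initialized)
dst:   .res 2            ; 16-bit destination pointer (assumed already initialized)
count: .res 2            ; 16-bit number of bytes left to copy

.code
copy_counted:
  ldy #0                 ; Y stays 0; the pointers themselves are advanced
  lda count+0
  ora count+1
  beq @done              ; count = 0: nothing to copy
@loop:
  lda (src), y
  sta (dst), y
  inc src+0              ; 16-bit increment of src
  bne :+
    inc src+1
  :
  inc dst+0              ; 16-bit increment of dst
  bne :+
    inc dst+1
  :
  lda count+0            ; 16-bit decrement of count:
  bne :+                 ; borrow from the high byte only when the low byte is 0
    dec count+1
  :
  dec count+0
  lda count+0
  ora count+1            ; zero only when both bytes are zero
  bne @loop
@done:
  rts
```

The end-address compare in the original needs no extra variable, while the counter version works even when the end address isn't known at assembly time. Which one fits better probably depends on where the length comes from.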

Re: Looping over large chunk of memory

Posted: Thu Sep 13, 2018 3:54 pm
by pubby
battagline wrote: Interesting. Out of curiosity, what would you do if you had a prime number of items?

Thanks
It's really uncommon, but I'd do as many 256 iteration loops as possible, then one extra loop at the end handling the remainder.

Code: Select all

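  ldx #0
.repeat ((8191 + 255) / 256), i
:
  lda datachunk+256*i, x
  jsr do_stuff            ; do_stuff must preserve X
  inx
  .if i = ((8191 + 255) / 256) - 1
    cpx #(8191 .mod 256)  ; last block: stop after the remaining bytes
  .endif
  bne :-                  ; other blocks: loop until X wraps back to 0
.endrepeat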
Something like this. I haven't tested it though.

Re: Looping over large chunk of memory

Posted: Thu Sep 13, 2018 7:40 pm
by tokumaru
The canonical way is to use indirect indexed addressing, like lidnariq said. While lda ABS, x adds an 8-bit offset to a constant base address in order to find the final address to access, lda (ZP), y uses a 16-bit pointer in Zero Page as the base, meaning you're free to access the entire 16-bit address space of the 6502.
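
As a rough sketch of what that could look like for the 500-byte case from the first post (assumptions: ptr is a made-up 2-byte zero-page variable, process_byte is a hypothetical routine that preserves X and Y, and datachunk is the label from the original question):

```asm
; Sketch only: ptr and process_byte are hypothetical names.
DATA_SIZE = 500

.zeropage
ptr: .res 2              ; 16-bit pointer used as the base for (ptr), y

.code
process_chunk:
  lda #<datachunk        ; point ptr at the start of the data
  sta ptr+0
  lda #>datachunk
  sta ptr+1
  ldy #0
  ldx #>DATA_SIZE        ; number of full 256-byte pages (1 for 500 bytes)
  beq @tail              ; less than one page: only the remainder to do
@page:
  lda (ptr), y
  jsr process_byte       ; hypothetical; must preserve X and Y
  iny
  bne @page              ; Y wraps to 0 after 256 bytes
  inc ptr+1              ; move the pointer to the next page
  dex
  bne @page
@tail:
  cpy #<DATA_SIZE        ; Y is 0 here; skip if there is no remainder
  beq @done
@tail_loop:
  lda (ptr), y
  jsr process_byte
  iny
  cpy #<DATA_SIZE        ; remainder is 244 bytes for 500
  bne @tail_loop
@done:
  rts
```

Here Y walks through each page and only the high byte of the pointer changes between pages, which is usually cheaper than doing a full 16-bit increment of the pointer on every byte.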

Indirect addressing is obviously slower, so in situations where you need speed, you can be creative and figure out alternative ways of accessing the data faster, usually involving rearranging the data in some way other than a linear array, but there's no generic fast solution for all cases.

Re: Looping over large chunk of memory

Posted: Fri Sep 14, 2018 11:04 am
by battagline
Thanks guys, this has been really helpful.

QQ for rainwarrior:
Does

Code: Select all

inc @dst+0
do the same thing as

Code: Select all

inc @dst
If so, why add the +0, and if not what is the difference?

Thanks

Re: Looping over large chunk of memory

Posted: Fri Sep 14, 2018 11:07 am
by tepples
They do the same thing. The +0 is there to remind the (human) reader of the source code that the variable is a multi-byte variable.

Re: Looping over large chunk of memory

Posted: Fri Sep 14, 2018 11:15 am
by battagline
tepples wrote: They do the same thing. The +0 is there to remind the (human) reader of the source code that the variable is a multi-byte variable.
Ahh... that makes sense.

Thanks

Re: Looping over large chunk of memory

Posted: Fri Sep 14, 2018 6:33 pm
by rainwarrior
Yeah, that's just a convention I use for 16-bit (or wider) values in 6502 assembly.

I don't know if other people do this, but like tepples suggested the +0 helps remind me that I'm looking at an operation that's going to have more than one byte to it. It's more of an automatic habit at this point.

Re: Looping over large chunk of memory

Posted: Sat Sep 15, 2018 7:49 am
by tokumaru
I've always used +0 when referencing the first byte of multi-byte variables, and sometimes people call me out for it, but I think it greatly improves readability.

Re: Looping over large chunk of memory

Posted: Sat Sep 15, 2018 9:02 am
by nesrocks
Is what I said totally a bad idea? Or doesn't it apply? The advantage is that the list of objects can be any size. The disadvantage is that it can't have one specific byte value (or bit) at a certain point in the data array. You do need to check for that $FF byte on every item, but when using a counter you need to check against it, so I guess it's the same on that aspect, optimization-wise.
And in the OP case, I imagine it's one big object of 500 bytes? In that case you could reserve $FF to be the end of the data.
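
A minimal sketch of that idea, assuming $FF never appears as a real data byte and is stored right after the last one (ptr and process_byte are hypothetical names; datachunk is the label from the first post):

```asm
; Sketch only: assumes the data ends with a $FF byte that never occurs in the data itself.
.zeropage
ptr: .res 2              ; 16-bit pointer into the data

.code
walk_until_ff:
  lda #<datachunk
  sta ptr+0
  lda #>datachunk
  sta ptr+1
  ldy #0
@loop:
  lda (ptr), y
  cmp #$FF               ; terminator reached?
  beq @done
  jsr process_byte       ; hypothetical; must preserve Y
  iny
  bne @loop
  inc ptr+1              ; Y wrapped around: continue on the next page
  jmp @loop
@done:
  rts
```

No length has to be known up front, at the cost of one compare per byte and one byte value that the data can't use.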

Re: Looping over large chunk of memory

Posted: Sat Sep 15, 2018 1:33 pm
by rainwarrior
nesrocks wrote: Is what I said totally a bad idea? Or doesn't it apply? The advantage is that the list of objects can be any size. The disadvantage is that it can't have one specific byte value (or bit) at a certain point in the data array. You do need to check for that $FF byte on every item, but when using a counter you need to check against it, so I guess it's the same on that aspect, optimization-wise.
And in the OP case, I imagine it's one big object of 500 bytes? In that case you could reserve $FF to be the end of the data.
Well, strings terminated with a 0 byte are incredibly common in the world, and copying them is the same process you're describing.

But yes, I'd say not being able to represent one particular byte in your data is a problem in the generic case. It's the trade-off between strcpy and memcpy: one is for a specific kind of data that indicates its own length, the other is for any data, but you have to supply the length.