Page 1 of 2

Bit reversing

Posted: Sun Aug 20, 2006 7:20 pm
by Celius
Hi.

I've been recently looking into things about the 6502 that I never really payed attention to, like comparing if something is greater or less than something else (Yeah, I really didn't know how to do that). And it was quite apparent that I needed to, so I did. Do not fear, I understand 6502 quite well, and I program with it. I'm not a newbie, there are just some dumb things like that that I've never really payed attention to. I've been wondering about reversing a byte, though. It seems like a complicated routine, but I was wondering if there was an easier way... What I mean by reversing a byte is this:

Take this: 10010100

And Make it: 00101001

Does anyone know?

Posted: Sun Aug 20, 2006 7:30 pm
by Quietust
Reverse bit order? That's easy:

Code: Select all

    STA temp
    LDX #$08
loop:
    LSR temp
    ROL A
    DEX
    BNE loop
The trick is knowing that ASL and LSR shift the "discarded" bit into the Carry flag, and ROL/ROR will shift the carry flag in at the new location.

Next question? :D

Posted: Sun Aug 20, 2006 7:33 pm
by tepples
Want to make it faster? Unroll the loop:

Code: Select all

  sta temp

  lsr temp
  rol a
  lsr temp
  rol a
  lsr temp
  rol a
  lsr temp
  rol a

  lsr temp
  rol a
  lsr temp
  rol a
  lsr temp
  rol a
  lsr temp
  rol a
Or use a lookup table. The proper method to use depends on why you want to reverse bits in a byte.

Posted: Sun Aug 20, 2006 7:54 pm
by Celius
Thanks for that. I've never really been to good at understanding why some shifting methods work. I looked at a routine that Tokumaru made, it was a hex to decimal routine. I still do not understand why that works. It'd do me well to understand why, I suppose. I was wondering about this for reversing graphics with CHR RAM. As in background tiles, not sprites. I know how to do that. Thanks though!

tepples wrote:
Want to make it faster? Unroll the loop:

Code:

sta temp

lsr temp
rol a
lsr temp
rol a
lsr temp
rol a
lsr temp
rol a

lsr temp
rol a
lsr temp
rol a
lsr temp
rol a
lsr temp
rol a

Or use a lookup table. The proper method to use depends on why you want to reverse bits in a byte.


Oh, but that wastes alot of space... It's like the MMC1 bankswitching technique:

lda #$XX
sta $E000
lsr a
sta $E000
lsr a
sta $E000
lsr a
sta $E000
lsr a
sta $E000

I just do:

ldx #5
lda #$XX
-
sta $E000
lsr a
dex
bne -

I like to save space, beings as we work with such a space-unplentiful system.

Posted: Sun Aug 20, 2006 8:18 pm
by tepples
We also work with a time-unplentiful system, and for things your program does often, those extra 5 cycles of dex/bne per iteration add up. And for reversing relatively big amounts of data such as tile graphics, I would strongly suggest using a lookup table.

Posted: Mon Aug 21, 2006 3:41 am
by Bregalad
tepples wrote:And for reversing relatively big amounts of data such as tile graphics, I would strongly suggest using a lookup table.
This all depend if you buffer the reversed tile or if you do that in NMI.

Posted: Mon Aug 21, 2006 8:54 am
by kyuusaku
Here's a LUT:

Code: Select all

.db $00, $80, $40, $c0, $20, $a0, $60, $e0
.db $10, $90, $50, $d0, $30, $b0, $70, $f0
.db $08, $88, $48, $c8, $28, $a8, $68, $e8
.db $18, $98, $58, $d8, $38, $b8, $78, $f8
.db $04, $84, $44, $c4, $24, $a4, $64, $e4
.db $14, $94, $54, $d4, $34, $b4, $74, $f4
.db $0c, $8c, $4c, $cc, $2c, $ac, $6c, $ec
.db $1c, $9c, $5c, $dc, $3c, $bc, $7c, $fc
.db $02, $82, $42, $c2, $22, $a2, $62, $e2
.db $12, $92, $52, $d2, $32, $b2, $72, $f2
.db $0a, $8a, $4a, $ca, $2a, $aa, $6a, $ea
.db $1a, $9a, $5a, $da, $3a, $ba, $7a, $fa
.db $06, $86, $46, $c6, $26, $a6, $66, $e6
.db $16, $96, $56, $d6, $36, $b6, $76, $f6
.db $0e, $8e, $4e, $ce, $2e, $ae, $6e, $ee
.db $1e, $9e, $5e, $de, $3e, $be, $7e, $fe
.db $01, $81, $41, $c1, $21, $a1, $61, $e1
.db $11, $91, $51, $d1, $31, $b1, $71, $f1
.db $09, $89, $49, $c9, $29, $a9, $69, $e9
.db $19, $99, $59, $d9, $39, $b9, $79, $f9
.db $05, $85, $45, $c5, $25, $a5, $65, $e5
.db $15, $95, $55, $d5, $35, $b5, $75, $f5
.db $0d, $8d, $4d, $cd, $2d, $ad, $6d, $ed
.db $1d, $9d, $5d, $dd, $3d, $bd, $7d, $fd
.db $03, $83, $43, $c3, $23, $a3, $63, $e3
.db $13, $93, $53, $d3, $33, $b3, $73, $f3
.db $0b, $8b, $4b, $cb, $2b, $ab, $6b, $eb
.db $1b, $9b, $5b, $db, $3b, $bb, $7b, $fb
.db $07, $87, $47, $c7, $27, $a7, $67, $e7
.db $17, $97, $57, $d7, $37, $b7, $77, $f7
.db $0f, $8f, $4f, $cf, $2f, $af, $6f, $ef
.db $1f, $9f, $5f, $df, $3f, $bf, $7f, $ff

Posted: Mon Aug 21, 2006 11:17 am
by Bregalad
This table would go well for the super fast but super space consumming method, that you'd typically use if you want to rush reversing a byte when updating tile data during VBlank.

You'd have to be a lot bored to build this table up. Personally, I'd avoid doing such hard work unless I'm really forced to, heh. I just don't see what is so facinating in writing a table with a lot of .db and numbers after that.

Posted: Mon Aug 21, 2006 12:24 pm
by kyuusaku
I already had the table; I use it to dump Turbo Grafx-16 games which have mirrored data pins from a PC-engine ;)

Posted: Mon Aug 21, 2006 1:02 pm
by blargg
Spending 256 bytes to greatly speed up a routine that is critical to a game is a good choice. If it allows avoidance of more time-consuming optimization elsewhere, it's a win. It's not like the table has to be repeated; one copy is all that's needed.

A programmer would obviously let the computer build the above bit reversing table. Even if tedium weren't an issue, building it by hand would be error-prone. But the above doesn't matter, since above we have the table ready for copy-and-paste above.

EDIT: Just to be sure I overuse that word (for whatever reason), I'll write it a few more times: above above above.... above! Back to your regularly scheduled posts.

Posted: Mon Aug 21, 2006 1:06 pm
by Bregalad
Just to say the idea to make this style of things by hand made me sick. Unfortunately sometimes there is no choise.
And honnestly a loop that just roll shifts two values and repeat 8 times isn't really much a waste of time. In some other condition, such as loops that could repeat much more than 8 times and/or do much more work than just shifting two values, the table solution would be worthy.

Posted: Mon Aug 21, 2006 2:20 pm
by Disch
I agree 100% with blargg. In the grand scheme of things, 256 bytes is nothing. Especially when you consider the time it will save in-game.

Bregalad wrote:And honnestly a loop that just roll shifts two values and repeat 8 times isn't really much a waste of time.
For just one tile, I'd agree. But if this is going to be a common occurance, that is a major waste of processing time.

Q's loop takes 100 cycles
tepples' unrolled loop takes 59 cycles

using a lookup table with:

Code: Select all

tax
lda table,X
takes a grand total of 6 cycles.

This is not peanuts we're talking about here... the speed difference is astounding. When you consider this process will have to be done SIXTEEN TIMES for a single tile to be flipped... we're talking major slowdowns. using Q's loop and dedicating your entire frame to flipping tiles, you'll be able to convert maybe 18-19 tiles. Whereas using a lookup table... 19 tiles can be converted in roughtly 2000 cycles (~17.5 scanlines)

Posted: Mon Aug 21, 2006 2:53 pm
by Celius
This is true. It really does make a huge difference in speeds. I think the lookup table is a good idea if you need it to be fast, but if you are leaving the screen black for about 1 second (60 frames), I think you can handle saving the space. When you're making an RPG, you MUST conserve as much space as possible. Especially when you have 350+ maps, a huge world map, a ton of graphical data, and all this text and all these events going on in your game, you really don't want to dedicate 256 bytes to anything that could possibly be shrinked down to about 20 or so. Wasting space is not a good thing to do. I say go with the table if you're making a game that moves fast, but if you're making an RPG, dude, save the space, and go with the small routine.

Posted: Mon Aug 21, 2006 3:33 pm
by blargg
If you're trying to conserve space in a program, optimizing tile flipping like this would be one of the last places to look (unless it's an entry to the 2K/4K competition or something). The savings is at most 256 bytes, which wouldn't be worth the time. Much better places are where the savings is multiplied by some large amount of data, like levels, graphics, or text. These are the areas where big savings can often be made as the data are rarely in the most compact form possible. Another possibility is an interpreted scripting language in place of some portions of machine code for games like Maniac Mansion where there's a significant amount of game logic that doesn't have to run continuously.

The general point is that micro-optimization can obscure the big picture and opportunities for major optimization. The most productive areas for optimization are specific to each program. A particular program might arrange data in a way that allows for a major optimization that wouldn't be relevant in any other program.

Posted: Mon Aug 21, 2006 9:38 pm
by Memblers
If you're gonna bit-reverse a whole bunch of tiles, you could also generate that table in RAM before doing it. :) Though I guess if you have time to do that, you're in no hurry anyways (or you have RAM to spare, if you keep it persistant).

But yeah, like blargg said, optimize where it counts. Don't sweat the little stuff. Code size (including small-ish lookup tables) is usually nothing compared to data. With 512kB, give yourself a 32kB bank for code, and probably the rest of the ROM for data.