My attempt at RLE compression

Discussion of hardware and software development for Super NES and Super Famicom.

Moderator: Moderators

Forum rules
  • For making cartridges of your Super NES games, see Reproduction.
User avatar
nicklausw
Posts: 376
Joined: Sat Jan 03, 2015 5:58 pm
Location: ...
Contact:

My attempt at RLE compression

Post by nicklausw »

I wanted a reason to practice my python skills, and so I've tried to make my own RLE compression thingy for the SNES.

The way the format works is like this. Say you have these bytes:

Code: Select all

00 ff 00 00 ff ff 00 00 00 00 00 00 ff ff ff ff ff ff
The new file is made up of pairs of two: one byte for how many of the second byte should be copied.
An FF is appended at the end to mark where the routine should stop copying, so the output would be:

Code: Select all

01 00 01 ff 02 00 02 ff 06 00 06 ff ff
The .py file does this perfectly:

Code: Select all

#!/usr/bin/env python3
# basic rle compression program
# by nicklausw

# this format works as follows.
# one byte is read, and that is
# how many of the next byte to copy over. repeat.
# when the first byte is $ff, the copy is done.

import sys
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("in_file", help="the chr data to be rle'd")
parser.add_argument("out_file", help="the rle'd output")
args = parser.parse_args()

f_in = open(args.in_file, 'rb')
f_out = open(args.out_file, 'wb')

# python doesn't need variable initialization i think,
# but for clarity's sake.
byte_n = 0 # byte
byte_c = 1 # byte count 

for c in f_in.read():
  if c != byte_n or byte_c == 0xFE:
    f_out.write(bytes([byte_c]))
    f_out.write(bytes([byte_n]))
    byte_c = 1
    byte_n = c
  else:
    byte_c += 1

# now catch the last time around
f_out.write(bytes([byte_c]))
f_out.write(bytes([byte_n]))

# the end mark
f_out.write(bytes([0xFF]))

f_in.close()
f_out.close()
The problem is the routine to copy the bytes to vram:

Code: Select all

.proc rle_copy_ppu
  setxy16
  seta8
  ldy #$00

 loop: 
  seta8
  lda (rle_cp_ram), y
  cpa #$ff
  beq done
  seta16
  and #$ff
  tax
  iny
  seta8
  lda (rle_cp_ram),y
  jsr rle_loop
  iny
  jmp loop
  
done:
  rts

rle_loop:
  seta16
  ;and #$ff
 loop2:
  sta PPUDATA
  dex
  bne loop2
  rts
.endproc
It's called like this:

Code: Select all

; copy font
  setaxy16
  stz PPUADDR  ; we will start video memory at $0000
  lda #font
  sta rle_cp_ram
  jsr rle_copy_ppu
Now, this is the expected output:
Image

And this is the actual output:
Image

I'll give more info on my code if need be, but can anyone see anything wrong that might cause an error in things being copied over?
User avatar
bazz
Posts: 476
Joined: Fri Sep 02, 2011 8:34 pm
Contact:

Re: My attempt at RLE compression

Post by bazz »

I don't have time to look in-depth at your code, but I noticed that you are only using a byte for the count. So, if your original data has more than 254 repetitions (because you are also using 255 as a terminator), then you will need to accommodate this. Rather than make the count a 16-bit value and consistently waste space, you can just use a "divide and modulus" strategy to make combination entries to create the proper total.

EDIT: (Actually, your code is already handling this just fine!)

EDIT: supplying the actual data you are using for the RLE (2bpp font) would be helpful too.
Last edited by bazz on Sat Apr 09, 2016 7:46 pm, edited 1 time in total.
SNES Tutorials (WLA DX)
SNES Memory Mapping Tutorial (Universal / LoROM) -- By Universal I introduce how memory mapping works, rather than just provide a LoROM map.
SNES Tracker (WIP) - Music/SFX composition tool / SPC Debugger
User avatar
tokumaru
Posts: 12106
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: My attempt at RLE compression

Post by tokumaru »

That's an incredibly wasteful variation of RLE, seeing as every non repeated byte gets a "1" inserted in front of it! I wouldn't be surprised if this caused expansion instead of compression in a lot of cases!

The most common solution is to specify runs of uncompressed bytes, just like you represent runs of compressed bytes. You can use bit 7 to specify whether a run is compressed or uncompressed, and bits 0-6 to specify the length, from 1 to 128. Or you can assume that compressed and uncompressed runs will alternate, and use all 8 bits for the length. In this case, you'd need to allow runs of 0 length, which you'd use between two runs of the same type.

Another option is to use auto-triggered RLE, where a length is only specified after 2 (or 3) repeated bytes show up, and more are assumed to come. This assumption might be wrong, in which case the length will be 0.
User avatar
nicklausw
Posts: 376
Joined: Sat Jan 03, 2015 5:58 pm
Location: ...
Contact:

Re: My attempt at RLE compression

Post by nicklausw »

Bazz: good point, but it doesn't affect the file I'm RLE-ing.

tokumaru: thanks for the suggestion, but if I can't even get this basic RLE to work then why over-complicate it more?

Anyway, I fixed the python file, the first byte was copied either once too many times if it's 0 or once too few if not. The bug is still present.

Code: Select all

#!/usr/bin/env python3
# basic rle compression program
# by nicklausw
# public domain.

# this format works as follows.
# one byte is read, and that is
# how many of the next byte to copy over. repeat.
# when the first byte is $ff, the copy is done.

import sys
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("in_file", help="the chr data to be rle'd")
parser.add_argument("out_file", help="the rle'd output")
args = parser.parse_args()

f_in = open(args.in_file, 'rb')
f_out = open(args.out_file, 'wb')

# python doesn't need variable initialization i think,
# but for clarity's sake.
byte_n = int.from_bytes(f_in.read(1), byteorder='little') # byte
byte_c = 0 # byte count

f_in.seek(0) 

for c in f_in.read():
  if c != byte_n or byte_c == 0xFE:
    f_out.write(bytes([byte_c]))
    f_out.write(bytes([byte_n]))
    byte_c = 1
    byte_n = c
  else:
    byte_c += 1

# now catch the last time around
f_out.write(bytes([byte_c]))
f_out.write(bytes([byte_n]))

# the end mark
f_out.write(bytes([0xFF]))

f_in.close()
f_out.close()
User avatar
bazz
Posts: 476
Joined: Fri Sep 02, 2011 8:34 pm
Contact:

Re: My attempt at RLE compression

Post by bazz »

tokumaru wrote:That's an incredibly wasteful variation of RLE, seeing as every non repeated byte gets a "1" inserted in front of it! I wouldn't be surprised if this caused expansion instead of compression in a lot of cases!
It can only be considered wasteful because people here are generalizing (OP is not designing for any particular dataset or hasn't communicated that) -- and it's been on my mind how I believe a lot of concepts are discussed in this forum "generally" -- when we all know everything changes when a specific project provides actual constraints.

Now is a good time to mention "use-case." I bring this up because there's a difference between designing a general RLE or an RLE specific to a dataset. This same notion applies to almost any other part of development.

Personally, I am going to use RLE for my joypad recordings that run my game's demo mode, and it is likely to ALWAYS have repeats -- in which case, using a bit to specify uncompressed or compressed data would actually be wasteful (waste half a byte per entry). So always consider your use-case if you are designing for some specific dataset.

On the other hand, I'm not sure if there are better compression algos for such a dataset.

Let's also consider the size of the decompression routine in ROM compared to the volume of compressed data we will be using, and ensure there is still a sizable benefit.
SNES Tutorials (WLA DX)
SNES Memory Mapping Tutorial (Universal / LoROM) -- By Universal I introduce how memory mapping works, rather than just provide a LoROM map.
SNES Tracker (WIP) - Music/SFX composition tool / SPC Debugger
User avatar
dougeff
Posts: 2875
Joined: Fri May 08, 2015 7:17 pm
Location: DIGDUG
Contact:

Re: My attempt at RLE compression

Post by dougeff »

Edit...somehow I missed that you were talking about graphic compression for SNES. Never mind.
nesdoug.com -- blog/tutorial on programming for the NES
User avatar
bazz
Posts: 476
Joined: Fri Sep 02, 2011 8:34 pm
Contact:

Re: My attempt at RLE compression

Post by bazz »

tokumaru: thanks for the suggestion, but if I can't even get this basic RLE to work then why over-complicate it more?
I am proud of you.

I am going to take a closer look at it now
SNES Tutorials (WLA DX)
SNES Memory Mapping Tutorial (Universal / LoROM) -- By Universal I introduce how memory mapping works, rather than just provide a LoROM map.
SNES Tracker (WIP) - Music/SFX composition tool / SPC Debugger
User avatar
bazz
Posts: 476
Joined: Fri Sep 02, 2011 8:34 pm
Contact:

Re: My attempt at RLE compression

Post by bazz »

no reason to make all of 8 / 16 / 8 bits changes to the accumulator.. just clear hi-8 before start loop

Next -- the problem appears to be in the vram write loop -- you only load the accumulator with a byte, but you are writing a whole word to VRAM, with the high byte as zero.. Is this intentional???? I thought you wanted to copy the bytes directly to VRAM as they are?? But in this case, you are only using them as low bytes of a word.... Since I'm not sure... I'm not going to finish any sort of solution.. But at least have done some refactoring below. (mainly removing all the rep/sep 's) (This could be refactored even further if you don't need the high byte zero at all).

I'm assuming PPUDATA is the VRAM, which is a $2118/$2119 IIRC? Don't forget to have properly configured the increment setting ($2115) (I won't add this in my code.

---- my version (incomplete)

Code: Select all

.proc rle_copy_ppu
  setxy16
  seta8
  ldy #$00
  lda #0	; clear Accum. hi-byte
  xba

 loop: 
  lda (rle_cp_ram), y
  cpa #$ff
  beq done
  tax
  iny
  lda (rle_cp_ram),y
  jsr rle_loop
  iny
  bra loop
  
done:
  rts

; IN: X = count
;      A = byte
rle_loop:
; INCOMPLETE / WILL NOT WORK (for above mentioned reasons in forum post)
; Please finish reading forum post and reply so we can work out your needs
  sta PPUDATA	; this is only writes the low byte.
  dex
  bne rle_loop
  rts
.endproc
-----
So now, let's talk about a solution if you do indeed want to write bytes directly to VRAM (and not just lobytes of a word with the high byte as zero, as you started with). we've identified how it's an issue since writing to VRAM requires maintaining an index of whether we are currently writing to the
low byte or high byte of the VRAM register.. and also, if we do the last transfer, ending on the low byte, we'll need to manually write an empty high byte or (some other solution) to get it transferred.

Because of this added difficulty -- well, you can figure it out if you'd like.. Shouldn't be too hard -- but also consider an alternative of decompressing to RAM and DMA'ing -- once again,
it's just an alternative -- and with no project constraints it doesn't matter (go with VRAM direct, it sounds more challenging :)
SNES Tutorials (WLA DX)
SNES Memory Mapping Tutorial (Universal / LoROM) -- By Universal I introduce how memory mapping works, rather than just provide a LoROM map.
SNES Tracker (WIP) - Music/SFX composition tool / SPC Debugger
User avatar
nicklausw
Posts: 376
Joined: Sat Jan 03, 2015 5:58 pm
Location: ...
Contact:

Re: My attempt at RLE compression

Post by nicklausw »

Is there enough RAM to store 8 KB of data for OAM? Because I'm scared of having to figure out the words and stuff for a direct VRAM copy.

Or is there a way for PPUDATA to just take bytes?
User avatar
bazz
Posts: 476
Joined: Fri Sep 02, 2011 8:34 pm
Contact:

Re: My attempt at RLE compression

Post by bazz »

> Is there enough RAM to store 8 KB of data for OAM?

s/OAM/VRAM ??? There are 64KB 128KB of SysRam

> Or is there a way for PPUDATA to just take bytes?

There is only a way to have the VRAM word address increment based on lo-byte access or hi-byte access, but there is no single byte-port to write to.

----

Let's just verify that your decompression routine works -- you can decompress into SysRAM, and use a debugger to dump the RAM to file. then compare your decompressed section (you may need to trim in a hex editor) with the original data. (I use vbindiff for this)
Last edited by bazz on Sun Apr 10, 2016 11:53 am, edited 1 time in total.
SNES Tutorials (WLA DX)
SNES Memory Mapping Tutorial (Universal / LoROM) -- By Universal I introduce how memory mapping works, rather than just provide a LoROM map.
SNES Tracker (WIP) - Music/SFX composition tool / SPC Debugger
User avatar
bazz
Posts: 476
Joined: Fri Sep 02, 2011 8:34 pm
Contact:

Re: My attempt at RLE compression

Post by bazz »

Code: Select all

byte_n = int.from_bytes(f_in.read(1), byteorder='little') # byte
byte_c = 0 # byte count

f_in.seek(0)
should be

Code: Select all

byte_n = int.from_bytes(f_in.read(1), byteorder='little') # byte
byte_c = 1 # byte count
There's no point in re-seeking back the byte you've already read.

Caution: I'm not a big python programmer, let alone python3 - so if there's other points of improvement, I can't see them X_X

I tested out your compression program -- it seems to work just fine. And, it does automatically cover > 254 repetitions :)

I used a nifty perl command to print large repetitions of characters for testing (I actually learned this from "Hacking: The Art of Exploitation" like 14 years ago xD)

perl -e 'print "A"x256' > test.txt

Have fun proving your SNES decompression works ^_^
SNES Tutorials (WLA DX)
SNES Memory Mapping Tutorial (Universal / LoROM) -- By Universal I introduce how memory mapping works, rather than just provide a LoROM map.
SNES Tracker (WIP) - Music/SFX composition tool / SPC Debugger
User avatar
tokumaru
Posts: 12106
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: My attempt at RLE compression

Post by tokumaru »

nicklausw wrote:tokumaru: thanks for the suggestion, but if I can't even get this basic RLE to work then why over-complicate it more?
You're right. I actually didn't realize you were having trouble making your current design work until I had typed my entire post, but I ended up posting it anyway. I can't help much with the SNES side of things, so please consider what I wrote as tips for the future.
User avatar
nicklausw
Posts: 376
Joined: Sat Jan 03, 2015 5:58 pm
Location: ...
Contact:

Re: My attempt at RLE compression

Post by nicklausw »

Okay, this routine copies to non-zeropage RAM properly (to $100), but there isn't enough of it available so everything screws up.

You say there's 64 KB of RAM (I said OAM on mistake)? Where? As that would fix everything.

The variables:

Code: Select all

.segment "ZEROPAGE"
  rle_cp_ram: .res 2
  rle_cp_num: .res 2
  
.segment "BSS"
  rle_cp_dat: .res 2 ; a LOT more
The fixed routine:

Code: Select all

.proc rle_copy_ram
  setxy16
  seta8
  ldy #$00
  sty rle_cp_num
  lda #0   ; clear Accum. hi-byte
  xba

 loop:
  lda (rle_cp_ram), y
  cpa #$ff
  beq done
  tax
  iny
  lda (rle_cp_ram),y
  jsr rle_loop
  iny
  bra loop
 
done:
  rts

; IN: X = count
;      A = byte
rle_loop:
  phy
  ldy rle_cp_num
rle_inter:
  sta rle_cp_dat,y
  iny
  dex
  bne rle_inter
  sty rle_cp_num
  ply
  rts
.endproc
tepples
Posts: 22345
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

Re: My attempt at RLE compression

Post by tepples »

The Super NES has 64 KiB of VRAM and 128 KiB of work RAM at $7E0000-$7FFFFF. (The bottom 8 KiB of work RAM is mirrored into the bottom 8 KiB of banks $00-$3F and $80-$BF.) One common pattern is to decompress tile data to work RAM and then schedule a DMA to copy a few KiB to VRAM during the next vertical blanking period.
User avatar
nicklausw
Posts: 376
Joined: Sat Jan 03, 2015 5:58 pm
Location: ...
Contact:

Re: My attempt at RLE compression

Post by nicklausw »

So do I have to do fancy bank switching, or do long references (lda $7fe000) work fine?
Post Reply