Map data and compression for scrollable map?

Discuss technical or other issues relating to programming the Nintendo Entertainment System, Famicom, or compatible systems. See the NESdev wiki for more information.
doynax
Posts: 162
Joined: Mon Nov 22, 2004 3:24 pm
Location: Sweden

Post by doynax »

tomaitheous wrote:I'm curious if anyone else has seen fixed "seeded" buffers for LZSS decompression, before the decompressor starts? The decompression engine knows the LZ window is primed with a series of values before hand and can/will access them from the very start. I saw it once in a game (it was a 4k ring buffer) and looked like it helped out for structured patterns and data like tilemaps.
You could use the LZ78/LZW family of algorithms with fixed dictionaries, as opposed to LZ77/LZSS sliding-window variants.
To get fine-grained random access to a large map you probably want use a share a fixed dictionary in ROM for all of the blocks anyway, as small 256-byte (or thereabouts) blocks are too small for LZ77 to function effectively.
tomaitheous wrote:Also, if anyone's interested, pucrunch sets up nicely for a ring buffer setup. The standalone decompresser is already written in 6502, you just need to make some minor modifications. Chris Covell has already adapted it for rom based projects, shouldn't be to much trouble to move it back over to 6502 from 6280. It works fairly decent for 256 and 512 byte ring buffers. If you don't know, a ring buffer setup allows you to decompress an LZ stream directly to an I/O port.
I offer my sliding-window compressor as well, if anyone is interested. It decompresses quite a bit faster than Pucrunch or Exomizer and gets compression rates somewhere between the two.
Though, again, if all you can spare is a 256-512 byte buffer then a fixed dictionary is probably a better solution.
tepples wrote:
doynax wrote:Hm. Presumably you'd pre-calculate/store a pointer to the start of each such 16x16 block for "random access." And avoid decoding partial blocks by extending the screen buffer by 16 tiles on each axis
That wouldn't be entirely necessary. You might traverse the quadtree a tile at a time, much like in tokumaru's decoder.
I don't follow. Are you talking about decoding individual tiles within the block without having to process the whole thing?
You'd need to put in pointers to the internal section to do that.
tepples wrote:
doynax wrote:And to improve locality lets store the tiles within these blocks in an alternative order, say according to a space-filling curve (note that the rearrangement can be trivially implemented with some unrolled copying code.)
RLE on a space-filling curve, such as the Hilbert curve, tends to look a lot like a quadtree.
Well, yeah, that was the general idea ;)
tomaitheous
Posts: 592
Joined: Thu Aug 28, 2008 1:17 am

Post by tomaitheous »

I offer my sliding-window compressor as well, if anyone is interested. It decompresses quite a bit faster than Pucrunch or Exomizer and gets compression rates somewhere between the two.
I'm always on the look out for different compression schemes and variants :D Got a link?
doynax
Posts: 162
Joined: Mon Nov 22, 2004 3:24 pm
Location: Sweden

Post by doynax »

tomaitheous wrote:
I offer my sliding-window compressor as well, if anyone is interested. It decompresses quite a bit faster than Pucrunch or Exomizer and gets compression rates somewhere between the two.
I'm always on the look out for different compression schemes and variants :D Got a link?
I just took another look at it and it's not trivially ROMable. I'd have to put some thought into how it first to come up with a reasonably fast scheme, sorry. Anyone thinking about doing some FDS development?

I've been thinking about writing an LZW coder instead. I don't remember seeing one for the 6502 before, presumably because it wastes memory on an RAM-based system.

http://doynax.googlepages.com/lz.zip (some credit should go to ByteBoozer which the format is based on)
User avatar
tokumaru
Posts: 12669
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru »

doynax wrote:I've been thinking about writing an LZW coder instead. I don't remember seeing one for the 6502 before, presumably because it wastes memory on an RAM-based system.
LZ78/LZW need extra memory just for the dictionary, while LZ77/LZSS make direct copies of previously decompressed data. I doubt any 6502 system has enough RAM to implement a LZW dictionary.
doynax
Posts: 162
Joined: Mon Nov 22, 2004 3:24 pm
Location: Sweden

Post by doynax »

tokumaru wrote:
doynax wrote:I've been thinking about writing an LZW coder instead. I don't remember seeing one for the 6502 before, presumably because it wastes memory on an RAM-based system.
LZ78/LZW need extra memory just for the dictionary, while LZ77/LZSS make direct copies of previously decompressed data. I doubt any 6502 system has enough RAM to implement a LZW dictionary.
The point here is that the dictionary can be stored in ROM rather than RAM. So you could unpack directly to CHR-RAM without a buffer on the NES-side. The other advantage is that you can decompress small blocks in the middle of a file without decompressing everything up to that point, or splitting it up into independent blocks, as with LZ77.

Then again I've never actually implemented such a coder, so just how effective it would be remains to be seen.
tepples
Posts: 22994
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

tokumaru wrote:LZ78/LZW need extra memory just for the dictionary
So does any game engine that uses metatiles, including yours.
tokumaru wrote:I doubt any 6502 system has enough RAM to implement a LZW dictionary.
In other words, you doubt that giffy.prg, a C64 GIF viewer, works.
User avatar
tokumaru
Posts: 12669
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru »

tepples wrote:So does any game engine that uses metatiles, including yours.
I meant RAM, lots of RAM.
tokumaru wrote:In other words, you doubt that giffy.prg, a C64 GIF viewer, works.
Yeah, maybe I was a little extreme. Once I tried to create an LZW decoder (not 6502 related at all) and I managed to make the dictionary use around 20KB. Maybe the C64 can handle that... doesn't it have 64KB of RAM?.

I haven't looked into LZW in a while, but I guess it may be possible to make something even for the NES, if the amount of data is kept within a limit or something, to control how large the dictionary can get.

Having a static dictionary in ROM is pretty weird for LZW...
tepples
Posts: 22994
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

But I do know of a couple dictionary-based codecs that use a static dictionary. One is metatiles. Another is the text codec "huffword", which first encodes each possible sequence of non-whitespace characters to an integer and then uses an entropy coder (e.g. Huffman) on those integers.
doynax
Posts: 162
Joined: Mon Nov 22, 2004 3:24 pm
Location: Sweden

Post by doynax »

tokumaru wrote:
tepples wrote:So does any game engine that uses metatiles, including yours.
Having a static dictionary in ROM is pretty weird for LZW...
Ah, it appears that LZ78/LZW still dynamically build rebuild the dictionary at runtime. I simply meant a static dictionary coder, whatever the proper term for that may be.

Whether a static dictionary method can ever hope to compete with the dynamic ones is another matter. Presumably someone, somewhere, has researched this already and I just have to figure out the magic Google keywords.
tomaitheous
Posts: 592
Joined: Thu Aug 28, 2008 1:17 am

Post by tomaitheous »

I simply meant a static dictionary coder, whatever the proper term for that may be.
I've seen similar token style setups like that. A string is replaced with a token that points to the source location / dictionary instance (ROM). The first instance in memory has a terminator at the end of it. On a normal read, the terminator is ignored. On a token call, the terminator ends the call. If setup correctly, you can get full random access and all without a need for separate buffer. Though you'll need a marker/position map to know where the offset starts for the specific strings you need to read out.

It's faster than LZ variants and doesn't necessarily require a temp buffer. Though it doesn't have the advantage of I/O to I/O decompression that LZSS variants offer.
User avatar
Bregalad
Posts: 8182
Joined: Fri Nov 12, 2004 2:49 pm
Location: Divonne-les-bains, France

Post by Bregalad »

As far I know LZ series compression are quite complex, not too hard to decode but very hard to encode. Also they're not too efficient unless you have lot of data repeating itself, which isn't likely to be the case in data that is already organized to be compact.

Huffman is more intuitive and easier to understand (for me at least), and it's not too hard to encode and decode (but it's not really easy either). All it needs to decompress is an array of pointers into ROM. If compressed data is smaller than original data by an amount greater than the array's size, then it's work implementing I guess.
Useless, lumbering half-wits don't scare us.
User avatar
tokumaru
Posts: 12669
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru »

Bregalad wrote:LZ series compression are quite complex, not too hard to decode but very hard to encode.
You should know that LZ77 is way different than LZ78, so you can're really put them in the same "series". The compression in both schemes is not necessarily hard, it only gets more complex as you try to produce optimal compressed streams. The basic implementation is pretty simple.
Also they're not too efficient unless you have lot of data repeating itself, which isn't likely to be the case in data that is already organized to be compact.
But that's the case with pretty much every compressing scheme. Most won't perform well on data that is already compressed in some other way.
Huffman is more intuitive and easier to understand (for me at least), and it's not too hard to encode and decode (but it's not really easy either).
IMO, the gains from huffman compression are very modest. This is why it's rarely used by itself, while other schemes (most dictionary-based) are (were?) often used by themselves. Huffman is usually used to get that last drop of compression you can squeeze out of data that was already compressed in some other way.
If compressed data is smaller than original data by an amount greater than the array's size, then it's work implementing I guess.
...and there's still the major overhead that is the table needed for decoding. There is an adaptive variation of huffman that doesn't need a table, but it'd probably be hard to implement in systems with little RAM.
User avatar
Bregalad
Posts: 8182
Joined: Fri Nov 12, 2004 2:49 pm
Location: Divonne-les-bains, France

Post by Bregalad »

...and there's still the major overhead that is the table needed for decoding. There is an adaptive variation of huffman that doesn't need a table, but it'd probably be hard to implement in systems with little RAM.
Well that's the array I was refering too. In a 8-bit implementation it would typically not be more than about 300-600 bytes I guess. Is it much worse than wasting bytes to say "0, 1, 2, 3, 4, 5" etc up to 256 like you said you liked to do ?
You should know that LZ77 is way different than LZ78, so you can're really put them in the same "series". The compression in both schemes is not necessarily hard, it only gets more complex as you try to produce optimal compressed streams. The basic implementation is pretty simple.
Well, I don't remember very well the algorithms. I just remember you replace a string of bytes by a pointer to a previous reference of the same string, which isn't going to happen very often on a map.

Huffman works well on text, but not very well for other data. I guess LZ series were also developped with text in mind. Data is hard to be compressed by more than one algorithm (maybe if one of them is RLE). And of course the best is to come up with your own copression sheme, but we're not all doctors in computer science.
Useless, lumbering half-wits don't scare us.
doynax
Posts: 162
Joined: Mon Nov 22, 2004 3:24 pm
Location: Sweden

Post by doynax »

Bregalad wrote:
tokumaru wrote:You should know that LZ77 is way different than LZ78, so you can're really put them in the same "series". The compression in both schemes is not necessarily hard, it only gets more complex as you try to produce optimal compressed streams. The basic implementation is pretty simple.
Well, I don't remember very well the algorithms. I just remember you replace a string of bytes by a pointer to a previous reference of the same string, which isn't going to happen very often on a map.
I've gotten very good results from LZ on the maps I've tried. For one thing LZ77 handles RLE sequences quite efficiently (with overlapping references.)
Aside from that I've never managed to write even a semi-efficient 6502 Huffman decoder, nothing close to a good LZ implementation anyway. The usual table-based optimizations just aren't applicable here. Perhaps with a small alphabet, something on the order of a few dozen tiles, a hand-rolled decoder would be fast enough.
As for complexity, well, that would depend on the kind of refinements you want I suppose. But if you check out a tutorial I think you may discover that LZ coding is less scary than you think.

On the other hand with Huffman coding can get random access without much trouble, I've even exploited this fact in the past to parallelize an x86 decoder. Especially if you can afford to store the raw trees directly in ROM.
tepples
Posts: 22994
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

Bregalad wrote:Well, I don't remember very well the algorithms. I just remember you replace a string of bytes by a pointer to a previous reference of the same string, which isn't going to happen very often on a map.
But as doynax pointed out, run-length encoding is just a special case of LZ77 with copy distances restricted to 1 word. So your horizontal run of grass tiles will show up as a single byte, then a run whose pointer is to the tile you just wrote.
Huffman works well on text, but not very well for other data.
Huffman works great for any where some integer values are significantly much more common than others. For example, in an RPG overworld map, there are more spaces set to water than road, and more spaces set to road than village. And if you dedicate one encoded value to "copy the tile north of this one", there will be a metric booty-load of that value, which represent elements in vertical runs.