Map data and compression for scrollable map?
-
doynax
- Posts: 162
- Joined: Mon Nov 22, 2004 3:24 pm
- Location: Sweden
-
Bregalad
- Posts: 8182
- Joined: Fri Nov 12, 2004 2:49 pm
- Location: Divonne-les-bains, France
doynax wrote: On the other hand with Huffman coding you can get random access without much trouble, I've even exploited this fact in the past to parallelize an x86 decoder. Especially if you can afford to store the raw trees directly in ROM.
It really doesn't allow random access at all. We must not be talking about the same thing, I guess. Huffman uses fewer bits for common elements, but more bits for uncommon elements. This should work well for maps too, since some elements are more common than others. I'm not saying Huffman is better than the LZ series, just that I personally have less trouble understanding and implementing it. In the couple of books and tutorials I've seen around, LZ seemed like such a headache. It would be all right to make a decoder, but horrible to do an encoder.
If someone could prove me wrong I'd be the happiest, as Huffman is really annoying with its non-byte-aligned data.
(I use neither in my project right now anyway; I'll probably compress the text a little if I have enough of it.)
Useless, lumbering half-wits don't scare us.
-
tepples
- Posts: 22994
- Joined: Sun Sep 19, 2004 11:12 pm
- Location: NE Indiana, USA (NTSC)
Bregalad wrote: In the couple of books and tutorials I've seen around, LZ seemed like such a headache. It would be all right to make a decoder, but horrible to do an encoder.
That's why you borrow someone else's LZ encoder and change the output format to suit your decoder design. That's probably what Nintendo did when adapting an LZSS implementation identical to the one from the Allegro library for the GBA BIOS.
-
tokumaru
- Posts: 12669
- Joined: Sat Feb 12, 2005 9:43 pm
- Location: Rio de Janeiro - Brazil
Bregalad wrote: Is it much worse than wasting bytes to say "0, 1, 2, 3, 4, 5" etc. up to 256 like you said you liked to do?
That's different, because that's a trade-off: I chose to sacrifice space in order to improve performance. In the case of compression you'd be sacrificing space in order to... well, save space! It's not really a trade-off! =)
And what's worse about the Huffman tables is that you'd most likely need different ones for different sets of data, because the same codes simply won't work well for all pieces of data.
Bregalad wrote: Well, I don't remember the algorithms very well. I just remember that you replace a string of bytes with a pointer to a previous occurrence of the same string,
That's LZ77. LZ78 doesn't point to previously output data; instead, it builds an actual dictionary as the data is processed. Each entry is made up of another entry plus a new character (which makes the dictionary very easy to represent in RAM). The dictionary is maintained in such a way that the compressor and the decompressor build the same dictionary as they work through the data. The algorithm is pretty interesting, you should look it up.
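For illustration only, here is a rough C sketch of that dictionary mechanism (not any specific format from this thread; the token layout, limits and names are assumptions):

Code: Select all
/* Minimal LZ78-style decoder sketch. Each compressed token is a pair: an index
   into the dictionary built so far (0 = the empty string) plus one new
   character. The decoder outputs the referenced entry followed by the
   character, then adds that whole string as a new entry, so the compressor and
   decompressor grow identical dictionaries without ever storing one in the
   file. */
#include <stdio.h>
#include <string.h>

#define MAX_ENTRIES 4096
#define MAX_LEN      256

static char   dict[MAX_ENTRIES][MAX_LEN];
static size_t dict_len[MAX_ENTRIES];
static size_t dict_count = 1;            /* entry 0 is the empty string */

static void decode_token(unsigned index, char c, FILE *out)
{
    /* emit the existing entry, then the new character */
    fwrite(dict[index], 1, dict_len[index], out);
    fputc(c, out);

    /* new entry = referenced entry + the new character */
    if (dict_count < MAX_ENTRIES && dict_len[index] + 1 < MAX_LEN) {
        memcpy(dict[dict_count], dict[index], dict_len[index]);
        dict[dict_count][dict_len[index]] = c;
        dict_len[dict_count] = dict_len[index] + 1;
        ++dict_count;
    }
}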
Bregalad wrote: which isn't going to happen very often on a map.
Of course it happens often. Maps have many repeating patterns, as well as lots of empty areas (filled with the same tile), both vertically and horizontally. As has been said, LZ77 even has built-in RLE, so you can be sure it will perform at least as well as RLE, but often much better. RLE is terrible for 2D data, but LZ can easily copy data from the previous row, taking some advantage of repeating vertical patterns.
I feel that Huffman is pretty bad when it comes to maps, because there is a lot of spatial redundancy that is simply not used. Huffman cares about how many times each value is used, but doesn't give a damn about where it is used, totally ignoring spatial redundancy. Maybe a hybrid of Huffman and LZ would work well. And I don't mean one algorithm applied on top of the other, like is commonly seen, but actually using variable-sized codes along with repeat counts and pointers.
Bregalad wrote: In the couple of books and tutorials I've seen around, LZ seemed like such a headache. It would be all right to make a decoder, but horrible to do an encoder.
An encoder is not hard to make at all if you don't care about speed. And why would you? I see nothing wrong with letting a completely unoptimized encoder work for a couple of minutes (although even that sounds like too much) while you surf the web or something, waiting for it to finish. What matters is decompression, because that's what will be in your final product.
-
doynax
- Posts: 162
- Joined: Mon Nov 22, 2004 3:24 pm
- Location: Sweden
Bregalad wrote:
doynax wrote: On the other hand with Huffman coding you can get random access without much trouble, I've even exploited this fact in the past to parallelize an x86 decoder. Especially if you can afford to store the raw trees directly in ROM.
It really doesn't allow random access at all. We must not be talking about the same thing, I guess.
What I meant was that you can start decompressing directly from any random byte in a Huffman stream as long as you've got a bit-pointer to it. The bytes are completely independent of each other, as opposed to LZ coding, where any given piece of data depends on everything coded before it.
Bregalad wrote: I'm not saying Huffman is better than the LZ series, just that I personally have less trouble understanding and implementing it. In the couple of books and tutorials I've seen around, LZ seemed like such a headache. It would be all right to make a decoder, but horrible to do an encoder.
To be honest I had quite a hard time wrapping my head around Huffman coding myself; the whole tree-building thing and why it works is just plain weird ;)
Basic LZ77 coding is straightforward by comparison. The idea is to go through the file from start to end and, at each byte, search for the longest match (behind the cursor) to the current data. With a brute-force search you'd start at each byte before the cursor and test just how many bytes match, then keep track of the longest one found. If nothing is found, or if the longest match is still too short to save any space, then write a literal byte instead.
You've undoubtedly seen some pseudo-code already in the aforementioned tutorials but in practice an implementation might look kind of like this:
Code: Select all
#include <stddef.h> /* size_t */

/* MIN_MATCH_LEN and the two emit functions are assumed to be defined elsewhere
   to suit whatever output format the decoder expects. */
void emit_literal(char c);
void emit_match(size_t offset, size_t len);

void lz(const char *begin, const char *end) {
    for(const char *cursor = begin; cursor != end; ) {
        const char *match = NULL;
        size_t match_len = 0;
        /* brute-force search: try every starting position before the cursor */
        for(const char *seek = cursor; seek-- != begin; ) {
            /* count how many bytes match the data at the cursor */
            size_t i;
            for(i = 0; &cursor[i] != end; ++i) {
                if(seek[i] != cursor[i]) {
                    break;
                }
            }
            if(i > match_len) {
                match = seek;
                match_len = i;
            }
        }
        if(match_len < MIN_MATCH_LEN) {
            emit_literal(*cursor++);
        } else {
            emit_match(cursor - match, match_len);
            cursor += match_len;
        }
    }
}

To achieve better compression than that, most coders resort to bit-packing things such that nearer match offsets and shorter run lengths use fewer bits. Formats like ZIP and GZIP even apply Huffman coding on top of them.
You may think that a triple loop like this would be unbearably slow, but a naive search like this is actually what most native C64 packers use. As long as you put some limit on the match lengths and offsets, and don't try to compress a megabyte of zeros, anyway.
In the end, even with a fair bit of bit-twiddling, some simple hashing to speed up the search, and so-called optimal parsing to get slightly better matches, my own encoder (the one I linked to above) only weighs in at 700 lines or so. Believe me, it's not exactly rocket science.
Bregalad wrote: If someone could prove me wrong I'd be the happiest, as Huffman is really annoying with its non-byte-aligned data.
Most people on this forum probably know this old trick already, but I figured I'd repeat it once more since it's immensely useful when dealing with bit-streams. The idea is to set up an 8-bit shift register in memory from which we can fetch bits into carry with ASL/LSR instructions. The clever part is that if we shift a one into the least-significant bit whenever the register is refilled, we can detect when the register is empty simply by checking the zero flag after a shift; that is, it'll only be zero once the one we forced in at the beginning has been shifted all the way out.
Code: Select all
bitbuf = $00            ;zero-page shift register, initialized to $01

getbit:
        asl bitbuf      ;next bit into carry
        bne +           ;register not empty yet? then we're done
        jsr getbyte     ;refill: fetch the next stream byte into A
        sec
        rol             ;top bit into carry, force a 1 in at the bottom
        sta bitbuf
+       rts

(Writing can be handled similarly of course, that is by rotating carry bits into a shift register initialized with the least-significant bit set and flushing the byte when carry is set afterwards, but I doubt whether writing is of much use on the NES.)
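For what it's worth, here is the same write-side trick modeled in C rather than 6502; this is only an illustrative sketch, and write_byte() is a hypothetical output routine:

Code: Select all
/* Write-side counterpart of the getbit trick: a shift register armed with a
   sentinel 1 in its least-significant bit. Data bits are shifted in from the
   bottom; once the sentinel is shifted out of the top the register holds
   eight data bits, so it is flushed and re-armed. */
void write_byte(unsigned char b);            /* hypothetical output routine */

static unsigned char out_buf = 0x01;         /* sentinel in the LSB */

void put_bit(int bit)
{
    int sentinel_out = out_buf & 0x80;       /* will this shift push the sentinel out? */
    out_buf = (unsigned char)((out_buf << 1) | (bit ? 1u : 0u));
    if (sentinel_out) {                      /* register now holds 8 data bits */
        write_byte(out_buf);
        out_buf = 0x01;                      /* re-arm the sentinel */
    }
}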
-
tomaitheous
- Posts: 592
- Joined: Thu Aug 28, 2008 1:17 am
LZ77 isn't as efficient as LZSS. Having to encode uncompressed data with run lengths can really bulk up smaller runs of data. I like the LZSS method of one extra bit per element of uncompressed data better. I don't think I've ever actually used the original spec of LZ77.
Not so much for the NES, but the SNES, Genesis, Turbo Duo, and others that have WORD-sized tilemap elements really benefit from a variable element size RLE system: one control code for BYTE RLE and another for WORD RLE.
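A rough C sketch of what such a decoder could look like; the control-byte layout here (two mode bits plus a six-bit count) is purely an assumption for illustration, not a known format:

Code: Select all
/* Variable-element-size RLE decoder sketch. The top two bits of each control
   byte select the mode and the low six bits hold a count:
   00 = literal run, 01 = byte RLE, 10 = word RLE (16-bit tilemap entry),
   11 = end of stream. */
#include <stddef.h>

size_t rle_decode(const unsigned char *src, unsigned char *dst)
{
    unsigned char *out = dst;
    for (;;) {
        unsigned ctrl  = *src++;
        unsigned count = (ctrl & 0x3F) + 1;
        switch (ctrl >> 6) {
        case 0:                                   /* literal run */
            while (count--) *out++ = *src++;
            break;
        case 1: {                                 /* byte RLE */
            unsigned char b = *src++;
            while (count--) *out++ = b;
            break;
        }
        case 2: {                                 /* word RLE */
            unsigned char lo = *src++, hi = *src++;
            while (count--) { *out++ = lo; *out++ = hi; }
            break;
        }
        default:                                  /* end of stream */
            return (size_t)(out - dst);
        }
    }
}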
-
doynax
- Posts: 162
- Joined: Mon Nov 22, 2004 3:24 pm
- Location: Sweden
tomaitheous wrote: LZ77 isn't as efficient as LZSS. Having to encode uncompressed data with run lengths can really bulk up smaller runs of data. I like the LZSS method of one extra bit per element of uncompressed data better. I don't think I've ever actually used the original spec of LZ77.
Whether run lengths for literals make sense or not depends on precisely how you encode things. For something like ZIP/GZIP DEFLATE, where literals and match lengths share a single Huffman alphabet, it seems natural not to bother with run lengths.
In my project the run lengths are written using Elias gamma coding, and the LZSS-style indicator bit is skipped after a literal run, since it wouldn't make sense to have two consecutive literal sequences. The end result is that I break even or save space for anything except two-byte literals, and save quite a bit of space for longer sequences (not to mention speeding up the decoder).
For the record, the match offsets are encoded as a two-bit prefix indexing a table of possible offset widths, followed by the actual offset data of course. Also, two-byte matches are special-cased to use a shorter offset table, since the largest offset widths wouldn't actually save any space otherwise.
(By the way, this encoding is straight from ByteBoozer, a C64 packer, though I've shuffled the bits around and tweaked things to allow me to write an optimized decoder.)
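To make the gamma-coded lengths concrete, here is a minimal C sketch of an Elias gamma writer. It only shows the general code, not ByteBoozer's exact bit layout, and put_bit() stands in for whatever bit-writer you already have (such as the hypothetical one sketched earlier):

Code: Select all
/* Elias gamma coding of n >= 1: emit one zero for every bit of n below its
   leading 1, then n itself MSB-first. Small (common) run lengths therefore
   cost very few bits while large ones remain representable. */
void put_bit(int bit);   /* assumed bit-writer */

void put_gamma(unsigned n)
{
    int bits = 0;

    /* position of the leading 1 bit of n */
    for (unsigned tmp = n; tmp > 1; tmp >>= 1)
        ++bits;

    /* prefix of 'bits' zeros */
    for (int i = 0; i < bits; ++i)
        put_bit(0);

    /* then n itself, most significant bit first (it always starts with a 1) */
    for (int i = bits; i >= 0; --i)
        put_bit((n >> i) & 1);
}

For example, put_gamma(1) writes the single bit 1, while put_gamma(5) writes 00101.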
-
tepples
- Posts: 22994
- Joined: Sun Sep 19, 2004 11:12 pm
- Location: NE Indiana, USA (NTSC)
doynax wrote: In my project the run lengths are written using Elias gamma coding, and the LZSS-style indicator bit is skipped after a literal run, since it wouldn't make sense to have two consecutive literal sequences.
Classic LZSS (as seen in Allegro and the GBA) does the same thing; it just encodes the lengths of literal runs in unary, a special case of Golomb coding ;-)
-
doynax
- Posts: 162
- Joined: Mon Nov 22, 2004 3:24 pm
- Location: Sweden
tepples wrote: Classic LZSS (as seen in Allegro and the GBA) does the same thing; it just encodes the lengths of literal runs in unary, a special case of Golomb coding ;-)
Wouldn't that be equivalent to simply writing a single bit to indicate whether a match or a literal byte follows?
-
tepples
- Posts: 22994
- Joined: Sun Sep 19, 2004 11:12 pm
- Location: NE Indiana, USA (NTSC)
doynax wrote: Wouldn't that be equivalent to simply writing a single bit to indicate whether a match or a literal byte follows?
Yes. I just presented an alternate interpretation that shows exactly how your method differs from the most common LZSS implementation. Gamma and unary codes just imply different assumed distributions of literal run lengths. A universal code, such as Elias gamma, Fibonacci, or ternary, implies a power law, while a Golomb code such as unary implies an exponential distribution. Would it be worth it to try compressing some real data with LZSS and taking statistics on the distribution of literal run lengths?
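For anyone who wants to try, here is a toy C sketch along those lines: feed it the literal run lengths an LZSS pass produced and it totals the bits each length coding would spend on them. The function names are made up for illustration; the actual distribution of your data decides which coding wins.

Code: Select all
/* Compare the cost of unary versus Elias gamma coding for a set of literal
   run lengths (all lengths assumed >= 1). Unary spends n bits on a run of n
   (one flag bit per literal); gamma spends 2*floor(log2(n)) + 1 bits. */
#include <stdio.h>

static unsigned gamma_bits(unsigned n)
{
    unsigned bits = 0;
    for (unsigned tmp = n; tmp > 1; tmp >>= 1)
        ++bits;
    return 2 * bits + 1;
}

void compare_run_codes(const unsigned *run_len, size_t count)
{
    unsigned long unary_total = 0, gamma_total = 0;
    for (size_t i = 0; i < count; ++i) {
        unary_total += run_len[i];            /* one flag bit per literal */
        gamma_total += gamma_bits(run_len[i]);
    }
    printf("unary: %lu bits, gamma: %lu bits\n", unary_total, gamma_total);
}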
-
doynax
- Posts: 162
- Joined: Mon Nov 22, 2004 3:24 pm
- Location: Sweden
tepples wrote: Would it be worth it to try compressing some real data with LZSS and taking statistics on the distribution of literal run lengths?
I tried this when trying out different encodings for my packer. I didn't keep the statistics, but gamma coding was a clear win on the files I tested, not that it made much of a difference either way (perhaps a single percent or so).
-
doynax
- Posts: 162
- Joined: Mon Nov 22, 2004 3:24 pm
- Location: Sweden
So I decided to hack the independent literals back into the encoder (further proof that I've got way too much free time on my hands). Anyway, it seems like I did a poor job of testing things originally.
It turns out that the literal runs expand every single file in the Calgary Corpus, except for 'obj1', on which they save four bytes, and 'pic', which I didn't bother to wait for. On the other hand they consistently save space (typically between 0.5% and 2%) on my old 8-bit projects, not to mention the half dozen NES ROMs I checked.
So now I don't know what to think anymore. It would be awesome if I could strip out this feature as it complicates things quite a bit and bloats the decruncher. Still, that extra kilobyte or two might come in handy one day, and it's nice to be able to cope with incompressible data gracefully. It couldn't hurt to know how it affects the decoder's performance either.
http://doynax.googlepages.com/lz_run_test.zip
-
Bregalad
- Posts: 8182
- Joined: Fri Nov 12, 2004 2:49 pm
- Location: Divonne-les-bains, France
Oh my, all of that is such a headache!!
After all you're right, Huffman can allow semi-random access: if you keep a pointer (with bit precision) you can continue reading data from there. But it's still sequential access, because you need to read all of the preceding bytes at least once in order to build that pointer.
I'd definitely look into the LZ series again, but they seem like such a headache! If you use one more bit to say whether the data is a literal or a string, and 80% of the data is literals (which is likely to be your case unless you intentionally repeat things a lot in the original), then I doubt anything gets compressed. You'd be better off reserving a special value for a string. But then how do you know what the string is referring to?
Useless, lumbering half-wits don't scare us.
-
Dwedit
- Posts: 5259
- Joined: Fri Nov 19, 2004 7:35 pm
Apack takes 7 bits to encode any single byte which has appeared up to 15 bytes ago (or a literal zero). Of course, you can optimize a compression system to reduce the number of decision bits before you take 4 bits of near history.
But I think that explicit dictionary compression is a good idea since you get random access that way.
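As an illustration of that kind of random access, here is a small C sketch of an explicit-dictionary (metatile-style) map lookup; the block size and array names are assumptions for the example, not any particular game's layout:

Code: Select all
/* The map is stored as one index per 4x4-tile block into an explicit
   dictionary of blocks, so any tile can be read directly without
   decompressing anything that came before it. */
#include <stdint.h>

#define BLOCK_W       4
#define BLOCK_H       4
#define MAP_BLOCKS_W 16   /* map width measured in blocks */

extern const uint8_t block_dict[][BLOCK_H][BLOCK_W];  /* the dictionary */
extern const uint8_t map_blocks[];                    /* one index per block */

uint8_t read_tile(unsigned x, unsigned y)
{
    unsigned block = map_blocks[(y / BLOCK_H) * MAP_BLOCKS_W + (x / BLOCK_W)];
    return block_dict[block][y % BLOCK_H][x % BLOCK_W];
}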
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!
-
doynax
- Posts: 162
- Joined: Mon Nov 22, 2004 3:24 pm
- Location: Sweden
Bregalad wrote: Oh my, all of that is such a headache!!
No one claimed compression was easy. Well, I might have at one time or another, but I don't think I ever truly meant it.
Bregalad wrote: After all you're right, Huffman can allow semi-random access: if you keep a pointer (with bit precision) you can continue reading data from there. But it's still sequential access, because you need to read all of the preceding bytes at least once in order to build that pointer.
Random access might have been the wrong word; stateless access might be a better description.
You'd save a set of "bookmarks" at regular intervals ahead of time, from which decompression can be resumed. Perhaps one per row as in the RLE schemes mentioned earlier. Note that you don't necessarily have to save these pointers in the ROM itself. It's entirely possible to scan through the whole map when loading a zone and save the proper pointers that way.
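As a rough illustration (in C, though the real thing would be 6502), this is what building those bookmarks at load time could look like; decode_symbol(), MAP_WIDTH and MAP_HEIGHT are placeholder names, not anything from an actual project:

Code: Select all
/* Decode the whole zone once when it is loaded, recording the bit position at
   which every map row starts so that any row can later be re-decoded on its
   own without touching the rest of the stream. */
#include <stdint.h>

#define MAP_WIDTH  64
#define MAP_HEIGHT 48

/* hypothetical Huffman decoder: returns one tile and advances *bit_pos */
unsigned decode_symbol(const unsigned char *stream, uint32_t *bit_pos);

uint32_t row_bookmark[MAP_HEIGHT];   /* bit offset of each row's first symbol */

void build_bookmarks(const unsigned char *stream)
{
    uint32_t bit_pos = 0;
    for (int row = 0; row < MAP_HEIGHT; ++row) {
        row_bookmark[row] = bit_pos;
        for (int col = 0; col < MAP_WIDTH; ++col)
            (void)decode_symbol(stream, &bit_pos);   /* only the advancing matters here */
    }
}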
Bregalad wrote: I'd definitely look into the LZ series again, but they seem like such a headache! If you use one more bit to say whether the data is a literal or a string, and 80% of the data is literals (which is likely to be your case unless you intentionally repeat things a lot in the original), then I doubt anything gets compressed. You'd be better off reserving a special value for a string. But then how do you know what the string is referring to?
That depends on the data. Certain file types are simply not compressible with lossless dictionary coders, such as previously compressed data, sampled audio or photographic images. On the other hand, the types of data commonly used in 8-bit projects (like machine code, pixelled graphics, synthesized audio or text) tend to compress quite well.
For instance my "game" is currently reduced by nearly half (from 13,428 down to 6,977 bytes), of which 63% are matches (9,798 match vs. 3,628 literal bytes). YMMV, so feel free to try out this LZ coder or another one on your own data before deciding anything.
I'm not quite clear on what you meant by "strings." A match, that is a backwards reference as opposed to literal data, is stored as a length together with an offset pointing backwards in the file, while literals are stored as a length together with the actual literal data. And before each match/literal there's a bit indicating which one to expect (technically only after matches, actually). If I instead switch to only allowing single literals rather than runs, I still only lose 70 bytes or so, and apparently many larger files save space with such a coding.