While trying to cram as much data into my code as possible, I found that the code compressor could sometimes make better decisions. Here are two ways in which I believe it could be improved.
First, consider the following string:
--hi--
PICO-8 encodes it as follows:
00 2d 00 2d 14 15 00 2d 00 2d
But the second “--” could be encoded with a back reference, saving two bytes:
00 2d 00 2d 14 15 3c 04
Here 3c04 means “copy 2 chars from position -4”.
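To make the byte layout concrete, here is a minimal Python sketch of a decoder. It is not PICO-8's actual code: the function name `decode` is made up, the lookup table is taken from community descriptions of the format, and the back-reference layout (high nibble of the second byte is length minus 2, the rest is the offset) is inferred from the example above.

```python
# Assumed lookup table: byte 0x01 maps to "\n", 0x02 to " ", 0x03 to "0", and so on.
TABLE = "\n 0123456789abcdefghijklmnopqrstuvwxyz!#%(){}[]<>+=/*:;.,~_"

def decode(data: bytes) -> str:
    """Decode a compressed byte string under the assumptions stated above."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b == 0x00:
            # Escape: the next byte is a raw character (e.g. 2d is "-").
            out.append(chr(data[i + 1]))
            i += 2
        elif b < 0x3c:
            # Single byte indexing into the lookup table.
            out.append(TABLE[b - 1])
            i += 1
        else:
            # Back reference: two bytes holding an offset and a length.
            nxt = data[i + 1]
            offset = (b - 0x3c) * 16 + (nxt & 0x0f)
            length = (nxt >> 4) + 2
            for _ in range(length):
                out.append(out[-offset])
            i += 2
    return "".join(out)

# Both encodings of the example decode to the same string: --hi--
print(decode(bytes.fromhex("002d002d1415002d002d")))
print(decode(bytes.fromhex("002d002d14153c04")))
```

Under these assumptions, the 8-byte version is a valid encoding of the same 6 characters.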
It should be easy to change the compressor so that it also emits back references for sequences of length 2 (apparently it only does it for lengths >= 3).
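As a rough illustration that length 2 fits in the token, here is a small Python sketch that builds a back reference. The helper name `encode_backref` is hypothetical, and the layout is only what the examples in this post imply: first byte minus 0x3c gives the high bits of the offset, the low nibble of the second byte gives the low bits, and the high nibble stores the length minus 2.

```python
def encode_backref(offset: int, length: int) -> bytes:
    """Build a two-byte back reference under the assumed layout."""
    # The high nibble stores length - 2, so lengths 2..17 are representable.
    if not 2 <= length <= 17:
        raise ValueError("length must be in 2..17")
    # The first byte ranges over 0x3c..0xff, giving (0xff - 0x3c) * 16 + 15 = 3135.
    if not 1 <= offset <= 3135:
        raise ValueError("offset must be in 1..3135")
    first = 0x3c + (offset >> 4)
    second = ((length - 2) << 4) | (offset & 0x0f)
    return bytes([first, second])

print(encode_backref(4, 2).hex(" "))  # 3c 04
```

Since a length-2 reference costs two bytes, it only pays off when the two characters would otherwise take more than two bytes, as with the escaped “-” characters here.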
Second, consider the following string:
ababab
It is encoded as follows:
0d 0e 0d 0e 0d 0e
However, it seems that the following is properly decompressed by PICO-8 and is two bytes shorter:
0d 0e 3c 22
Here 3c22 means “copy 4 chars from position -2”. Apparently, even though not all of those chars have been decoded when the copy starts, by the time the decompressor reaches the 3rd character of the copy it does find it, presumably because it copies one character at a time. However, that is not a very safe assumption to make without knowing the decompressor's code.
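Here is a small Python sketch of why an overlapping reference can work, and of how it could fail. Both functions are hypothetical; I don't know which strategy PICO-8 actually uses, only that the bytes above appear to decompress correctly.

```python
def copy_char_by_char(out: list, offset: int, length: int) -> None:
    """Copy one character at a time, so freshly written characters can be re-read."""
    for _ in range(length):
        out.append(out[-offset])

def copy_as_block(out: list, offset: int, length: int) -> None:
    """Copy a snapshot of the existing output; it cannot see characters it has not written yet."""
    block = out[-offset:][:length]
    out.extend(block)

a = list("ab")
copy_char_by_char(a, offset=2, length=4)
print("".join(a))  # ababab

b = list("ab")
copy_as_block(b, offset=2, length=4)
print("".join(b))  # abab (only 2 of the 4 requested characters exist at copy time)
```

If the decompressor worked like the second function, the shorter encoding would produce the wrong output, which is why relying on it is risky.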
Just my two cents!