Log In  
Page:
1
2


Cart #px9-11 | 2024-11-05 | Code ▽ | Embed ▽ | License: CC4-BY-NC-SA
77

PX9 is a lightweight gfx & map compression library, intended to replace PX8. It uses the same ideas and interface as px8, but is smaller (297 292 274 262 258 254 234 215 tokens to decompress), and requires zero configuration.

To compress some data:

px9_comp(x,y,w,h, dest_addr, vget)

returns the number of bytes written

x,y,w,h is the source rectangle (e.g. on the spritesheet or map)
dest_addr is where to write it in memory
vget is a function for reading values (e.g. sget when compressing from spritesheet)

To decompress it again:

px9_decomp(x,y,src_addr,vget,vset)

x,y where to decompress to
src_addr is a memory address where compressed data should be read from
vget, vset are functions for reading and writing the decompressed data 
  (e.g. pget and pset when decompressing to the screen)

Unlike px8, the vget function does not need to return 0 when x,y are outside the destination rectangle

Workflow

You can use px9.p8 as a utility for compressing other carts' data by replacing _init() with something like this:

reload(0x0, 0x0, 0x2000, "mycart.p8")
clen = px9_comp(0, 0, 128, 128, 0x2000, sget)
cstore(0x0, 0x2000, clen, "mycart_c.p8")

This would compress the spritesheet from mycart.p8 and store it in the spritesheet of mycart_c.p8, using memory from 0x2000.. as a temporary buffer. See the PICO-8 Manual for more information about reload() and cstore().

In the release cartridge, you only need the code from tab 1 (px9_decomp) in order to decompress, which should be 292 tokens.

The Algorithm

// expand for an interactive visualiser and technical details.

PX9 makes a prediction for each value based on the upper neighbour (x,y-1). For each upper neighbour value, a list of predictions is stored in the order they were most recently encountered (a rough metric for likelyhood). These can be seen at the bottom left of the visualisation.

After each source value is encountered, it jumps to the top of the list, and also to the top of a shared list (EL "empty list") that is used when the predictopms list for that upper neighbour value is empty. The first time each upper neighbour value is encountered, its preduction list is initialised to EL.

When decoding, the same lists are maintained so that it is sufficient to store indexes into those lists to reproduce the source data (top right in the gif). EL is initialised to every unique value in the source data in the order they appear, and is stored in the encoded header. So the first span is always predicted, as the first value in the image is always at index 0 in EL.

For typical data, most of the index values are 0 ("predicted"), and otherwise small index values are needed ("unpredicted"). To exploit this characteristic, PX9 stores alternativing spans of predicted and unpredicted values. For the predicted spans, only the length of the span is needed, as all of the list indices in that span are known to be 0. For the unpredicted spans, all of the indices are > 0, and so n-1 can be stored for each pixel.

In the gif, the list of predictions for each upper colour starts empty, and after the first step (colour 0, upper neighbour 0), EL is copied into list 0. That list is used for the entire top row, as the upper neighbour is out of bounds and 0 is used. On the second row, a new upper neighbour colour is encountered (green), and EL is copied to the green list. Those two lists are used until white is encountered on the 3rd row, and so on. Note that all of the list indices at the top right are quite small. This means they can be stored in a compact format that favours small numbers.

Press X to step through (or Z/C to fast forward):

Cart #px9v-0 | 2024-11-05 | Code ▽ | Embed ▽ | License: CC4-BY-NC-SA
77

Both span lengths and indices are stored in the same way: a sequence of n-bit little-endian integers where n = {1,2,3..}. The stored value is taken to be the sum of these integers, and the list is terminated when the last integer has at least one bit that is not set. i.e. is less than (2^num_bits)-1

So, values are stored in the bit stream like this:

0: 0
1: 1 00
2: 1 10
3: 1 01
4: 1 11 000
5: 1 11 100
6: 1 11 010

This distribution of encoded lengths works well for pixel&map values and span lengths, as both predictions and near-predictions (index==1) can be stored with single bits, and typical source data roughly produces roughly a log2n distribution in many cases.

Version History


v3:
felice's getval() replacement
fixed px9_comp() return value (was returning one larger than needed when aligned to 8bit boundary)

v4: // More improvements by @Felice & @Catatafish
Fixed the bit-flushing bug at EOF
Perf should be better in 0.2.0
Down to 274 tokens

v5:
Fixed: output is skewed when x != 0
Down to 262 tokens for px9_decomp

v6:
smaller vlist_val() by @p01
-> 258 tokens

v7b:
smaller vlist_val() by @Felice
-> 254 tokens

v8:
smaller vlist initialization by @pancelor
-> 241 tokens

v9:
new bitstream order by @pancelor
-> 234
-> 234 tokens (but ~4% slower)

v10:
@pancelor:
remove cruft
clever getval() tricks
fix low-entropy bug
-> 215 (and matches OG decomp speed)
@zep: added tests (tab 3)

v11:
@Felice:
removed unnecessary brackets
-> 214 tokens

77


1

This is really impressive. I imagine this would be super useful for people like @nextlevelbanana in making slideshow presentations based in PICO-8! :D


2

@zep

(Hiding this because it's off-topic and I don't want to derail the thread.)


I just want to take this opportunity to point out that leaving the tab-width default at 1 has the hazard that people, like you yourself, don't differentiate between tabs and spaces, and so they write code that intermingles the two arbitrarily.

This PX9 code you've uploaded here looks strange when loaded with tab-width set to 2. Try it yourself, you'll see. Or just click Code under the web player above, where the tab width is 4. Indenting is all over the place.

Do you really think it's better to doom future carts to loading weirdly, rather than dooming some older carts to loading weirdly? I'd favor the future, personally.

If you set the default to 2 spaces, this won't happen nearly as much. People will be able to see the difference and will more actively use spaces instead of tabs if they want 1-space indenting.

(Side note: A "visible whitespace" option would be nice.)


Oh, and sorry for the derailment. To others: please don't respond to this one way or the other, I only wanted to point this out in a way that was ... conveniently visible, let's say ... to zep. In fact, maybe I should put this in hidden tags. Yeah, I'll do that.


Nice! That token saving is not to be sniffed at. Although px8 seems to do a better job at compressing the image in this cart - 1274 (px9) vs 1092 (px8) bytes?

It fared better against other image data I tried though (for some 128x128 sort-of perlin noise it did about 2.5kb vs px8's 2.9kb).


@zep

I replaced the logic for getval() in the decompressor and saved 4 tokens. I think it also might be slightly faster, especially when fetching larger bit strings, but not much.

(I also unified the tabs, just because it was driving me nuts.)



Nice one @Felice -- I included this version in px9-3.

@Skulhhead1924 The first two version (0, 1) were tests that I didn't post yet. Not sure what you meant, but you can always load the most recent version one from PICO-8 commandline with:
> LOAD #PX9

@Catatafish That's.. actually quite a surprising difference! I haven't done a thorough comparison, but in general their performance is similar, on average. It might be possible to sometimes squeeze a bit more out of px9 by using thing's like px8's remap() function (which reads pixels 8x8 sprite by sprite), but I imagine the smaller token count is more valuable.


1

I squeezed a few more tokens out of the decompressor, down to 277 in total. I had to minify it a bit so it'd actually fit in a project I'm using this in, sorry.

function px9_decomp(x0,y0,src,vget,vset)
 local function vlist_val(l,val)
  for i=1,#l do
   if l[i]==val then
    for j=i,2,0xffff do
     l[j]=l[j-1]
    end
    l[1]=val
    return i
   end
  end
 end
 local cache,cache_bits=0,0
 function getval(bits)
  if cache_bits<16 then
   cache+=lshr(peek2(src),16-cache_bits)
   cache_bits+=16
   src+=2
  end
  local val=lshr(shl(cache,32-bits),16-bits)
  cache=lshr(cache,bits)
  cache_bits-=bits
  return val
 end
 -- gnn?
 function gn1(tot)
  local bits=1
  while 1 do
   local vv=getval(bits)
   tot+=vv
   if (vv<2^bits-1) return tot
   bits+=1
  end
 end
 local w,h,b,el,pr,splen,mode=gn1"1",gn1"0",gn1"1",{},{},0
 for i=1,gn1"1" do
  add(el,getval(b))
 end
 for y=y0,y0+h do
  for x=x0,x0+w-1 do
   splen-=1
   if (splen<1) splen,mode=gn1"1",not mode
   local a=y>y0 and vget(x,y-1) or 0
   local l=pr[a]
   if not l then
    l={}
    for e in all(el) do
     add(l,e)
    end
    pr[a]=l
   end
   local v=l[mode and 1 or gn1"2"]
   vlist_val(l,v)
   vlist_val(el,v)
   vset(x,y,v)
   x+=1
   y+=flr(x/w)
   x%=w
  end
 end
end

Also, if you swap the serialization order of h and el b in the compression routine you can factor out h & save 2 more tokens in the decompressor.

edit: got my variables mixed up


@Catatafish Can you explain why a couple of your replaced gn1 calls start with tot==0 or 2 instead of 1?


The gn1"2" version is an optimization for gn1()+1 that rolls the +1 into the call. The h=gn1"0" version is the same thing, done to avoid a later use of h-1.


I would love to check out the code, but PICO-8 "CANNOT CONNECT TO BBS" for some reason. is the code snippet you posted in the description all I need to compress and decompress? and am I too late to make PICO-Art? (Hee Hee! PICO-8 PICO-Art!) Thanks!


Hi -- the site was down for a few moments while doing some maintenance. Please try again now!

yes, those snippets plus the contents of the cart are all you need (tab 1 for decompression and tab 2 for compression)


That was fast. I accidentally closed this tab and when I reopened the tab I saw your comment. Anyway, I tried to run a game I have never played, unable to connect to bbs. I try to download a game, download failed. I also tried reinstalling both pico-8s (yes, I have two PICO-8s on the sidebar (mainly cuz I have two screens)) but, nope. thanks for speed-replying!


I wonder why all my comments are so long.


Oh, and it didn't work on my second pico-8 either.


wow, 3 comments in a row that's a first. oh after I post this it will be four.


and for the PICO-Art, can I use the SECRET COLORS?


Ooh! just noticed the new CIRCLE TOOL! gonna be a big help in art.


Hey @zep, I think I've encountered a bug where the returned length of compressed data is reported 1 byte too short, and I was chopping off this extra byte which lead to some corruption when decompressing.

Just incrementing clen immediately after compressing works around this issue (but sometimes uses 1 extra unnecessary byte I think).

I may be able to come up with an example if that's helpful.


5

I have a version of this that's been updated for 0.2.0 functionality/performance and has the optimizations @Catatafish made above.

Now that @mrh has pointed out the bit-flushing bug at the end of the file, I've also fixed that in my version.

I don't mean to step on @zep's toes, but I think I should just offer up this version, since it's a rollup of everything that could have been done to PX9 since zep's last official version.

Notes:

  • Functionality is should be identical
  • Fixed the bit-flushing bug at EOF
  • Perf should be better in 0.2.0
  • Down to 273 tokens
  • Some code rework because I'm OCD (sorry zep)

Here's the cart:

Cart #mayobagegu-0 | 2020-05-29 | Code ▽ | Embed ▽ | License: CC4-BY-NC-SA
5


1

Thanks @mrh, I just updated the cart with @Felice's fix (cheers Felice!)


2

First off, thanks for the awesome tool!

Gratuitous feature request:

Is there an easy way to extend this to compress to and decompress from strings? (e.g. the decompress function could use a vget function to access the string, but the compress function just stores to cart memory.)
With the vget vset functions, can this be extended to compress to or decompress from strings?

1) compress from spritesheet to file on disk using (printh str [filename] [overwrite])
2) decompress from string in code section to spritesheet.

--end gratuitous feature request that I am just being too lazy to code myself


1

I've updated to the new version, removing my workaround, and everything seems to be happy. Thanks for the bugfix @Felice & @zep.

The compression is working nicely and I'm going to be able to cram a lot of content into this cart by the looks of things.


Thanks for testing, @mrh, glad to hear I didn't break it. :)

By the way, I typo'ed the token count above. I said it was down to 274 but it's actually 273. Hey, every token counts, right? :)


2

@electricgryphon, @zep, et al

I wrote a version of the decomp that can read from a base128 string or from memory, depending on whether its passed a string or a numeric address as the src. It's 320 tokens vs. 273 for just memory, which is a fair bit more, so here's a cart with all three versions of the decompressor, with a compressor that can write to memory or strings.

See the first tab for updated usage details in the big comment block, but here's the TL;DR:

  • 320 tokens for px9_decomp(), which can now take an address or a string.
  • 273 tokens for px9_mdecomp(), which was the existing memory-only decompressor.
  • 283 tokens for px9_sdecomp(), which is a string-only decompressor.
  • Compressor can now take nil as a dest addr, causing it to return a string instead of compressed len. Use #str to get compressed len.

Gryph, regarding writing to disk, you'd obviously just printh() the resulting string yourself.

Edit 1: Hotfix for a bug @electricgryphon reported below.

Edit 2: After investigation, we've found that some images encoded as raw hex strings will produce smaller carts after .p8.png compression than using PX9-compressed strings. The sample image in zep's cart is smaller via PX9 string, while @electricgryphon's test image below is smaller via raw hex string. The hot take here seems to be that an image with a lot of patterns will make a smaller cart with raw hex, while an image with a lot of solid areas will make a smaller cart with PX9. So people might need to pick and choose when it comes to string encoding. Memory encoding will always benefit from PX9 though.

Cart #suyanuroge-2 | 2020-05-30 | Code ▽ | Embed ▽ | License: CC4-BY-NC-SA
2


Thanks a bunch! Gonna try this out now.


Found a typo in px9_decomp, line 272

cache|=(((ord(sub(str,i,i))) or 0)&127)>>>16-cache_bits
should read
cache|=(((ord(sub(src,i,i))) or 0)&127)>>>16-cache_bits

With that change, this works awesome!

Out of curiosity, the output string is mostly symbols and kanji. Is there a reason to use this for encoding data as opposed to ASCII? (eg:"◝◝◝♥◝◝◝♥ョ⧗🅾️◝★スすクュ..." vs "sdf53RG4rads#$T%$..." )


Well, for my particular target image (ordered 8x8 dither with no solid areas) this doesn't look like a good compression fit when I look at compressed code size after storing as a string in the code section.
(It does take up less space in the gfx section of the file though, just not in the code section, if that makes sense.)

                              Total    Delta (compressed code)
Initial compressed code size = 9122    0
String compressed code =      15045    5923
Hex compressed code    =      15286    6164
Raw (uncompressed hex) =      12901    3779

So for this particular image/use case, I'd be better off pasting the image as a raw hex string.

Here's the image:


1

@electricgryphon

Wait, how the heck is it even working with the wrong variable name there?

...

OH!

Okay, I see, the test code actually, by complete coincidence, used a global called "str" to hold the compressed data, which is what "src" points to inside decomp. So it works, but fails if I change the global to "image". What a devious little bug! It was in sdecomp as well, sigh.

Thanks for catching that! I've fixed, tested, and updated my version of the cart. :)


I used chars 128-255 instead of alphanumerics because it meant I could use the whole continuous range, storing the value raw with its top bit set, so I could just clear the top bit during decode, no worrying about which chars are legal in strings (e.g. single/double quote, backslash) and closing gaps or table lookups and stuff like that. It looks funky but with the current cart format there's no compressed-size hazard in using the upper range anymore.


Also, I guess I'm not surprised that your image fares better as a raw hex string. The .p8.png compressor is much better than it used to be while PX9 seems to do well mainly with large continuous blocks of color. It might only matter if the hex-string version of your raw data is too big for the source code window or something.

Maybe try your method with other bases than 16? Like base64 or my version of base128, for instance. Encode/decode is pretty simple. I'm curious whether or not changing the bit stride like that will mask patterns out and make the .p8.png cart compression worse.


@electricgryphon

Actually, I tried the test I suggested above. Base64 and base128 both cause the compressed size to increase, so yes, they do obscure the patterns the .p8.png compressor is able to make good use of.

So yeah, I agree: if you can afford the space in your 64k source code window, a simple hex encoding that gets compressed by the built-in source code compressor is probably the best bet for your kind of image.

I'm not sure if this will be true for all image types, though. Yours has a lot of regular noise to simulate RGB, but most games use more cartoony visuals, like zep's sample image, whose px9 string performs markedly better after p8.png compression than a raw hex string version does. If you or anyone else has some real-world game image samples for me to try, let me know and I'll run them through this little base16/64/128 test bench I wrote to see, because if the string encoding isn't worth it, I should just remove that and de-complicate the library.

(Side note: interestingly, if I encode almost 32k worth of graphic data similar to yours into an almost-64k source hex string, it compresses to just about the compressed p8.png size limit. Nice work, @zep!)


Thanks for this utility!

I've been playing around with it as a way to compress smaller images (e.g. portraits for an RPG), and may have found a bug-- it seems if you use px9_decomp to a location on the screen with x0>0, you get a distorted image, caused by decomp incrementing y whenever screen x position>width (fine if x0=0, but if x0>0 this happens mid-image). Example:

Cart #kesirazudi-0 | 2021-01-03 | Code ▽ | Embed ▽ | License: CC4-BY-NC-SA

I implemented a fix to the main px9_decomp as px9_decomp2 in that cart that adds 6 tokens, but there may be a more token-efficient way, and I didn't update all of the functions-- I wanted to share quickly and see if this makes sense or if I'm missing something.


2

Thanks @icegoat
The following code kicking around at the end of px9_decomp looks like it was from an early WIP, is completely meaningless, and otherwise redundant. Ridiculous! So I removed it for v5, and as a bonus we're now down to 262 tokens \o/

-- advance
x+=1
y+=x\w
x%=w

Nice work!

I think I saved 4 tokens on the vlist_val methods, thus bringing them px9_decomp and px9_comp down to 258 and 430 tokens respectively:

  local function vlist_val(l, val)
    -- find position
    local v=l[1]
    for i=1,#l do
      l[i],v=v,l[i]
      if v==val then
        l[1]=val
        return i
      end
    end
  end

Very nice @p01! Updated for v6


1

(Edit: there was a caveat, see the next post for details and a balanced mitigation.)

WHO WANTS TO SAVE 11 MORE TOKENS?

I THINK YOU DO!

Cart #wenitedopi-0 | 2021-02-01 | Code ▽ | Embed ▽ | License: CC4-BY-NC-SA
1

Details:

Took advantage of recent API improvements to smallen vlist_val() even more, using add(l,deli(l,i),1) rather than shifting the MRU entry using direct list manipulation loops. Also removed unused return value. Now 247 tokens.


2

Ah. Well. Darnit.

Houston, we have a problem.

I figured I ought to make sure I wasn't slowing down the algorithm by calling out to functions, so I added timing code to the demo (included in the cart below for future use & comparisons), and my concerns were justified.

Turns out add() and deli() don't have enough of an internal-loop advantage over interpreted manual manipulation to keep up, performance-wise. They dropped decompression speed by half, which is a hefty price for a handful of tokens.

The cart below has yet another implementation of vlist_val(), back to using manual manipulation again, but a better par in code golf. Sadly this saves only 4 tokens over Mathieu's previous record (258→254 tokens), not 11, but on the upside, it outperforms all previous implementations that I've checked.

I guess it's up to the community to choose: size vs. speed

I included both versions of the function, with the smaller-but-much-slower one commented out by default:

Cart #nasesaweko-0 | 2021-02-01 | Code ▽ | Embed ▽ | License: CC4-BY-NC-SA
2


1

Thanks @Felice! I went with 7b for the head cart, as it's more likely someone will come back to this thread looking for tokens than for speed.


Oh, and from 0.34 seconds -> 0.27 seconds -- smokin!


1

Can there be a visual diagram of how this algorithm works? I'm struggling to understand your text description of the algorithm. The code looks too complex for me to understand as well.


This could be good for making videos in PICO-8.


1

Unsure if anyone would find this useful, I am using decompression at 0x4300 in screen format so I can then memcopy to screen... the funcs I inject to PX9_DECOMP are the following:

pxget=function(x,y)
  local addr=0x4300+x\2+y*32
  local v=(x%2==0 and band(@addr,0x0f) or shr(band(@addr,0xf0),4))
  return v		
end

pxset=function(x,y,v) 
  local addr=0x4300+x\2+y*32
  local lv=band(@addr,x%2==0 and 0xf0 or 0x0f)
  lv=lv+shl(v,x%2==0 and 0 or 4)
  poke(addr,lv)
end

-- copy to screen buffer per line
screendump=function(w,h)
  for y=0,h-1 do
    memcpy(0x6000+y*64,0x4300+y*w\2,w\2)
  end
end

The screendump function probably can be reasonably easy to update to support x,y positioning if not dumping the full screen though odd x positions could be tricky to handle (maybe initial pset+memcopy+trailing pset?). I needed to move elements on top a full-screen image and decompressing every frame is not fast enough.

Oh! importnat... 0x4300 has only up to 0x5600 available space so a full screen cannot be placed there... I was using this for MODE3 (64x64) screens!

Hope this helps you!


@zep I saved 13 tokens in the vlist initialization code by changing this:

-- create vlist if needed
local l=pr[a]
if not l then
    l={}
    for e in all(el) do
        add(l,e)
    end
    pr[a]=l
end

to this:

-- create vlist if needed
local l=pr[a] or {unpack(el)}
pr[a]=l

This saves 13 tokens, but increases runtime on the example image by 2% (0.1491s -> 0.1528s)
If you add 5 tokens back and do this instead, the time goes down a tiny bit: (0.1491s -> 0.1489s)

-- create vlist if needed
local l=pr[a]
if not l then
    l={unpack(el)}
    pr[a]=l
end

(I haven't done more rigorous testing than this, and the 2% slowdown seems acceptable to me, but hopefully this info is helpful if someone cares to look into it more.)

Here's the updated cart (it also includes a changelog entry + a few whitespace changes + a typo fix: "px8"->"px9")

Cart #kebuhimoyo-0 | 2021-10-14 | Code ▽ | Embed ▽ | License: CC4-BY-NC-SA


Alrighty, I've saved another 7 tokens by completely redoing the order of the bitstream. px9_decomp now runs about 2% slower on the example image (likely because I'm reading data one byte at a time instead of 2 bytes at a time) (.1528s -> .1557s)

I'm not sure how far is too far, in terms of token/performance tradeoff. But anyway, here's the cart (234 tokens to decompress)

Cart #kuhobepase-0 | 2021-10-15 | Code ▽ | Embed ▽ | License: CC4-BY-NC-SA


2

Thanks @pancelor, amazing improvements!
(I just realised now the head cart was not up to date, and copied kuhobepase-0 to px9-9)

Interesting to see @carlc27843 was able to get 10% better compression by replacing the span length/index encoding with arithmetic encoding used in Nebulus:
https://carlc27843.itch.io/nebulus

Using arithmetic encoding would of course cost more tokens in general, but was cheaper in that case because he already needed it for other parts of the program.


4

Here's a demo of arithmetic coding in a px9-like decompressor.

  • compressed size is 1119 bytes (px9 was 1275; so this is 156 bytes or 12% smaller)
  • total tokens is 348 (px9 was 234), however my game uses vlval, unpr and fpaq1d functions in other places, so the image-specific function tokens is 113
  • decompress time is 0.5754 seconds (px9 is 0.1491 seconds; so this is nearly 4x slower!)

It's tradeoffs all the way down... I'd guess few games need the extra space so the official px9 is most likely the best choice for most use cases.

(This particular implementation has other restrictions, like width/height must be <=256, and compressed data must be stored in ram).

Cart #px9fpaq1b-0 | 2022-04-02 | Code ▽ | Embed ▽ | No License
4


3

Been using my own string-based compression systems for a while, but to store enough sprite data to make large games feasible, I need a codec that can compress to memory as well, and PX9 is the best solution I've found for that.

With that in mind, here's my contribution. It's a new variant included alongside Pancelor's 234 token version, and for 254 tokens it can decompress data from either memory or full 8-bit strings. Performance is pretty close, ~0.39% slower when decompressing from memory, and ~1% slower when decompressing from strings. The strings are automatically output to the clipboard when compressing, just be sure to press ctrl+p before pasting them.

Cart #janeriruki-4 | 2022-04-17 | Code ▽ | Embed ▽ | License: CC4-BY-NC-SA
3


Can I have some help with this? I want to make a library of images for something and need a way to store them and this seems like the best option, can someone explain how to do it?


Cart #bifwomafe-0 | 2022-06-19 | Code ▽ | Embed ▽ | No License

I think I found an edge-case where the compression routine fails. I fell across it while working on silly splash screens. After some playing around, the failure mode can be "corrected" by judicious application of certain background colors (EDIT: See update in this same post). I've added a (slightly) modified test-cart to replicate the issue. If nothing else posting this for others to see if they encounter the same issue. Here is the de/compressed result:

UPDATE: Further experimentation suggests that if the source image doesn't have enough entropy, the compression algo becomes confused. If you add some colored circles to this test cart, then the algo snaps back into perfect performance. The example image below will de/compress perfectly.

UPDATE 2: I thought the text was perhaps causing the issue originally, with the subtle interpolated colors around the letter outline, but plain color fields will elicit the same behavior. This is in-line with the "not enough entropy" line of thought, but deleting the text to generate the same results provides a much easier test-case.


2

good catch @djclemmer! I introduced this bug (oops) in #px9-9 when I made the decompressor read single bytes instead of bytes pairs (@src instead of %src). This causes problems when there are >255 predictable pixels in a row, just like you said: "if the source image doesn't have enough entropy"

I fixed it, and also saved another 19 (!) tokens in the decompressor, along with a slight speed-up. There were a few unused variables sitting around that I removed, but the savings mainly came from changing getval() from this:

local cache,cache_bits=0,0
function getval(bits)
  if cache_bits<8 then
    cache_bits+=8
    cache+=@src>>cache_bits
    src+=1
  end
  cache<<=bits
  local val=cache&0xffff
  cache^^=val cache_bits-=bits
  return val
end

to this 3-line version:

-- read an m-bit num from src
local function getval(m)
  -- $src: 4 bytes at flr(src)
  -- >>src%1*8: sub-byte pos
  -- <<32-m: zero high bits
  -- >>>16-m: shift to int
  local res=$src >> src%1*8 << 32-m >>> 16-m
  src+=m>>3 --m/8
  return res
end

(I also reversed the order the bits are pushed during compression)

I didn't include @JadeLombax's string code; you should be able to replace $src with (ord(datastring,src)>>16|ord(datastring,src+1)>>8) but it seems a bit token-heavy... plus I think you need src+2 as well to avoid the >255 bug (in this case it might be a >511 bug)

215 tokens: (for the decompressor)

Cart #kuhobepase-1 | 2022-07-27 | Code ▽ | Embed ▽ | License: CC4-BY-NC-SA
2

edit: oh actually, something like this should work to decode from strings:

local function getval(m)
  poke(0x8000,ord(thestring,src,4))
  local res=$0x8000 >> src%1*8 << 32-m >>> 16-m
  src+=m>>3 --m/8
  return res
end

1

@pancelor,

Nice optimization there! I'll see if I can get a version working with strings, but if not I guess I'll just stick with the higher-token version. I'm adapting PX9 for storing multiple spritesheets in a single cart along with a large compressed world map, so I need to have the string capability, and the images I'm storing should have plenty of entropy to avoid the bug djclemmer pointed out.


@JadeLombax, I am using the older v10 string compression routine (not the new v10 which fixes the low-entropy bug but I think removes string support). My typical workflow is to setup a utility cart with PX9 code loaded in, draw up my assets and then run px9_comp on the assets, writing the string out to a file using printh. As a debug test one time, I immediately ran px9_decomp on a string I had just px9_comp'd, and it resulted in garbage. But the string written to the file, when copy/pasted into my other working cart, worked fine. I have been through this to make sure I'm not making some stupid mistake - Is it your understanding that I should be able to immediately do a px9_decomp(pget, pset) using the string generated from px9_comp(sget) ?


Page:

[Please log in to post a comment]