If we type "x = -6", the "-6" part costs two tokens. I assume this is because the version of Lua PICO-8 uses still internally treats negative literals as positive literals with a unary-minus operation applied to them, e.g. "x = unm(6)".
However, as far as I can tell from my benchmarks, a negative literal costs the same number of cycles as a positive one in PICO-8: for instance, "x = y / -6" costs the same as "x = y / 6".
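For reference, my benchmarking is nothing fancier than something like this (a rough sketch: bench() is just a throwaway helper of mine, and stat(1), which reports the CPU used so far in the current frame, stands in for real cycle counting):

```lua
-- rough pico-8 benchmark: stat(1) reports cpu used so far this frame.
-- bench() is my own helper, not a built-in.
function bench(f)
 local t0=stat(1)
 for i=1,1000 do f() end
 return stat(1)-t0
end

y=42
print(bench(function() x=y/-6 end)) -- negative literal
print(bench(function() x=y/6 end))  -- positive literal
```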
So we're being charged a token for an operation that doesn't actually seem to occur, and shouldn't need to occur anyway.
Would it be possible to drop the token cost for unary-minus when it's applied to a literal?
The alternative is that people in dire token straits will write "x = y / 0xfffa" (0xfffa wraps around to -6 in PICO-8's 16.16 fixed-point numbers), which is just gross. :P
Fixed for 0.1.12d
'-' is now counted as part of a number at the tokenizer level, when the previous token (not counting comments or end-of-line) is one of:
( { [ , = * / + - ^ % < > <= >= == != ~= not and or if elseif while until return |
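Roughly, the check looks something like this (sketched here in plain Lua for illustration; the actual tokenizer is internal to PICO-8, so all the names below are just for the sketch):

```lua
-- sketch of the rule in plain lua; the real tokenizer is internal and
-- these names are made up.
local glue_prev={}
for t in ("( { [ , = * / + - ^ % < > <= >= == != ~= not and or if elseif while until return |"):gmatch("%S+") do
 glue_prev[t]=true
end

-- '-' is folded into the following numeric literal when the previous
-- significant token (skipping comments and end-of-line) is in the set,
-- or when there is no previous token at all.
function minus_starts_number(prev_tok)
 return prev_tok==nil or glue_prev[prev_tok]~=nil
end
```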
This bug had cascading implications -- heavy carts recover around 20~100 tokens, which puts more pressure on the code compression: the rough principle is that the token limit should normally be reached before the compression limit, unless a lot of data is stuffed into the code section.
That principle almost no longer holds (especially with the new 0.1.12d character set, which the original compressor is not optimized for) -- so an associated fix is that the compression in 0.1.12d is now slightly better and character-set agnostic. A hard decision to make, as it competes with the tread-lightly-on-existing-optimisation-work principle, but man, it feels good to get both of those thorns out of my side. I will have to figure out something like apologetic fruit baskets for authors who have posted 8192-token carts.
It's great to hear that the end result was an unexpected clearing of several issues. I don't think the devs who have previously sweated blood to fit their carts into 8192 tokens will hold it against you. ;) Thanks!
By the way, speaking of compression, I've been meaning to suggest something to you:
- Run a filter over the corpus of carts on the BBS to extract the most common strings up to some reasonable length (rough sketch after this list).
- Delete any that are just coincidentally common now but might be uncommon in the future. Basically you'd be looking for reliable PICO-8 and Lua language constructs rather than ephemeral human-language ones, e.g. "function _init()\n\t" or "rectfill(", but not "2016".."2019" or "celeste".
- Rank them by usage, sort them least-to-most common, and concatenate them into one string.
- Seed the history for the compressor/decompressor with that string.
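Here's a rough offline sketch of the first, third and fourth bullets in plain Lua (the file list, length cutoffs and count-times-length scoring are all guesses on my part, and the second bullet would stay a manual pass):

```lua
-- offline sketch: mine common substrings from cart sources and build a
-- seed string. paths, cutoffs and scoring are placeholder guesses, and
-- this brute-force counting would need pruning on a real corpus.
local counts={}
local min_len,max_len,budget=3,16,1024

for path in io.lines("cart_sources.txt") do -- one source file per line
 local f=assert(io.open(path,"rb"))
 local src=f:read("*a") f:close()
 for len=min_len,max_len do
  for i=1,#src-len+1 do
   local s=src:sub(i,i+len-1)
   counts[s]=(counts[s] or 0)+1
  end
 end
end

-- score by count*length so long, frequent strings float to the top
local ranked={}
for s,c in pairs(counts) do
 if c>1 then ranked[#ranked+1]={s=s,score=c*#s} end
end
table.sort(ranked,function(a,b) return a.score<b.score end) -- least->most

-- concatenate least-to-most common so the most common strings sit at
-- the smallest back-reference offsets, then keep the tail that fits
local parts={}
for _,e in ipairs(ranked) do parts[#parts+1]=e.s end
io.write(table.concat(parts):sub(-budget))
```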
This would basically bootstrap the (de)compressor so that it could be fetching strings from history right out of the gate for most carts. For instance, I'd bet that you'll find the best candidate for the final string to be "-- ", since most carts start with a comment.
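To be clear about what "seed the history" means mechanically, here's a toy sketch (my own throwaway code, not how PICO-8's compressor actually works):

```lua
-- toy seeded lz77 (greedy matching, no entropy coding), just to show
-- the idea: the search window starts pre-filled with the seed, so even
-- the first bytes of a cart can come out as back-references.
local function toy_compress(seed,src)
 local hist=seed..src
 local out={}
 local i=#seed+1
 while i<=#hist do
  local best_len,best_off=0,0
  for j=math.max(1,i-4096),i-1 do
   local len=0
   while i+len<=#hist and hist:byte(j+len)==hist:byte(i+len) do
    len=len+1
   end
   if len>best_len then best_len,best_off=len,i-j end
  end
  if best_len>=3 then
   out[#out+1]={off=best_off,len=best_len} -- copy from history
   i=i+best_len
  else
   out[#out+1]=hist:sub(i,i)               -- literal byte
   i=i+1
  end
 end
 return out
end
```

With seed "-- ", the leading comment marker of a typical cart already comes out as a 3-byte back-reference on the very first step.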
Possible tweaks:
- Do some testing to see what the actual results would be on the BBS corpus, try rearranging, test again, compare, etc. You get the idea.
- Limit the "common strings" you mine from carts on the BBS to the first N bytes of the source code, since farther into the source the bootstrap dictionary has little useful effect, the offsets back to it becoming too large. That, or only take strings of increasing length as you get farther into the source (see the sketch after this list).
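The second variant would be a small change to the mining loop in the earlier sketch, something like this (the thresholds are arbitrary placeholders):

```lua
-- require longer strings the deeper into the source they appear:
-- a far-away seed reference only pays off when the match is long.
local function min_len_at(pos)
 if pos<=512 then return 3       -- near the start: even short strings help
 elseif pos<=2048 then return 8  -- mid-file: only longer strings
 else return 16 end              -- deep in: only very long matches are worth it
end
```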
This could boost compression ratios with only a tiny addition to the .exe size.