Aggressive Lua code minification and debug code stripping

huulong • 2020-02-10 20:48


Sometimes your cartridge fits within the token count, but not within the character count or compressed size, so you can't export it until you reduce the number of characters in the cartridge. You'd also like to keep comments, meaningful variable names and even debug code where you can, in case you continue working on the code.

One way to do this is to use a build pipeline:

copy your source file(s) to an intermediate directory
process file(s) to reduce code size
output final cartridge
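The three steps above can be sketched roughly as follows. This is only an illustration: the names `build` and `Step` are hypothetical, and plain concatenation stands in for the real cartridge assembly (which picotool or another tool would do).

```python
import shutil
from pathlib import Path
from typing import Callable, Iterable

# A processing step takes source text and returns the processed text.
Step = Callable[[str], str]


def build(sources: Iterable[Path], intermediate_dir: Path,
          output_path: Path, steps: Iterable[Step]) -> None:
    """Copy sources, process the copies in place, then join them."""
    intermediate_dir.mkdir(parents=True, exist_ok=True)
    steps = list(steps)
    copies = []
    for src in sources:
        # 1. copy each source so the originals are never modified
        dst = intermediate_dir / src.name
        shutil.copyfile(src, dst)
        copies.append(dst)
    for path in copies:
        # 2. run every processing step on the copy
        text = path.read_text(encoding="utf-8")
        for step in steps:
            text = step(text)
        path.write_text(text, encoding="utf-8")
    # 3. output the final code (assumption: simple concatenation
    #    stands in for building the actual cartridge)
    output_path.write_text(
        "\n".join(p.read_text(encoding="utf-8") for p in copies),
        encoding="utf-8")
```

Each step is just a function from text to text, so the comment stripping, minification and debug stripping described below can all be plugged in the same way.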

If you use picotool or work with compiled languages, you should be familiar with that process. It may sound like overkill for PICO-8, but it is very useful if you're stuck in the situation described above.

This is what I do when working with my custom framework, pico-boots. I don't think many devs would be interested in using a complete PICO-8 framework written by somebody else, but they may be interested in the individual processing steps described below. You can always refer to pico-boots' repository for implementation details.

Note that I use picotool to build my cartridge from multiple sources, but the techniques I use apply to single files too. For those also using picotool, I'll indicate at which Step each process should be applied: on individual sources (pre-processing) or on the output cartridge (post-processing).

Content

Comment stripping
Minification
Debug code stripping
Multi-line
Single-line
Going further

Comment stripping

This is probably the most obvious step: you'll want to remove single-line ("--") and block ("--[[ ]]") comments.

I used to do it manually; now the minification step does it for me, so I won't dwell on it. However, if you already use very short variable names but have lots of comments, it's worth trying comment stripping alone.

Single-line comment stripping is easy to implement in a file processing script; block comments may be a bit trickier.
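As a rough sketch of what such a script could look like, the function below (the name `strip_comments` is hypothetical) scans the source once, copies quoted strings verbatim so that a "--" inside a string is not mistaken for a comment, and drops both comment forms. It assumes the code uses no long strings ("[[ ]]") and no leveled block comments such as "--[==[ ]==]", which it does not handle.

```python
def strip_comments(source: str) -> str:
    """Remove '--' and '--[[ ]]' comments from Lua source (naive scan)."""
    out = []
    i = 0
    n = len(source)
    while i < n:
        ch = source[i]
        if ch in ('"', "'"):
            # copy a quoted string verbatim, honoring backslash escapes
            j = i + 1
            while j < n and source[j] != ch:
                j += 2 if source[j] == "\\" else 1
            out.append(source[i:j + 1])
            i = j + 1
        elif source.startswith("--[[", i):
            # block comment: skip up to and including the closing ']]'
            end = source.find("]]", i + 4)
            i = n if end == -1 else end + 2
        elif source.startswith("--", i):
            # single-line comment: skip to end of line, keep the newline
            end = source.find("\n", i)
            i = n if end == -1 else end
        else:
            out.append(ch)
            i += 1
    # trim the trailing spaces that used to precede the comments
    return "\n".join(line.rstrip() for line in "".join(out).split("\n"))
```

Lines that contained only a comment are left as empty lines here; dropping them is a simple extra filter if the character count matters.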

Step: pre-processing or post-processing

Minification

Code minification mainly consists of variable/key name shortening, whitespace trimming and comment stripping. You can use whatever tool does the job. However, note that minifiers are written for pure Lua and may not like PICO-8 shortcuts such as single-line

if(cond) effect

and

a += 5

So you may want to expand those into pure Lua statements, manually or inside your pipeline (don't worry though, minification will often more than make up for the loss of space).

If you use picotool, note that it generates a single-line "if(cond) effect" during the build. So you'll need to expand that in your pipeline post-processing, just before minification.
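A minimal sketch of such an expansion step is shown below. The function names are hypothetical, and the approach is deliberately naive: it works line by line, ignores parentheses inside strings, only recognizes a compound assignment that forms a whole statement with a simple target, and will not cope with a compound assignment followed by "else" on the same line.

```python
import re

_IF_RE = re.compile(r"^(\s*)if\s*\(")
_ASSIGN_RE = re.compile(
    r"^(\s*)([A-Za-z_][\w.\[\]]*)\s*([-+*/%])=\s*(.+?)\s*$")


def expand_assignment(stmt: str) -> str:
    """Turn 'a += 5' into 'a = a + 5' (other statements are unchanged)."""
    m = _ASSIGN_RE.match(stmt)
    if not m:
        return stmt
    indent, target, op, rhs = m.groups()
    # parenthesize compound right-hand sides to preserve precedence
    if not re.fullmatch(r"[\w.]+", rhs):
        rhs = f"({rhs})"
    return f"{indent}{target} = {target} {op} {rhs}"


def _closing_paren(text: str, open_idx: int) -> int:
    """Index of the ')' matching the '(' at open_idx, or -1."""
    depth = 0
    for i in range(open_idx, len(text)):
        if text[i] == "(":
            depth += 1
        elif text[i] == ")":
            depth -= 1
            if depth == 0:
                return i
    return -1


def expand_line(line: str) -> str:
    """Expand PICO-8 shorthand on one line into pure Lua."""
    m = _IF_RE.match(line)
    if m:
        open_idx = m.end() - 1
        close_idx = _closing_paren(line, open_idx)
        effect = line[close_idx + 1:].strip() if close_idx != -1 else ""
        # a regular 'if (...) then' is left alone
        if effect and not re.search(r"\bthen\b", effect):
            cond = line[open_idx + 1:close_idx].strip()
            return (f"{m.group(1)}if {cond} then "
                    f"{expand_assignment(effect)} end")
    return expand_assignment(line)
```

For instance, under these assumptions "if (hp <= 0) lives -= 1" would come out as "if hp <= 0 then lives = lives - 1 end", which a pure Lua minifier can parse.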

Personally, I use a custom branch in my fork of luamin. It's basically luamin plus a few features/fixes:

option "--minify-level" for aggressive member minification: useful if you want to minify even attribute and method names (requires extra care!)
option "--newline-separator" for partial newline preservation for easier debugging: otherwise the code is a veeery long line and error messages just tell you "assert on line 1" (note that there is only a newline where luamin would put a ';' without the option, so clauses ending with brackets, like function definitions, will still chain into longer lines)
works outside the terminal (e.g. in Sublime Text)
(more info on minification and npm in pico-boots README)

I know that picotool has a --lua-minify option, but it was too aggressive for me (it minifies even "__call" which breaks metatable logic).

Step: pre-processing or post-processing (but if using aggressive minification, post-processing only so member names are minified the same way across the cartridge)

Example

function demo_app.instantiate_gamestates() -- override
  return {main_menu(), debug_demo(), input_demo(), render_demo()}
end

function demo_app.on_pre_start() -- override
end

function demo_app.on_post_start() -- override
  -- enable mouse devkit
  input:toggle_mouse(true)
  ui:set_cursor_sprite_data(visual_data.sprites.cursor)
end

function demo_app.on_reset() -- override
  ui:set_cursor_sprite_data(nil)
end

becomes

function n.o()return{i(),j(),k(),l()}end
function n.p()end
function n.q()g:r(true)h:s(m.t.u)end
function n.v()h:s(nil)end

Implementation example

There is not much to do since it's already in the luamin package, which you can get using "npm install" or "npm update" (see package.json).

However, if you're working on an actual .p8 cartridge rather than pure Lua code, you may want to extract the Lua section, minify it and reinject it. minify.py does precisely that (and also expands single-line "if(cond) effect").
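The extract/process/reinject idea can be sketched as below. This is not the code of minify.py, only an illustration with a hypothetical function name. It assumes the text .p8 layout in which the code follows a "__lua__" line and runs until the next section marker line such as "__gfx__", and that no line of your Lua code consists solely of such a marker.

```python
import re
from typing import Callable

# a section marker is a line like __gfx__, __map__, __sfx__ ...
_SECTION_RE = re.compile(r"^__\w+__$")


def process_lua_section(cart_text: str,
                        process: Callable[[str], str]) -> str:
    """Apply `process` to the code section of a .p8 file, keep the rest."""
    lines = cart_text.split("\n")
    try:
        start = lines.index("__lua__") + 1
    except ValueError:
        return cart_text  # no code section found: nothing to do
    # the code ends at the next section marker, or at end of file
    end = len(lines)
    for i in range(start, len(lines)):
        if _SECTION_RE.match(lines[i]):
            end = i
            break
    processed = process("\n".join(lines[start:end]))
    return "\n".join(lines[:start] + processed.split("\n") + lines[end:])
```

The `process` argument can be any text-to-text function, for example one that pipes the code through the minifier.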

Possible improvement on luamin for PICO-8

Only use lowercase characters to generate minified identifiers (see IDENTIFIER_PARTS in luamin.js). Otherwise, since PICO-8 automatically lowercases characters when opening code in the integrated editor, it may cause variable conflicts ("Ab" becomes "ab", which may be another variable in scope) or even prevent the cartridge from running ("Do" becomes "do", which is a keyword).
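To illustrate the idea (in Python rather than in luamin's own JavaScript, and without claiming this is how luamin generates names), a lowercase-only name generator that skips Lua keywords could look like this. The name `short_names` is hypothetical; a real minifier would also have to skip names already used as globals.

```python
import itertools
import string

LUA_KEYWORDS = frozenset({
    "and", "break", "do", "else", "elseif", "end", "false", "for",
    "function", "goto", "if", "in", "local", "nil", "not", "or",
    "repeat", "return", "then", "true", "until", "while",
})


def short_names():
    """Yield a, b, ..., z, aa, ab, ... using lowercase letters only."""
    for length in itertools.count(1):
        for letters in itertools.product(string.ascii_lowercase,
                                         repeat=length):
            name = "".join(letters)
            # never hand out a keyword such as 'do', 'if', 'in' or 'or'
            if name not in LUA_KEYWORDS:
                yield name
```

Because no uppercase letter is ever produced, lowercasing the minified code in the PICO-8 editor cannot merge two different identifiers.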

Debug code stripping

Multi-line

You can use whatever you want to flag debug code and remove it for the release build. However, you need some way to tell your pipeline that you are making a debug vs a release build.

On my projects, I use generic symbol-based preprocessing similar to the one in C# (or C++), but very much simplified. You define a set of symbols for your current build config (e.g. debug config defines ["assert", "log"], release config defines nothing). Then you surround parts you don't want in some build configs with special markers:

-- will be stripped in release
#if log
printh("Player hit!")
printh("Remaining HP: "..player.hp)
#endif

During build, any code surrounded by undefined symbols is stripped.

Of course, if you don't need one symbol per debug feature like me, you can just define "debug" for the debug config and surround all your debug code with "#if debug" and "#endif".

Step: pre-processing or post-processing, but I recommend pre-processing to make the core build faster, as there is less code to assemble
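A minimal sketch of this kind of symbol-based stripping follows. It is not the code of my preprocess script; the function name is hypothetical, and support for nested "#if" blocks is an assumption of the sketch.

```python
def strip_blocks(source: str, defined: set) -> str:
    """Drop every '#if symbol' ... '#endif' block whose symbol is undefined.

    The '#if' and '#endif' marker lines themselves are always removed.
    """
    out = []
    stack = []  # one bool per open '#if': True if its symbol is defined
    for line in source.split("\n"):
        stripped = line.strip()
        if stripped.startswith("#if "):
            stack.append(stripped[4:].strip() in defined)
        elif stripped == "#endif":
            if not stack:
                raise ValueError("#endif without matching #if")
            stack.pop()
        elif all(stack):
            # kept only if every enclosing '#if' symbol is defined
            out.append(line)
    if stack:
        raise ValueError("missing #endif")
    return "\n".join(out)
```

A debug build would call it with something like {"assert", "log"}, a release build with an empty set.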

Example

#if assert
assert(damage > 0)
#endif

player.hp = player.hp - damage

#if log
printh("Player hit!")
printh("Remaining HP: "..player.hp)
#endif

becomes in release config:

player.hp = player.hp - damage

Single-line

For single-line stripping, you can write a simple parser that strips lines calling certain functions.

For instance, I strip all single-line "assert(...)" calls if the "assert" symbol is not defined. It means that in the example above, I don't even need the "#if assert" anymore. It's very convenient when you have many logs and assertions. But multi-line stripping is still useful for more complex behaviors.

Step: same as multi-line
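Here is a rough sketch of that idea with a regular expression instead of a real parser. The names `STRIPPABLE` and `strip_single_line_calls` are hypothetical. It only removes lines that start with the call and end with a closing parenthesis, so a call spread over several lines is left untouched, and a line holding a second statement after the call would be removed entirely.

```python
import re

# functions whose calls are removed when the same-named symbol is undefined
STRIPPABLE = ("assert", "log")


def strip_single_line_calls(source: str, defined: set) -> str:
    """Remove whole lines like 'assert(...)' for undefined symbols."""
    to_strip = [name for name in STRIPPABLE if name not in defined]
    if not to_strip:
        return source
    names = "|".join(re.escape(name) for name in to_strip)
    # the line must consist of the call only: name ( ... )
    pattern = re.compile(rf"^\s*(?:{names})\s*\(.*\)\s*$")
    return "\n".join(line for line in source.split("\n")
                     if not pattern.match(line))
```

Running this after the multi-line stripping keeps both mechanisms driven by the same set of defined symbols.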

Example

assert(damage > 0)

player.hp = player.hp - damage

log("Player hit!")
log("Remaining HP: "..player.hp)

becomes in release config:

player.hp = player.hp - damage

Implementation example

See both multi-line and single-line stripping in preprocess.py

It also contains a reversed stripping tag "--[[#pico8" and "--#pico8]]" that comments out the code until it is built into a cartridge; but that's only useful if you run unit tests in pure Lua.

Going further

If your game has a lot of text, string compression is your next step. You can make your own, or check out a tool like p8advent (also see the post which mentions an alternative).

Also, if you're interested in the full build pipeline I use, check out the generic build script and the demo project script using the former.

build tool

freds72 • 2020-02-10 21:41

Good write-up/pointers.
So far, that sed script did the job:

/--\[\[/,/\]\]/d
s/--[ ]*(.*)$//
s/^[ \t]*//
s/[ \t]*$//
/^$/d
p

(doesn’t work with multi-line strings - but I never use them 😬)

Felice • 2020-02-14 21:27

> (it minifies even "__call" which breaks metatable logic).

Seems like picotool should be modified not to minify any name starting with two underscores. I doubt anyone uses such names and they're well-known to be reserved for lua internals.

dddaaannn • 2020-02-15 08:31

Great write-up huulong! I'm a little behind on picotool feature requests (ahem) but I still welcome more: https://github.com/dansanderson/picotool/issues

I agree with Felice that it's probably prudent to preserve all symbols that begin with double-underscore, and there are some newer names that I still need to block. Issue filed: https://github.com/dansanderson/picotool/issues/65

picotool's luamin already tries to preserve newlines, and isn't supposed to minify onto a single line. Is there a bug, or are you referring to the JavaScript luamin tool? Here is where whitespace is collapsed: https://github.com/dansanderson/picotool/blob/master/pico8/lua/lua.py#L1048

I'm not sure I understand the suggestion about the minified names and uppercase characters. The current implementation of picotool's luamin uses only lowercase letters for minified names, and avoids known reserved words. (See https://github.com/dansanderson/picotool/blob/master/pico8/lua/lua.py#L968.) Can you describe an example where this is failing?

Thanks again!

Felice • 2020-02-16 01:58

@dddaaannn

Can't speak for huulong, but I'd wonder if you're supporting the presence of the smallcaps (ascii uppercase) range that zep now allows in from external editors.

huulong • 2020-02-27 10:58

@dddaaannn

All the extra options and suggested improvements are for luamin by mathiasbynens (written in JavaScript for node), which I'm branching from, not the minify feature of picotool, which is written in Python like the rest of the tool. I couldn't test picotool's minify on my game because I needed to preserve meta-methods starting with "__", so I cannot say if there are other things I would have needed. But I think it's convenient that your code already preserves lines. Same thing for lowercase minified symbols.

It's much harder to preserve newlines with luamin JS, because after the tokenization step you have no idea what was separating tokens in the original code; that's also why I don't truly preserve newlines, I only put newlines where I would normally put a semicolon, which excludes things like function a()print("hello")end

@Felice

For luamin JS, if the original symbol contains uppercase characters that's fine, the minification table will map the symbols accordingly, e.g. "myValue" will always become "a" and "myvalue" will always become "b" without confusion. It's just that currently, it may also minify other symbols into "A" and "B" which will be confused after PICO-8 applies lowercasing when opening code inside the editor (mostly for live edit+debug or by accident, since you rarely want to actually work on minified code). Uppercase characters in strings are preserved, since strings are preserved.
