@zep seems like something in 0.2.0d (I don't know if it's present in earlier bugfixes of 0.2.0) is wonky with coroutines not updating sometimes or something?? different unpredictable problems are actually happening almost every time I run it; check this out (the dialogue is updated in a coroutine):
So far, most of the time it leads to a crash because variables that are declared inside coroutines get referenced by code in the main thread before they have been defined. That also points to the culprit being coroutines mysteriously not updating every frame like they should (the _update60 method in this cart calls coresume on both of the coroutines every frame; there is one for the dialogue and one for controlling the presentation of the "pins" in each level, and both of those are the things that seem to be breaking).
EDIT: okay I've done a bit more testing and there is definitely an issue where a coroutine just starts updating suuuper slowly (seems likely the same issue as is visible in the GIF above) and basically yields in the middle of itself where I don't have any yield statement. For reference, here is some code inside a coroutine where I added debug printh statements:
    repeat
      printh('offset.y:' .. offset.y)
      printh('vy:' .. vy)
      if offset.y >= maxtargetoffset then
        vy = -abs(vy)
      end
      printh('one')
      if targetzipy then
        if vy < 0 and offset.y < targetzipy then
          vy -= .1
        end
      elseif offset.y <= 0 then
        vy = abs(vy)
      end
      printh('two')
      offset.y += vy
      printh('three')
      for _, t in pairs(targets) do
        if not t.isknocked then
          t.y += vy
        end
      end
      printh('four')
      while playercount == 0 do
        yield()
      end
      yield()
    until state ~= state_play or all_offscreen(targets)
and here is the console output:
    vy:0.09
    done updating coroutines
    resuming coroutine 1
    resuming coroutine 2
    done updating coroutines
    resuming coroutine 1
    resuming coroutine 2
    done updating coroutines
    resuming coroutine 1
    resuming coroutine 2
    done updating coroutines
    resuming coroutine 1
    resuming coroutine 2
    done updating coroutines
    resuming coroutine 1
    resuming coroutine 2
    done updating coroutines
    resuming coroutine 1
    resuming coroutine 2
    one
    done updating coroutines
    resuming coroutine 1
    resuming coroutine 2
    done updating coroutines
    resuming coroutine 1
    resuming coroutine 2
    done updating coroutines
    resuming coroutine 1
    resuming coroutine 2
    done updating coroutines
    resuming coroutine 1
    resuming coroutine 2
    done updating coroutines
    resuming coroutine 1
    resuming coroutine 2
    done updating coroutines
    resuming coroutine 1
    resuming coroutine 2
    done updating coroutines
    resuming coroutine 1
    resuming coroutine 2
    two
    done updating coroutines
    resuming coroutine 1
    resuming coroutine 2
    done updating coroutines
    resuming coroutine 1
    resuming coroutine 2
    done updating coroutines
you can see it's yielding all by itself in the middle of those lines for some reason?? (i.e. the "offset.y:whatever", "vy:0.09", "one", "two", "three", "four" should all be next to each other in the console but they are interrupted by several frames)
YES!! I've been seeing EXACTLY the same issue!
(Been trying to diagnose it for the past day or so).
I'm using coroutines a LOT in UnDUNE II.
I was finding that CPU goes through the roof after a few seconds.
Turns out, things like pathfinding (coroutines are used for mapping paths - so they can be yielded across frames to keep FPS high) start fast as expected - but then just CRAWL to an almost halt after a few seconds, resulting in HUGE CPU spikes.
(The reason it may take a few seconds is for more complicated paths - would be much faster if I didn't yield, but game would stutter awfully)
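For anyone unfamiliar with the pattern being described, here is a minimal sketch of spreading a heavy job over several frames with an explicit yield. The names `make_job`, `run_frame` and the `budget` parameter are made up for illustration (this is not UnDUNE II's pathfinder), and the first four lines are only aliases so the snippet also runs in stock Lua:

```lua
-- aliases: use the pico-8 globals if present, else stock lua's coroutine library
local cocreate = cocreate or coroutine.create
local coresume = coresume or coroutine.resume
local costatus = costatus or coroutine.status
local yield = yield or coroutine.yield

-- hypothetical heavy job: sum 1..n, giving up the frame every `budget` steps
local function make_job(n, budget)
  local job = { total = 0, done = false }
  job.co = cocreate(function()
    for i = 1, n do
      job.total = job.total + i
      if i % budget == 0 then yield() end -- the only place this job should pause
    end
    job.done = true
  end)
  return job
end

-- call once per frame (e.g. from _update60)
local function run_frame(job)
  if costatus(job.co) ~= "dead" then
    local ok, err = coresume(job.co)
    assert(ok, err) -- surface errors instead of silently dropping them
  end
end

-- stand-in for the frame loop
local job = make_job(100, 10)
local frames = 0
while not job.done do
  run_frame(job)
  frames = frames + 1
end
```

With these numbers the job pauses after every 10 additions, so it needs 11 resumes to finish. The bug in this thread shows up as the coroutine also pausing at points other than the explicit `yield()`.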
Is this the intended "Final cpu adjustments", @zep?
(I'm really hoping not!! 😱)
I would hope this is just a bug, because this would absolutely break how coroutines are supposed to work. They're supposed to be synchronous, by definition. Otherwise they'd just be threads.
I was able to recreate this as well:
Same as what others have seen, the coroutine functions normally and then suddenly freezes:
...in this case, between frames 26 and 27 of this gif:
It looks like at some point, the CPU usage of the coroutine gets 1 added to it (maybe a dropped frame?) and from then on the CPU use of the coroutine is calculated as if it starts at 100% ... and therefore the coroutine is halted immediately (or possibly after a single instruction) every single frame.
Thanks @kittenm4ster, I found the problem. It was the worst kind of bug: caused by a last-minute fix for another, unimportant bug. I've updated the web players (you should see 0.2.0d3 on boot), and 0.2.0e binaries will be out before too long. The problem was caused by (as these example snippets show) coroutines running out of cpu cycles before the rest of the frame, and being force-yielded prematurely. I'm still working on the way coroutines interact with the virtual CPU limitations, but it shouldn't mean any change in program behaviour if everything is working correctly (the cpu usage reporting is still flaky, which can interfere with framerate regulation etc.)
Why have this ‘yield early’ logic for coroutines? It is not as if they were actually running in parallel - I don’t understand how a CPU-heavy coroutine could be exploited (i.e. it should be treated like any other function).
Oh thank goodness for that (that it's a bug - figured it HAD to be).
I couldn't see how this would be exploitable as it's synchronous (as @Felice says) - I just find it very useful to spread out heavy computation over many frames to avoid CPU spikes.
Anyway - on the plus side - this did give me an opportunity to complete @kittenm4ster's excellent "Alfonso's Bowling Challenge"!
(I played a couple of rounds when it was released - but didn't realise the game had THIS much depth)
Great game - well done! 😉👍
@freds72 from what I understand it’s not really that the coroutine is explicitly force-yielded, it’s that the user code as a whole is force-yielded, and it happens to be executing a coroutine at that moment.
As an aside: in Lua 5.2, the most straightforward way to implement CPU limits is to wrap the user code inside a coroutine and install a Lua debug hook that runs every X instructions and calls yield() when the virtual CPU is exhausted. I’m pretty sure this is what PICO-8 does, because 1) you can call yield() from e.g. _update() without getting an error, and 2) when you do so, it causes the framerate to drop to 15fps.
So, unfortunately there is no direct way for the instruction hook to yield the “main” coroutine, only the “current” running frame can be yielded. So PICO-8 probably has additional glue code in coresume() that yields recursively up to the top of the call stack and I’m guessing that this is what broke in the bug being discussed here. (Sorry for the long blabbering but I’ve been wondering how to implement this properly in my emulator and writing about this problem helped me understand it better!)
@samhocevar thanks for the thorough explanation - makes more sense from this angle
Sam's description is exactly right -- although more recently I've been moving to a single callback hook per frame, which is a little more dangerous when it goes wrong (dropping a whole frame of execution instead of e.g. 1024 instructions), but in general is proving much cleaner and easier to reason about.
> So, unfortunately there is no direct way for the instruction hook to yield the “main” coroutine, only the “current” running frame can be yielded.
I struggled with this for a long time, and as a result PICO-8 has never handled CPU limiting well with coroutines. It hasn't been a huge problem yet because normally coroutines are used for update logic that doesn't spend much cpu. Anyway, just last week I found a nice solution which is a little loopy, but is working well so far and should make it into 0.2.0e.
Superyielding
To get back to the bottom of the Lua callstack (a "superyield"), set a flag that causes 1. coresume() to yield once more after returning (by wrapping it) and 2. the debug callback to be called immediately on re-entering the vm mainloop, before executing any additional instructions. This way, each coroutine on the callstack will yield without running anything except the extra yield call, and the debug callback will in turn trigger and yield the next coroutine down, until the bottom of the Lua callstack is reached and the original lua_resume() call returns.
- coresume() wrapped so that when yielded by a superyield, it also yields itself once more:

    function coresume(c,...)
      local res=_coresume(c,...)
      while (_super_yielding() and costatus(c) == "suspended") do
        yield() -- during superyield
        res=_coresume(c,...) -- when recovering at start of next frame
      end
      return res
    end
- The debug hook being used both to trigger the superyield, and to immediately yield every time the Lua vm is re-entered (l->hookcount stays at 0).
    void cpu_limit_reached(lua_State* l, lua_Debug *ar) // debug hook
    {
      if (super_yielding) {
        // on our way down the stack. just yield.
        lua_yield(l, 0);
        return;
      }
      // this point is reached once at the end of each frame, when no more cpu is available.
      // add cycles spent by the lua vm to a running total here
      super_yielding = 1; // will be reset to 0 when lua_resume()'ing the next frame.
      lua_yield(l, 0);
    }
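The unwinding can be imitated in plain Lua, as a rough sketch only. Everything here is a stand-in: `cpu_exhausted()` plays the role of the debug hook firing (a real hook yields from C, which script code cannot do), `super_yielding` is an ordinary variable, and `raw_resume` / `wrapped_resume` are illustrative names for `_coresume` and the wrapped `coresume`. The hook re-firing immediately on VM re-entry is not modelled.

```lua
-- aliases: use the pico-8 globals if present, else stock lua's coroutine library
local cocreate = cocreate or coroutine.create
local raw_resume = coresume or coroutine.resume
local costatus = costatus or coroutine.status
local yield = yield or coroutine.yield

local super_yielding = false -- stand-in for the flag the debug hook sets
local log = {}

-- stand-in for the debug hook firing when the frame's cpu budget is spent
local function cpu_exhausted()
  super_yielding = true
  yield()
end

-- same shape as the wrapped coresume() above
local function wrapped_resume(c, ...)
  local res = raw_resume(c, ...)
  while super_yielding and costatus(c) == "suspended" do
    yield()                  -- pass the superyield one level further down
    res = raw_resume(c, ...) -- next frame: pick the inner coroutine back up
  end
  return res
end

local inner = cocreate(function()
  log[#log + 1] = "inner a"
  cpu_exhausted() -- frame budget runs out in the middle of the inner coroutine
  log[#log + 1] = "inner b"
end)

local outer = cocreate(function()
  log[#log + 1] = "outer a"
  wrapped_resume(inner)
  log[#log + 1] = "outer b"
end)

-- the host resumes the bottom of the stack once per frame, clearing the flag first
local function run_frame()
  super_yielding = false
  raw_resume(outer)
end

run_frame()
local after_frame_1 = #log
run_frame()
```

After the first frame only "outer a" and "inner a" have been logged: both coroutines are suspended and control is back at the host. The second frame continues from the exact point where the budget ran out, so "inner b" is logged before "outer b".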
p.s. excuse the off-topic blabbering -- likewise, to help myself understand it better.
It seems my cart still has a bug related to coroutines not fixed in 0.2.0d3 (click on the clock in the first room, you'll see what I mean). It's probably related to the above quote, "PICO-8 has never handled CPU limiting well with coroutines." I sort of went ham with them (but only where the main logic was bare-bones or not used).
Anyway, if the newer pico8 can't handle coroutines as I used them, I'll probably just clean it up. The clock rendering was kind of sloppily done, but I can see it breaking other older carts.
Edit: It has been fixed in 0.2.0e!
@zep thanks for the insightful explanation! But that made me think: isn’t there a problem with using l->hookcount instead of a shared reference to a counter inherited by new Lua threads? If the user creates a function f that uses at most 1023 Lua instructions and replaces calls to f(...) with coresume(cocreate(f),...) the debug hook will never be called and they will get tons of free CPU… see for instance the following program, which uses only 1% CPU on 0.2.0d:
    local a,b=0xdead.beef,0xbead.29ba
    function r(n)
      b=a+b<<>16
      a=a+b
      return a%n
    end
    function f(x)
      for x=0,100 do
        pset(r(128),r(128),r(16))
      end
    end
    function _update()
      for x=0,100 do
        coresume(cocreate(f),x) -- instead of f(x)
      end
    end
(sorry for spoiling another good exploit 😄)
Posting here to be able to find this thread later.
Coroutines automatically yielding has been discussed a few times on the discord server but I wasn’t able to find a reference.
Also I’ve had buggy coroutines that made the whole game hang (before I added useful error reporting with 'printh(exc)' + 'stop(trace(coro,exc))'), but if I understand correctly a non-buggy coroutine can’t make the process hang.
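For reference, a sketch of the kind of error reporting meant here. `checked_resume` is a made-up helper name, and the aliases at the top only exist so the snippet also runs in stock Lua (where `debug.traceback` stands in for `trace` and `print` for `printh`):

```lua
-- aliases: use the pico-8 globals if present, else stock lua equivalents
local cocreate = cocreate or coroutine.create
local coresume = coresume or coroutine.resume
local costatus = costatus or coroutine.status
local trace = trace or debug.traceback
local printh = printh or print

-- resume `co`; returns true on success, or false plus a report string
local function checked_resume(co, ...)
  if costatus(co) == "dead" then
    return false, "coroutine is dead"
  end
  local ok, exc = coresume(co, ...)
  if ok then return true end
  -- coresume swallows errors, so build a report from the failed coroutine
  local report = trace(co, tostring(exc))
  printh(report)
  return false, report
end

local buggy = cocreate(function()
  local t = nil
  return t.x -- deliberate bug: indexing a nil value
end)

local ok, report = checked_resume(buggy)
local ok2, report2 = checked_resume(buggy) -- second call: coroutine is already dead
```

In a cart, calling `stop(report)` instead of returning halts at the failure, which is the `stop(trace(coro,exc))` approach mentioned above. Without a check like this a failed `coresume` just returns false and the cart keeps running with the coroutine dead.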