Hello,
I have a feeling I don't fully understand vectors. I am trying to work out the optimal way to update a large number of entities for something like a Vampire Survivors game. Right now, in my testing, parallel arrays appear to have the lowest CPU impact: with 1000 mobs moving towards the player, CPU is 0.601, while metatables go up to 0.823.
I tried vectors this morning and my CPU stat is 1.4.
Vectors are a bit new to me and I'm wondering if I'm going about this the right way.
function _init()
  mobs = {}
  player = {}
  player.v = vec(240, 130)
  for i = 1, 1000 do
    local spawn_radius = max(470, 270) / 2 + 50
    local player_x, player_y = player.v:get(0, 2)
    local spawn_x = player_x
    local spawn_y = player_y
    local angle = rnd(1)
    local x = spawn_x + spawn_radius * cos(angle)
    local y = spawn_y + spawn_radius * sin(angle)
    v = vec(x, y)
    add(mobs, {v = v, spd = rnd(0.5)})
  end
end

function _update()
  for mob in all(mobs) do
    -- Calculate direction vector (from mob to player)
    local direction = vec(0, 0)
    -- Get mob and player positions
    local mob_x, mob_y = mob.v:get(0, 2)
    local player_x, player_y = player.v:get(0, 2)
    -- Create direction vector
    direction = vec(player_x - mob_x, player_y - mob_y)
    -- Normalize the direction (make it length 1)
    local mag = direction:magnitude()
    if mag > 0 then
      direction = direction:div(mag)
      mob.v = mob.v:add(direction:mul(mob.spd))
    end
  end
end

function _draw()
  cls()
  for mob in all(mobs) do
    local x, y = mob.v:get(0, 2)
    spr(1, x, y)
  end
  print(stat(1), 0, 0, 8)
end



Parallel arrays are actually a performance optimization in many cases outside of Pico-8, but for different reasons than why they appear to be efficient in Pico-8.
The key to understanding CPU performance in Pico-8 is knowing what things cost in terms of cycles; see https://pico-8.fandom.com/wiki/CPU for a full explanation.
The takeaway is that everything you do costs cycles, so the less code you execute per frame, the faster it will be. Vectors are slow here simply because creating them and calling their methods adds a lot of extra steps; parallel arrays are fast because there are fewer table creations, and you probably aren't using method calls.
Having said that, the main bottleneck in your code might be the vector normalization, as vector:magnitude() is probably using the sqrt() function, which costs a whopping 48 cycles.
There is a much faster way to get a normalized vector than dividing by magnitude though. I don't have Pico-8 to hand so can't test this but it should work.
-- assign global trig functions to local variables outside loop to save cycles (6 cycles)
local atan2 = atan2
local cos = cos
local sin = sin

-- within loop, normalize (dx, dy) to (nx, ny) (13 cycles)
local a = atan2(dx, dy) -- (5 cycles)
local nx = cos(a) -- (4 cycles)
local ny = sin(a) -- (4 cycles)
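To see why the angle trick gives the same direction as dividing by the magnitude, here is a minimal sketch in plain Lua rather than PICO-8. It assumes the standard `math` library, which works in radians, whereas PICO-8's trig functions work in turns. The function names `normalize_sqrt` and `normalize_angle` are made up for illustration.

```lua
-- Plain Lua sketch (not PICO-8 code): math.cos/math.sin take radians.
-- Two-argument arctangent: math.atan2 in Lua 5.1, math.atan(y, x) in 5.3+.
local atan2 = math.atan2 or math.atan

-- Normalize by dividing by the Euclidean length (needs a sqrt).
function normalize_sqrt(dx, dy)
  local mag = math.sqrt(dx * dx + dy * dy)
  if mag == 0 then return 0, 0 end -- a zero vector has no direction
  return dx / mag, dy / mag
end

-- Normalize by converting to an angle and back (no sqrt).
function normalize_angle(dx, dy)
  if dx == 0 and dy == 0 then return 0, 0 end
  local a = atan2(dy, dx)
  return math.cos(a), math.sin(a)
end
```

Both functions return a unit-length vector pointing the same way; which one is cheaper depends entirely on the cycle costs of the platform, so it is worth measuring with `stat(1)` rather than assuming.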



Hi @supercurses
It is true that :magnitude is a little expensive, but it is much cheaper in Picotron than in PICO-8. The main CPU danger with vectors is getting data in and out of them, and creating new tiny 2x1 userdata objects for every operation. Whenever possible, I'd recommend keeping operations in userdata form (instead of getting and setting components), and using e.g. a:add(b,true) instead of a = a + b. The true argument means the result is written to a instead of creating a new userdata object and garbage collecting the old one.
Here's another version of _update that removes temporary object creation by reusing a single local vector (direction) so that no new userdata objects are created. I also replaced the all(mobs) for loop to avoid the function call overhead. This one runs at around 35% at the start, and reaches ~62% once all the sprites are visible:
function _update()
  local direction = vec(0,0)
  for i=#mobs,1,-1 do -- backwards in case you want to delete something
    local mob = mobs[i]
    player.v:sub(mob.v,direction) -- direction = player.v - mob.v
    direction:mul(mob.spd / direction:magnitude(),true)
    mob.v:add(direction,true)
  end
end
There is not a huge advantage in using actor.vec_xy over actor.x,actor.y to reduce CPU ~ I imagine their main use is nicer syntax. To increase performance for a large number of entities, I think the best bet would be to keep them in one large 2d userdata, but that only works for very simple logic / movement that might not apply here (e.g. like in /system/demos/pixeldust.p64).
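The difference between allocating a result and writing it in place can be illustrated with ordinary Lua tables. This is only an analogy, not Picotron userdata: `new_vec`, `add_new`, `add_in_place` and the `allocations` counter are hypothetical names for this sketch.

```lua
-- Plain Lua analogy using tables instead of Picotron userdata.
allocations = 0 -- counts how many vector tables have been created

function new_vec(x, y)
  allocations = allocations + 1
  return { x = x, y = y }
end

-- Allocating form: like a = a + b, a new object is created every call.
function add_new(a, b)
  return new_vec(a.x + b.x, a.y + b.y)
end

-- In-place form: like a:add(b, true) as described above, the result is
-- written into a, so nothing new is created or garbage collected.
function add_in_place(a, b)
  a.x = a.x + b.x
  a.y = a.y + b.y
  return a
end
```

With 1000 mobs and several vector operations each, the allocating form creates thousands of short-lived objects per frame, which is the overhead the in-place form avoids.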



update: 25% cpu at the start if you replace the "for all(mobs)" in the draw function with "for i=1,#mobs ..." -- all() and foreach() are huge cpu hogs for large iterations!
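A rough way to see where the extra cost comes from: an iterator-style loop calls a function for every element, while a numeric loop only indexes the table. The sketch below is plain Lua, and `all_like` is a hypothetical stand-in written for illustration, not the real implementation of all().

```lua
iterator_calls = 0 -- counts how often the iterator function is invoked

-- Hypothetical stand-in for all(): returns a closure that yields each element.
function all_like(t)
  local i = 0
  return function()
    iterator_calls = iterator_calls + 1
    i = i + 1
    return t[i] -- nil at the end stops the generic for loop
  end
end

-- Iterator loop: one function call per element, plus one to detect the end.
function sum_with_iterator(t)
  local total = 0
  for v in all_like(t) do
    total = total + v
  end
  return total
end

-- Numeric loop: no per-element function call, just indexing.
function sum_with_index(t)
  local total = 0
  for i = 1, #t do
    total = total + t[i]
  end
  return total
end
```

Both loops compute the same result; the iterator version simply pays for a function call on every step, which adds up over 1000 mobs in both _update and _draw.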



Thanks @zep. After switching all my tests to for i=1, #mobs, there is now only a marginal difference in CPU between parallel arrays (mob_x table, mob_y table) and metatables, though there is obviously still a difference in memory.
I might have the terms wrong here (figures are CPU / memory):
Parallel Arrays (SOA?) (mob_x table, mob_y table) - 0.6016 / 1904794
Array of Structures (add(mobs, {x=10, y=10})) - 0.6731 / 2041270
Closure-based OOP (function that returns update and draw functions) - 0.6016 / 2401290
Metatable-OOP - 0.6017 / 2041322
Will try the userdata approach next.
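For reference, "structure of arrays" (SoA) is the usual name for parallel arrays, and "array of structures" (AoS) for one table per mob. A minimal plain Lua sketch of the two layouts follows; the function names are made up for illustration.

```lua
-- Array of structures (AoS): one small table per mob.
function make_aos(n)
  local mobs = {}
  for i = 1, n do
    mobs[i] = { x = i, y = i * 2 }
  end
  return mobs
end

-- Structure of arrays (SoA, parallel arrays): one table per field.
function make_soa(n)
  local mob_x, mob_y = {}, {}
  for i = 1, n do
    mob_x[i] = i
    mob_y[i] = i * 2
  end
  return mob_x, mob_y
end

-- Moving every mob touches n small tables in the AoS layout...
function move_aos(mobs, dx, dy)
  for i = 1, #mobs do
    local m = mobs[i]
    m.x = m.x + dx
    m.y = m.y + dy
  end
end

-- ...but only two large tables in the SoA layout.
function move_soa(mob_x, mob_y, dx, dy)
  for i = 1, #mob_x do
    mob_x[i] = mob_x[i] + dx
    mob_y[i] = mob_y[i] + dy
  end
end
```

The SoA layout creates two tables in total instead of one per mob, which fits the lower memory figure reported for parallel arrays above.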
Closure was created by Claude...
-- Closure-based approach
function create_mob(x, y)
  -- These local variables become the object's private state
  local x = x
  local y = y
  local spd = 1

  -- The table we return contains methods that "close over" the local variables
  return {
    update = function()
      local dx = player.x - x
      local dy = player.y - y
      local distance = abs(dx) + abs(dy)
      dx = dx / distance
      dy = dy / distance
      -- These functions can access and modify the local variables
      x = x + dx * spd
      y = y + dy * spd
    end,
    draw = function()
      spr(1, x, y)
    end,
    get_pos = function()
      return x, y
    end
  }
end
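One thing to be aware of in the closure code: dividing by abs(dx) + abs(dy) (the Manhattan distance) avoids a square root, but the result is not a unit vector, so mobs move more slowly along diagonals than along the axes. It also divides by zero when a mob sits exactly on the player. A plain Lua sketch of the difference, with hypothetical function names and a zero guard added:

```lua
-- Direction scaled by Manhattan distance, as in the closure code above,
-- with a guard added for the zero-distance case.
function step_manhattan(dx, dy)
  local distance = math.abs(dx) + math.abs(dy)
  if distance == 0 then return 0, 0 end
  return dx / distance, dy / distance
end

-- Direction scaled by Euclidean distance: always unit length.
function step_euclidean(dx, dy)
  local distance = math.sqrt(dx * dx + dy * dy)
  if distance == 0 then return 0, 0 end
  return dx / distance, dy / distance
end

-- Length of a step, used to compare the two approaches.
function length(x, y)
  return math.sqrt(x * x + y * y)
end
```

Along an axis both give a step of length 1, but for a diagonal such as (1, 1) the Manhattan version gives (0.5, 0.5), a step of length about 0.707. Whether that matters is a gameplay decision; the cheaper version may be perfectly acceptable for a swarm of mobs.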



Hmm... maybe I don't have the right approach for userdata. CPU for this test is 0.6791, which is the highest.
Spawning:
mobs = userdata("f64", 5, mob_count)
for i = 0, mob_count-1 do
  local spawn_radius = max(470, 270) / 2 + 50
  local angle = rnd(1)
  local x = player.x + spawn_radius * cos(angle)
  local y = player.y + spawn_radius * sin(angle)
  mobs:set(0, i, x) -- Set x position
  mobs:set(1, i, y) -- Set y position
  mobs:set(2, i, 1) -- Set speed
  mobs:set(3, i, 0) -- DX
  mobs:set(4, i, 0) -- DY
end
Updating:
for i = 0, mob_count - 1 do -- rows are 0-based, matching the spawn loop
  local mob_x = mobs:get(0, i)
  local mob_y = mobs:get(1, i)
  local mob_speed = mobs:get(2, i)
  local dx = player.x - mob_x
  local dy = player.y - mob_y
  local distance = abs(dx) + abs(dy)
  mobs:set(3, i, (dx / distance) * mob_speed)
  mobs:set(4, i, (dy / distance) * mob_speed)
end
-- add dx, dy to x, y
mobs:add(mobs, mobs, 3, 0, 2, 5, 5, mob_count)
Drawing:
for i=0, mob_count-1 do
  local x = mobs:get(0, i)
  local y = mobs:get(1, i)
  spr(1, x, y)
end
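To make the intent of the final mobs:add(...) line in the update step easier to follow, here is a plain Lua sketch of a strided add over a flat array. It assumes the 5 x mob_count userdata is stored row by row, so that field col of mob row sits at flat index row * 5 + col. That layout, and the names `flat_index` and `strided_add`, are assumptions for illustration, not Picotron's actual implementation or signature.

```lua
WIDTH = 5 -- fields per mob: x, y, speed, dx, dy

-- Flat position of a field, assuming row-by-row storage (0-based).
function flat_index(col, row)
  return row * WIDTH + col
end

-- For each span (one per mob), add `len` values starting at src_offset
-- into the values starting at dest_offset, then jump `stride` elements.
function strided_add(data, src_offset, dest_offset, len, stride, spans)
  for span = 0, spans - 1 do
    local base = span * stride
    for k = 0, len - 1 do
      local d = base + dest_offset + k
      data[d] = data[d] + data[base + src_offset + k]
    end
  end
end
```

Called as strided_add(data, 3, 0, 2, 5, mob_count), this adds each mob's dx, dy (columns 3 and 4) to its x, y (columns 0 and 1), which is what the comment "add dx, dy to x, y" describes. The per-mob get/set loop before it still runs in Lua, so most of the cost in this test probably remains there.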