I have started measuring function costs precisely, because I like accurate things. The results are all on the wiki, though the list is not yet complete.
Here are a few funny things I already learned:
- <code>x^.5</code> costs 16 cycles, whereas <code>sqrt(x)</code> costs 27
- <code>x^4</code> costs 8 cycles, but <code>x*x*x*x</code> only costs 3
Some of these, such as clipped <code>circ()</code>, are pretty tricky to measure. I hope someone can help!
Edit: removed claim about shl() because that function behaves a bit differently.
Out of interest, how are you measuring these things in the first place? Some sort of sampling profiler?
I just use stat(1) and stat(2) and call the code 1024 times:
n = 1024

-- calibrate
flip()
x,t=stat(1),stat(2)
for i=1,n do end
y,u=stat(1),stat(2)

-- measure sqrt(i)
for i=1,n do sqrt(i) end
function c(t0,t1,t2) return(t0+t2-2*t1)*128/n*256/60*256 end
z,v=stat(1),stat(2)

print("lua cycles: "..c(x-t,y-u,z-v))
print("system cycles: "..c(t,u,v))
This prints the cycle counts for sqrt(i):
lua cycles: 3
system cycles: 24
Thanks - didn't know about stat(2).
Perhaps if I knew anything about Lua internals I'd be nodding sagely at this point, but I guess I've got some reading up to do..
I've done a bunch of this.
Take care that there are periodic interrupts, possibly for audio, so if you time long code, it can return unexpected results.
Also note that your variable type will affect the timing, e.g. sqrt(i) will be different if 'i' is local or global.
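To check the local vs. global point yourself, here is a rough sketch that reuses the stat(1) method from earlier in the thread. The function name compare_local_global and the test value 2 are made up for illustration, and it only looks at stat(1), so the numbers are total cycles (lua plus system). How much the two results differ, if at all, is something to verify on your own setup.

```lua
-- sketch only: compare sqrt() on a global vs. a local argument.
-- same idea as the snippet above: read stat(1) around an empty
-- loop (calibration) and around each measured loop.
n = 1024
g = 2 -- global argument (hypothetical test value)

function compare_local_global()
  local l = 2 -- local argument, same value
  flip()
  local a = stat(1)
  for i=1,n do end          -- calibration: empty loop
  local b = stat(1)
  for i=1,n do sqrt(g) end  -- argument read from a global
  local c = stat(1)
  for i=1,n do sqrt(l) end  -- argument read from a local
  local d = stat(1)
  -- frame fraction -> cycles per call, same constants as c() above
  local function cyc(frac) return frac*128/n*256/60*256 end
  -- subtract the empty-loop cost (b-a) from each measured loop
  print("global: "..cyc((c-b)-(b-a)))
  print("local: "..cyc((d-c)-(b-a)))
end
```

Call compare_local_global() once, for example from _init(). Keep n small enough that all three loops fit inside one frame, otherwise the readings stop being comparable.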
this snippet probably saved my game!
is the '60' for 60 fps? should I change it for a 30 fps game?
function c(t0,t1,t2) return(t0+t2-2*t1)*128/n*256/60*256 end
btw, I get a minus value with this sometimes, any ideas what I'm doing wrong?
I've expanded on this snippet, and written out an explanation of what every term in that calculation is doing: https://www.lexaloffle.com/bbs/?tid=46117
(tl;dr: 128*256*256 comes from pico-8's speed (8MHz), and the 60 comes from 60fps)
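For anyone who wants the terms spelled out in code, here is a minimal sketch of the same arithmetic with names attached. The helper name cycles_per_call and the fps parameter are my own additions, not something from the original snippet. It assumes stat(1) and stat(2) report CPU use as a fraction of one frame, so treat passing anything other than 60 for fps as an assumption to check against the linked thread.

```lua
-- hypothetical helper: same arithmetic as c() from the thread,
-- with n and the frame rate passed in instead of hard-coded.
-- t0, t1, t2: readings before the empty loop, after the empty
-- loop, and after the measured loop (fractions of a frame).
function cycles_per_call(t0, t1, t2, n, fps)
  -- extra frame fraction used by the n measured calls once the
  -- empty calibration loop is subtracted: (t2-t1)-(t1-t0)
  local frac = t0 + t2 - 2 * t1
  -- 128*256*256 = 8388608 cycles per second (the 8MHz figure).
  -- the factors are interleaved with the divisions, as in the
  -- original, so intermediate values stay small.
  return frac * 128 / n * 256 / fps * 256
end
```

With n = 1024 and fps = 60 this computes the same thing as c(t0,t1,t2), e.g. cycles_per_call(x-t, y-u, z-v, 1024, 60) for the lua cycles.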