Okay, bear with me, because this is a pretty niche bug and also I can't post the actual code because it's spoilers for a demo I'm putting together.
I have code that basically works like this:
local a,b={},{}
--a loop that populates a,b with 128 nums each
::_::
cls()
--some code that involves nested looping through a,b to draw pixel-by-pixel on the screen
flip()
goto _

This is not a super uncommon design pattern for my tweetcarts. One thing I'd intended to do was hard-code the contents of a,b as table literals rather than include the code that populates them. So I did, and the performance tanked.
Weird. Maybe there's a performance difference I'm not aware of?
What was odd is that I know a and b are local in both cases, and they take up the same storage as far as I can tell. I started poking around, and I tried this:
local a,b={},{}
--a loop that populates a,b with 128 nums each
local anew, bnew = {...my constants...}, {...my other constants...}
::_::
cls()
--some code that involves nested looping through a,b to draw pixel-by-pixel on the screen
--note that this code never references anew, bnew
flip()
goto _

Oddly, the performance still takes a hit, even though anew,bnew are defined outside the loop and never referenced inside it. Even nil-ing them out doesn't reverse the performance hit.
Am I being silly, or is this behavior strange?
I noticed this only happens when using a top-level goto loop. Doing the same drawing from _draw() does not show the slowdown (provided the table definition stays outside _draw(), of course).
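For comparison, the _draw() variant described above might look roughly like this. It is only a sketch: d is a short stand-in for the long table literal, and the noise loop is borrowed from the examples in this thread rather than from the original demo.

```lua
-- stand-in for the long constant table; defined once, outside _draw()
local d={1,2,3}

-- pico-8 calls _draw() once per frame, so no goto loop or flip() is needed
function _draw()
 cls()
 for y=0,31 do
  for x=0,127 do
   pset(x,y,rnd(16))
  end
 end
end
```

Swapping the stand-in for the full literal and watching the ctrl+P monitor should show whether the table still costs anything in this form.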
Also, if the loop is just
local d={1,2,3,4 ....}
::_::
cls()
flip()
goto _

There is no measurable difference if nothing is done inside the loop besides the cls() and the flip(). Below is a working example (comment d out and run to see the difference). On the ctrl+P monitor, CPU usage reads 36% with d commented out and 42% with it left in.

local d={1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,666,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,666}
::_::
cls()
for y=0,31 do
 for x=0,127 do
  pset(x,y,rnd(16))
 end
end
flip()
goto _
If you exceed some sort of program code threshold, then performance takes a hit.
The code below has a perf of 70%.
- if you comment out any of the first three lines then perf becomes 47%
- if you remove an element from table d139 or table d1 then perf becomes 47%
- if you change 1==1 to 1 then perf becomes 47%
- if you comment out a=0 then perf becomes 47%
- if you add another element to table d139 or d1 then perf becomes 94%
- if you add another command before the while loop then perf becomes 94%
- if you change the third line from local d1={1} to d1={1} then perf becomes 94%
- if you uncomment b=0 then perf becomes 94%
cls()
d139={1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9}
local d1={1}
while 1==1 do
 a=0
 -- b=0
 for i=0,32767 do
  k=0
 end
 flip()
end

For some reason this performance hit doesn't occur if you comment out k=0.
It doesn't matter if you use while loop or goto. You don't need to use a table to reproduce. You can replace the long table with 144 cls() calls or 72 e=0 calls and you will get the same bug.
Trying kometbomb's experiment:
local d={1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,666,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,666}
::_::
cls()
for y=0,31 do
 for x=0,127 do
  pset(x,y,rnd(16))
 end
end
flip()
goto _
yields 42%, but moving the array to the end as follows:
::_::
cls()
for y=0,31 do
 for x=0,127 do
  pset(x,y,rnd(16))
 end
end
flip()
goto _
local d={1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,666,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,666}
reduces it back to 36%. That tells me that it's not quite as simple as the total program size (although maybe it's total parsed so far?)
Repeating some of kometbomb and rilden's experiments, I decided to put local d back up front and figure out how many items go in the table before the performance hit happens.
local d={
 1,2,3,4,5,6,7,8,9,10,
 1,2,3,4,5,6,7,8,9,20,
 1,2,3,4,5,6,7,8,9,30,
 1,2,3,4,5,6,7,8,9,40,
 1,2,3,4,5,6,7,8,9,50,
 1,2,3,4,5,6,7,8,9,60,
 1,2,3,4,5,6,7,8,9,70,
 1,2,3,4,5,6,7,8,9,80,
 1,2,3,4,5,6,7,8,9,90,
 1,2,3,4,5,6,7,8,9,100,
 1,2,3,4,5,6,7,8,9,110,
 1,2,3,4,5,6,7,8,9,120,
 1,2,3,4,5,6,7,8,9,130,
 1,2,3,4,5,6,7,8,9,140,
 1,2,3
}
::_::
cls()
for y=0,31 do
 for x=0,127 do
  pset(x,y,rnd(16))
 end
end
flip()
goto _
At 143 items, the CPU is at 42%, as before. Oddly, when you cut it to 142, it goes down to 39%, and at 141, it goes to 36%, where it stays no matter how many more items you remove from d. Of course, I compared this to the following:
local d={}
for i=1,143 do
 add(d,i)
end
::_::
cls()
for y=0,31 do
 for x=0,127 do
  pset(x,y,rnd(16))
 end
end
flip()
goto _

(Here, as previously established, changing 143 to 1430 or 14300 had no performance impact.)
I also tried to establish that the number of characters used to initialize d didn't mean anything. The following code with 141 items stayed at 36% (and exhibited the strange behavior at 142 and 143+) despite increasing the character count:
local d={
 100,200,300,400,500,600,700,800,900,1000,
 100,200,300,400,500,600,700,800,900,2000,
 100,200,300,400,500,600,700,800,900,3000,
 100,200,300,400,500,600,700,800,900,4000,
 100,200,300,400,500,600,700,800,900,5000,
 100,200,300,400,500,600,700,800,900,6000,
 100,200,300,400,500,600,700,800,900,7000,
 100,200,300,400,500,600,700,800,900,8000,
 100,200,300,400,500,600,700,800,900,9000,
 100,200,300,400,500,600,700,800,900,10000,
 100,200,300,400,500,600,700,800,900,11000,
 100,200,300,400,500,600,700,800,900,12000,
 100,200,300,400,500,600,700,800,900,13000,
 100,200,300,400,500,600,700,800,900,14000,
 1
}
::_::
cls()
for y=0,31 do
 for x=0,127 do
  pset(x,y,rnd(16))
 end
end
flip()
goto _
I don't think it's a straightforward token count, either. Adding a simple a=0 to the front meant I had to cut the table down to 139 to get back to 36%, 140 for 39%, 141+ to get 42%. Adding b=0 to the front cut the table another two items to remain at 36%.
a=0
b=0
local d={1...137}
::_::
cls()
for y=0,31 do
 for x=0,127 do
  pset(x,y,rnd(16))
 end
end
flip()
goto _
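Since the slowdown seems tied to the literal in the source rather than to the table at runtime, hunting for the threshold means retyping the literal at different lengths. A small generator can save that effort. This is a minimal sketch: make_literal is a hypothetical helper name, and it assumes PICO-8's printh with the "@clip" target to put the text on the host clipboard. It is meant to be run from a separate scratch cart, then pasted at the top of the test cart.

```lua
-- hypothetical helper: builds the source text "local d={1,2,...,n}"
function make_literal(n)
 local s="local d={"
 for i=1,n do
  s=s..i
  if i<n then s=s.."," end
 end
 return s.."}"
end

-- in pico-8, copy the generated line to the clipboard
-- (guarded so the helper still loads where printh is not defined)
if printh then printh(make_literal(143),"@clip") end
```

Changing 143 to 141 or 142 and pasting the result over the existing local d line should reproduce the 36% / 39% / 42% steps reported above, assuming nothing else ahead of the loop changes.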
You don't even need to edit the program to elicit this bug.
If you copy the noise generation 'for' loop into a function, and then optionally call either the function or run the inline loop at runtime based on a button, the function code (with call overhead) runs at 0.36 vs. the slower(!) inline version at 0.42.
Run this code and try pressing any button:
local d={1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42,34,6661,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,4,5,6,7,78,8,5,5,5,4,65,4,6,45,3,4,3,42}
function wtf()
 for y=0,31 do
  for x=0,127 do
   pset(x,y,rnd(16))
  end
 end
end
::_::
cls()
if btn()!=0 then
 wtf()
else
 for y=0,31 do
  for x=0,127 do
   pset(x,y,rnd(16))
  end
 end
end
flip()
goto _