I've been thinking about ways to execute fast pixel effects on Picotron, and so I'm looking for ways to perform the effect entirely in userdata-land, and display the results as sprites. I've tried a few approaches, none of which seems to be quite ideal.
- Do math on `f64` userdata, then render that to the screen. This gets me fast math operations, but of course the rendered result is completely garbled. I suppose some pack-ints-into-floats type nonsense could be attempted? But with just basic arithmetic ops this seems like a bad time.
- Do math on `f64` userdata, then cast that to `u8` in a loop. This is a lot of time spent calling getters/setters in a loop, and may not be any better than `pset`, not sure.
- Do math on integer userdata, then cast to `u8` using the `userdata:convert()` method and display that. This would work fine if I could figure out how to get fast fixed-point going. Unfortunately, div and mod are slow, and shifts don't seem to work.
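For reference, here is roughly the fixed-point arithmetic the third approach needs, as a minimal sketch. It is plain Python rather than Picotron code, used only to show the arithmetic. The 8.8 format and the names `FRAC_BITS`, `to_fixed`, `fixed_mul` and `fixed_to_u8` are assumptions for illustration, not part of any Picotron API.

```python
# Sketch of 8.8 fixed-point math on plain integers (not Picotron code).
# FRAC_BITS and all helper names are hypothetical, chosen for illustration.
FRAC_BITS = 8
ONE = 1 << FRAC_BITS  # 256 represents 1.0


def to_fixed(x):
    """Convert a float to 8.8 fixed point."""
    return int(round(x * ONE))


def fixed_mul(a, b):
    """Multiply two fixed-point values; the product carries 16 fractional
    bits, so it has to be rescaled by 2**FRAC_BITS afterwards."""
    return (a * b) >> FRAC_BITS


def fixed_mul_div(a, b):
    """Same rescale using floor division instead of a shift."""
    return (a * b) // ONE


def fixed_to_u8(a):
    """Drop the fractional bits and wrap into the 0..255 range."""
    return (a >> FRAC_BITS) & 0xFF


a = to_fixed(1.5)   # 384
b = to_fixed(2.0)   # 512
product = fixed_mul(a, b)
print(product, fixed_to_u8(product))
```

Every multiply needs one rescale by 2^8, so without a fast shift (or a fast divide by a constant) applied across a whole userdata, the rescale step ends up dominating the cost.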
I think my preferred resolution would be to allow `convert()` to go from `f64` to `u8`, but anything that allows any kind of math with fractional quantities, either floating-point or fixed-point, would be great. Drawing `f64` userdata with `spr` with implicit casting would be great too. Just trying to avoid explicit loops to get to `u8`....
Or perhaps there's some other way to get what I want with what's in Picotron already and I've just missed it. Not quite sure what's doable with the 10-15ish ops/pixel this would get best case but I'm sure something neat can happen.
Other requests:
- I would love to have min/max/abs or similar "sharp" functions available for userdata, to allow masking off out-of-range data or other similar conditional-ish use cases.
- trig would be amazing but I assume that's probably off-limits, or at least would be quite expensive.
- Even more amazing would be the ability to index userdata with userdata (and at this point built-in trig is probably not necessary)
edit: Why did I think `spr()` was necessary for the below? `get_display():copy(pixels,true)` works fine in place of `spr()` and this all runs at 60fps with CPU usage at 0.813 now.
I wrote a full-screen tunnel effect using a 480x270 userdata. Unfortunately, the effect only runs at 20fps. Normalizing all `stat(1)` CPU usage values to 60fps, I get:
- full effect: 2.778
- everything except spr(): 0.770
- no spr(), no u8 conversion: 0.538
- no spr(), u8 conversion, or effect: 0.016
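Assuming the stages are roughly additive (an assumption, not something measured directly), differencing those numbers gives a rough per-stage cost. This is a plain Python sketch of that arithmetic only; the stage labels and variable names are made up for illustration.

```python
# Rough per-stage cost, assuming the measured stages simply add up.
# Values are the normalized CPU figures listed above.
full_effect = 2.778      # everything, including spr()
without_spr = 0.770      # everything except spr()
without_convert = 0.538  # no spr(), no u8 conversion
baseline = 0.016         # no spr(), u8 conversion, or effect

stage_cost = {
    "spr()": round(full_effect - without_spr, 3),
    "u8 conversion": round(without_spr - without_convert, 3),
    "effect math": round(without_convert - baseline, 3),
    "baseline": baseline,
}

# Fraction of the total attributed to each stage.
stage_share = {name: cost / full_effect for name, cost in stage_cost.items()}

for name, cost in stage_cost.items():
    print(f"{name}: {cost:.3f} ({stage_share[name]:.0%} of total)")
```

Read this way, `spr()` accounts for about 2.0 of the 2.778 total, which is well over half, while the effect math and the `u8` conversion together come to roughly 0.75.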
The graphics docs I think had a `spr()` design goal of at least 6 times a frame at 60fps. I'm curious if this is a bug, a spec change, a TODO?
Also - looks like strided copies are quite slow, perhaps ~3x the cost of non-strided copies? Or maybe I am doing something wrong.