I've finally managed to get a clean 3D wireframe engine up and running. It does backface culling but otherwise is as plain as it comes.
Working through the maths and pulling it together from various sources was a pain, as writing a software renderer is a bit of a lost art form with D3D/OpenGL doing all the maths for you these days.
So I thought people might appreciate a simple example to start from.
This doesn't do any camera transforms - the camera is at 0,0 staring down the z-axis.
Oh, use the arrow keys and z/x to rotate the cube on the three axes and s/f to move the cube along the z-axis.
I don't have clipping planes implemented yet, so you can move the cube behind the camera and have it go crazy.
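For anyone wondering why the cube goes crazy behind the camera, here is a minimal sketch of a perspective projection with the camera at the origin looking down the z-axis. The `project` helper, the 128x128 screen centre and the `focal` value are assumptions for illustration, not taken from the cart.

```lua
-- hypothetical helper: project a camera-space point onto a 128x128 screen.
-- assumes the camera sits at the origin looking down +z, and that
-- `focal` is a made-up scale factor (not a value from the cart).
local function project(x, y, z, focal)
  local sx = 64 + focal * x / z  -- screen x, centred on 64
  local sy = 64 - focal * y / z  -- screen y, flipped so +y is up
  return sx, sy
end

-- in front of the camera: behaves as expected
local ax, ay = project(1, 1, 2, 64)
-- behind the camera (z < 0): the point mirrors through the screen centre
local bx, by = project(1, 1, -2, 64)
```

Without a near clipping plane, nothing stops z from reaching zero or going negative, so the divide either blows up or mirrors the point.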
I have one big optimisation left on the table. I'm recomputing the polygon normals from scratch every single frame rather than making them part of the model data and transforming them with the rest of the model during the world transform.
Now I just need some model data for spaceships from Elite and I'll be good to go!
If you're just doing backface culling with the normals, you can replace that completely by just checking if you are drawing the triangle in the wrong order (clockwise vs. counter-clockwise).
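A rough sketch of that winding-order test, working on the already-projected 2D points. The function names are hypothetical, and which sign counts as "front" is an assumption: it depends on how the model's faces are wound and on whether screen y points up or down.

```lua
-- twice the signed area of a projected triangle.
-- the sign flips when the vertex order flips.
local function signed_area2(x1, y1, x2, y2, x3, y3)
  return (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
end

-- assumption: front faces are wound so the value above is positive.
-- swap the comparison if your models use the opposite winding.
local function is_backface(x1, y1, x2, y2, x3, y3)
  return signed_area2(x1, y1, x2, y2, x3, y3) <= 0
end
```

This needs no normals at all, only two multiplies and a few subtractions per triangle.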
I was planning on experimenting with flat shading as my next step so I'll keep them in for now - but if I abandon that I'll remember your tip.
Found a source of Elite spaceships
I never knew how badly I wanted a PICO-8 demake of Elite until right this second. Oh, man. Oh, man.
Nice work on the wireframe rendering code.
Right, my next step is to tidy up the code to create a generic concept of a mesh and a model, a mesh being a description and a model being a game world object.
Then once I can have multiple models in the world I either have to do a full-on z-buffer or simply draw models in z-order (I will be going with option number 2!)
Then I extend the code to allow every polygon in a model to have its own colour/colour gradient (so the engine exhausts on the Elite models can be red, for instance) and then I might call it a day on this.
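Drawing models in z-order could look something like the sketch below: sort far to near, then draw in that order so nearer models paint over farther ones. The `z` field and the "larger z is farther" convention are assumptions. The sort is written by hand because PICO-8 has no built-in table sort.

```lua
-- insertion sort, farthest model first.
-- assumes each model has a hypothetical `z` field holding its
-- camera-space depth, with larger z meaning farther away.
local function sort_far_to_near(models)
  for i = 2, #models do
    local m = models[i]
    local j = i - 1
    -- shift nearer models right until m's slot is found
    while j >= 1 and models[j].z < m.z do
      models[j + 1] = models[j]
      j = j - 1
    end
    models[j + 1] = m
  end
  return models
end

local sorted = sort_far_to_near({{z = 3}, {z = 10}, {z = 1}})
```

Insertion sort is a reasonable fit here because the order changes little from frame to frame, so most passes do almost no work.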
Oh, I suppose I should do the camera transform as well before I can say it is complete
If you want to see me play around with this then follow me on Twitter @twitonatrain to see hilarious bugs and the like
Zomg - I've just realised that my projection matrix is transposed. This explains every subtle (and really not subtle at all) bug that I have!
The Big Mega Massive Update
The code's commented, has a positionable camera, lovely dithered shading and a choice of solid, wireframe or vertex rendering.
Input 1 rotates a tree, input 2 rotates the camera.
Hopefully this code should be relatively easy to understand.
Cool! The dithering and shading looks nice. Were you able to get it up to the benchmark you wanted?
Do you have plans for a game using it, or just making the rendering engine?
Very cool! If you're looking to optimize your triangle fill function, check out this thread: https://www.lexaloffle.com/bbs/?tid=28317
I'm planning on doing a walking simulator with it, taking a walk through the woods.
Regarding speed-ups (which the engine still needs), I'm exploring two avenues. My dithering code precludes me from getting the cheap, easy and non-intuitive rectfill speed-up, and I think a full z-buffer (so per-pixel setting) would be too slow, so I'm thinking about scanline rendering with an active edge table. I'm also investigating options for speeding up drawing static geometry (going to work out the code complexity/size of BSP trees).
Ah ha, just spotted a dumb arse inefficiency in my code: I'm constantly recalculating the gradients for the lines in my scanline function. I can just calculate them once and pass them. Right, let's see how much faster the code is once I pull that out.
Experimenting with different triangle fill functions. Given that I am sticking with my dither I am limited in how much I can optimise, but it is absolutely insane that switching from LINE to RECTFILL speeds up my engine from 1800 triangles a second to 4050 triangles a second.
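As an illustration of hoisting the gradients out of the per-row work, here is a sketch for a flat-bottomed triangle. Both function names are hypothetical, and it returns spans instead of drawing so it stays independent of the actual fill routine. It assumes the bottom edge is strictly below the apex.

```lua
-- inverse slope (change in x per scanline) of one edge.
-- assumes y2 ~= y1.
local function edge_slope(x1, y1, x2, y2)
  return (x2 - x1) / (y2 - y1)
end

-- apex at (ax, ay); bottom corners at (bx, by) and (cx, by), by > ay.
-- returns one {y, x1, x2} span per scanline.
local function flat_bottom_spans(ax, ay, bx, cx, by)
  local ml = edge_slope(ax, ay, bx, by)  -- computed once per triangle
  local mr = edge_slope(ax, ay, cx, by)  -- computed once per triangle
  local spans = {}
  local xl, xr = ax, ax
  for y = ay, by do
    spans[#spans + 1] = {y = y, x1 = xl, x2 = xr}
    xl = xl + ml  -- per row: just two additions
    xr = xr + mr
  end
  return spans
end

local spans = flat_bottom_spans(4, 0, 0, 8, 4)
```

The per-row cost drops to two additions, with both divides done once up front.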
Shame I'm stuck with MEMSET based methods.
just in case you missed it, there's this thing you should know about rectfill:
https://www.lexaloffle.com/bbs/?tid=28374
that might be fixed in next update (?)
Good to know. I'm sticking with MEMSET due to wanting dither so my preference would be for MEMSET to take 0 CPU and everything else to be massively increased CPU usage ;)
New triangle renderer knocks 0.13 off of CPU usage when rendering the reference cherry at the cost of occasional pixel gapping in triangle edges.
I deem this... acceptable.
Two questions:
1) Are you still doing arbitrary polys, as when you were doing line-based drawing? I notice your first filling demo had a cube made of triangles. Are you tessellating n-gons to triangles for rasterization? I keep meaning to try writing a renderer that does native n-gon rasterizing, assuming the n-gons have convex, coplanar boundaries. It's more logic for following edges down the screen but potentially saves a lot of per-tri and per-span overhead.
2) Could you optionally call rectfill() when your fill value is 11,22,...,FF? On average, half of your polys will be solid. If you start doing 1/4 and 3/4 dithering too, then you'd only need memset for the even lines of 1/4, both lines of 2/4, and the odd lines of 3/4. Similar savings but more gradient options.
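A small sketch of the check in point 2. A screen byte holds two 4-bit pixels, so a fill byte whose two nibbles match (0x11, 0x22, ..., 0xFF) is really a solid colour. The function names are hypothetical, and the arithmetic avoids bitwise operators so it runs as plain Lua too.

```lua
-- true when both 4-bit halves of the fill byte are the same colour.
local function is_solid_fill(b)
  local lo = b % 16         -- low nibble
  local hi = (b - lo) / 16  -- high nibble
  return hi == lo
end

-- hypothetical dispatcher: names which routine a span would use.
local function pick_fill(b)
  if is_solid_fill(b) then
    return "rectfill"  -- solid colour, the cheap call applies
  end
  return "memset"      -- two different colours, dither via memory writes
end
```

In the real fill routine the two branches would call rectfill() and memset() directly rather than return a name.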
1) I'm turning n-sided polygons into triangle fans to draw; logically they remain n-gons right up until the polyfill call. Next on my list of things to learn is to get my head round scanline rendering and see if that is as fast. I am cheating as I don't depth sort the triangles before drawing, only the models, so concave models have 'hidden poly' overdrawing, like the stalk and leaf of the cherry.
2) That's a very good idea, but given that a new PICO-8 version might radically change rectfill's performance, I will stick with 100% memset for now and wait and see.
The final thought I've had is that maybe sprite blitting is fast and I could do shading by blitting sprites? Anyone tried that?
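For reference, the triangle-fan split from point 1 can be sketched like this. It only produces vertex indices, and it assumes the polygon is convex with its vertices listed in order around the boundary. The function name is made up.

```lua
-- split an n-sided convex polygon into n-2 triangles that all
-- share vertex 1. returns triples of 1-based vertex indices.
local function fan_triangles(n)
  local tris = {}
  for i = 2, n - 1 do
    tris[#tris + 1] = {1, i, i + 1}
  end
  return tris
end

local quad = fan_triangles(4)  -- {1,2,3} and {1,3,4}
```

Every triangle in the fan keeps the winding order of the original polygon, so a winding-based backface test gives the same answer for all of them.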
I ran three bits of code that filled the first row of the screen through my benchmark. These are the cycle counts for each:
CYC  CODE
---  ---------------------------------
  6  RECTFILL(0,0,127,0)
384  SSPR(0,0,128,1,0,0)
 69  MEMSET(0x6000,0,64)
So, uh, nope. And the non-scaling SPR() works in units of sprites, so that's also gonna take too long.
Hey, @Felice can you run that same benchmark for RECT() and LINE()?
Also, can you tell me how to get those figures for cycles in pico-8? I've just been using stat(1), but this seems more accurate than that.
Basically I have a for loop that's tuned to run for exactly one second without anything inside. Other testing showed that the overhead of the for loop is one cycle per iteration, so every second added to the run is one more cycle added to each iteration, and thus one more cycle for the code inside the loop.
My stuff's a little more complex than this, but this is the basics of what you need to test the cost of code:
local __t=time()
for __i=0,0x40.afda,0x.0001 do
  <insert code here>
end
__t=time()-__t
print("cycle count = "..flr(__t-0.5))
Oh, and line(0,0,127,0), which is the line()-equivalent of the other code I tested, comes in at 134 cycles.
Edit: Just saw you wanted rect(), not the rectfill() I already did. A 128x1 rect() comes in at 122 cycles. 128x128 comes in at 238.
That's a super handy test rig Felice. Thanks for that.
The rect and rectfill cycle counts are just so wacky.
Yeah they are. I think it's a programmer's old friend at fault: the off-by-one error. The API has both xyxy and xywh calls, and I imagine there are a few call costs calculated on the wrong assumption.