- Platform: Raspberry Pi 3, Linux
- Languages: C, ARM Assembly
- Duration: 1 month
- Project: Two programmers implementing and optimizing a voxel terrain graphics algorithm.
About this project
This is the final project of the low-level module at ESAT Valencia. It focuses on implementing a graphics algorithm, optimizing it as far as possible, and profiling every change made.
For the assignment we had to choose which algorithm to implement and optimize. We chose voxel terrain because of its complexity and its similarities to an actual game engine.
A voxel terrain renderer has two key steps. First, it samples heights from a height map (procedurally generated or read from a texture); then it draws those heights on the screen one by one.
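The two steps can be sketched in C as follows. This is a minimal illustration, not the project's code: `sample_heights`, `draw_column`, `MAP_SIZE`, `SCREEN_H` and the projection parameters are hypothetical names chosen for the example, and the map is assumed to be a square, power-of-two array of 8-bit heights. Drawing goes from the nearest sample to the farthest and remembers the highest row already filled, so pixels hidden behind nearer terrain are never written.

```c
#include <assert.h>
#include <stdint.h>

#define MAP_SIZE 8    /* hypothetical tiny height map; power of two so coordinates wrap with a mask */
#define SCREEN_H 16   /* hypothetical screen height in pixels */

/* Step 1: sample `count` heights along a ray that starts at (x, y)
 * and advances by (dx, dy) per sample. Coordinates are assumed non-negative. */
static void sample_heights(const uint8_t *map, int x, int y, int dx, int dy,
                           uint8_t *out, int count)
{
    assert(count >= 0);
    for (int i = 0; i < count; ++i) {
        out[i] = map[(y & (MAP_SIZE - 1)) * MAP_SIZE + (x & (MAP_SIZE - 1))];
        x += dx;
        y += dy;
    }
}

/* Step 2: draw one screen column from the nearest sample to the farthest.
 * column[0] is the top row. Returns the number of pixels written. */
static int draw_column(const uint8_t *heights, int count, int camera_height,
                       int horizon, int scale, uint8_t *column)
{
    int lowest_free = SCREEN_H;  /* rows >= lowest_free are already covered */
    int written = 0;
    for (int z = 1; z <= count; ++z) {
        /* Perspective projection: farther samples move toward the horizon. */
        int top = horizon - (heights[z - 1] - camera_height) * scale / z;
        if (top < 0) top = 0;
        for (int row = top; row < lowest_free; ++row) {
            column[row] = heights[z - 1];  /* use the height as a gray shade */
            ++written;
        }
        if (top < lowest_free) lowest_free = top;
    }
    return written;
}
```

A frame would call `sample_heights` once per ray and `draw_column` once per screen column; the sketch leaves out camera rotation and coloring.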
The main optimizations are:
- Occlusion culling
- Fixed-point MIP levels
- Reads and writes reordered so that memory is accessed sequentially
- Sampling moved to ARM assembly (for optimized register usage)
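As an illustration of the fixed-point and MIP-level items, the sketch below steps along a row of the height map using 16.16 fixed-point coordinates and can read from a coarser MIP level. It is an assumption-laden example rather than the project's implementation: the 16.16 format, the 2x2 box filter, and the names `build_next_mip`, `sample_fixed` and `sample_row` are all hypothetical. Stepping along a row with `dy = 0` also reads memory in increasing address order, which is the idea behind the sequential-access item.

```c
#include <assert.h>
#include <stdint.h>

#define FP_SHIFT 16
#define FP_ONE   (1u << FP_SHIFT)   /* 1.0 in 16.16 fixed point */

/* Build the next MIP level: each texel is the rounded average of a 2x2 block.
 * src is src_size x src_size, dst is (src_size/2) x (src_size/2). */
static void build_next_mip(const uint8_t *src, int src_size, uint8_t *dst)
{
    assert(src_size >= 2 && (src_size & (src_size - 1)) == 0);
    int dst_size = src_size / 2;
    for (int y = 0; y < dst_size; ++y) {
        for (int x = 0; x < dst_size; ++x) {
            const uint8_t *p = src + (2 * y) * src_size + 2 * x;
            int sum = p[0] + p[1] + p[src_size] + p[src_size + 1];
            dst[y * dst_size + x] = (uint8_t)((sum + 2) / 4);
        }
    }
}

/* Read one height. x and y are 16.16 coordinates in level-0 texels;
 * shifting by `level` extra bits converts them to texels of that MIP level. */
static uint8_t sample_fixed(const uint8_t *level_map, int level_size, int level,
                            uint32_t x, uint32_t y)
{
    uint32_t mask = (uint32_t)level_size - 1u;   /* wrap: size is a power of two */
    uint32_t ix = (x >> (FP_SHIFT + level)) & mask;
    uint32_t iy = (y >> (FP_SHIFT + level)) & mask;
    return level_map[iy * (uint32_t)level_size + ix];
}

/* Sample `count` heights, advancing by (dx, dy) with integer adds only. */
static void sample_row(const uint8_t *level_map, int level_size, int level,
                       uint32_t x, uint32_t y, uint32_t dx, uint32_t dy,
                       uint8_t *out, int count)
{
    for (int i = 0; i < count; ++i) {
        out[i] = sample_fixed(level_map, level_size, level, x, y);
        x += dx;   /* unsigned arithmetic wraps, so no overflow checks are needed */
        y += dy;
    }
}
```

A renderer built this way would pick a higher `level` for distant samples, where several level-0 texels fall on the same pixel anyway.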
/*
These are some time measurements we did during profiling and optimization.
sampleMap is the function to compute heights for the terrain.
DrawTerrain draws the terrain in the screen pixel to pixel.

1 - No optimization and DrawTerrain as columns (no drawing rotation):
    sampleMap: 17.378000 ms, 10426800 cycles, 63.640137 cycles/iteration
    DrawTerrain: 8.459000 ms, 5075400 cycles, 30.977783 cycles/iteration

2 - Sampling in horizontal
    sampleMap: 14.132000 ms, 8479200 cycles, 51.752930 cycles/iteration
    DrawTerrain: 6.854000 ms, 4112400 cycles, 25.100098 cycles/iteration

3 - Precomputing of projection / z
    sampleMap: 13.211000 ms, 7926600 cycles, 48.380127 cycles/iteration
    DrawTerrain: 5.247000 ms, 3148200 cycles, 19.215088 cycles/iteration

4 - Sampling heights as fixed point variables:
    sampleMap: 10.617000 ms, 6370200 cycles, 38.880615 cycles/iteration
    DrawTerrain: 4.492000 ms, 2695200 cycles, 16.450195 cycles/iteration

5 - SampleMap using fixed point variables
    sampleMap: 6.525000 ms, 3915000 cycles, 23.895264 cycles/iteration
    DrawTerrain: 6.089000 ms, 3653400 cycles, 22.298584 cycles/iteration

6 - DrawTerrain Horizontal (texture changed from 512 to 1024 and zFAR from 32 to 1024)
    sampleMap: 30.009001 ms, 18005400 cycles, 27.474060 cycles/iteration
    DrawTerrain: 7.962000 ms, 4777200 cycles, 7.289429 cycles/iteration

7 - LODs (Adding of MIP levels for optimized far distance sampling)
    sampleMap: 26.969999 ms, 16182000 cycles, 30.947828 cycles/iteration
    DrawTerrain: 6.495000 ms, 3897000 cycles, 7.452953 cycles/iteration

8 - Screen rotation (draw in rows and applying tile rotation for the final draw in the screen)
    sampleMap: 23.782000 ms, 14269200 cycles, 30.088562 cycles/iteration
    DrawTerrain: 6.586000 ms, 3951600 cycles, 8.332490 cycles/iteration
    Rotate: 7.050000 ms, 4230000 cycles, 13.769531 cycles/iteration

9 - ARM SampleMap (sampleMap function moved to ARM assembly, we optimized the register usage)
    sampleMap: 26.461000 ms, 15876600 cycles, 33.477985 cycles/iteration
    DrawTerrain: 6.342000 ms, 3805200 cycles, 8.023786 cycles/iteration
    Rotate: 8.740000 ms, 5244000 cycles, 17.070312 cycles/iteration
*/
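For readers who want to check how the three columns relate, the figures above are consistent with a 600 MHz clock (10426800 cycles / 17.378 ms) and, in the first run, with 163840 iterations per call (10426800 cycles / 63.640137 cycles per iteration). Both values are inferred from the numbers, not stated in the measurements, so the helpers below treat them as assumptions; `ASSUMED_CLOCK_HZ`, `ms_to_cycles` and `cycles_per_iteration` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: clock rate inferred from the measurements, not given by the authors. */
#define ASSUMED_CLOCK_HZ 600000000.0

/* Convert a duration in milliseconds to clock cycles, rounded to the nearest cycle. */
static uint64_t ms_to_cycles(double ms)
{
    assert(ms >= 0.0);
    return (uint64_t)(ms * (ASSUMED_CLOCK_HZ / 1000.0) + 0.5);
}

/* Average cost of one loop iteration, given the total cycles of a call. */
static double cycles_per_iteration(uint64_t cycles, uint64_t iterations)
{
    assert(iterations > 0);
    return (double)cycles / (double)iterations;
}
```

For example, `ms_to_cycles(17.378)` gives the 10426800 cycles reported for `sampleMap` in run 1, and dividing that by 163840 iterations gives about 63.64 cycles per iteration.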