Kaihua Hu
Master of Science Capstone Project, June 2020
Ray tracing is an important rendering algorithm that naturally supports advanced effects such as realistic reflections, refraction, and shadows. It can synthesize images with striking realism, comparable to photographs or videos captured in the real world. However, because of its intensive computational requirements and the limitations of traditional hardware, ray-traced image generation has largely been limited to offline batch processing. With recent increases in hardware performance, the latest GPUs (graphics processing units) are becoming capable of meeting the computational requirements of ray tracing in real time.
In this project, we implemented a real-time ray-tracing system based on the Unity game engine platform. Our system delivers typical rendering features via a ray-tracing pipeline, including camera manipulation, anti-aliasing, environment mapping, and support for customized mesh models, materials, light sources, Phong illumination, reflection, and shadows. Our system also provides a simple UI (user interface) for Unity users to configure the rendering process in real time. For example, the user can define a skybox and increase the number of samples for anti-aliasing. The user can even modify material properties and the generation of reflection rays while the ray-tracing rendering system is running.
We carefully examined the performance of our real-time ray-tracing system and identified its bottlenecks. Somewhat surprisingly, the performance results indicated that communication between the CPU (central processing unit) and the GPU was not a limiting factor. Instead, as expected in all ray-tracing systems, the number of ray intersection computations was the main issue. In our case, the vast majority of frame time was spent by the GPU on computing intersections. This bottleneck was relieved with a BVH-tree (bounding volume hierarchy tree) spatial acceleration structure. With the acceleration structure in place, the bottleneck of the system shifted to BVH-tree construction on the CPU. We addressed this issue by applying a lazy updating strategy for BVH-tree construction. With these optimizations, based on our benchmark scenes, the resulting system achieved a speed-up of over 100 times, scaling from 144 triangles to 17,000 triangles rendered in real time.
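The lazy updating strategy is not specified in detail here, so the following is only a minimal sketch of one plausible reading: scene changes merely mark the tree as stale, and the tree is rebuilt at most once, when the next ray query needs it. The sketch is written in Python for brevity rather than in Unity's own languages, and every name in it (`Node`, `build`, `LazyBVH`, `move`, `query`, `rebuilds`) is hypothetical. Axis-aligned boxes stand in for triangle bounds, a median split is used for simplicity, and the scene is assumed to be non-empty.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]
Box = Tuple[Vec3, Vec3]  # (min corner, max corner); stands in for a triangle's bounds


@dataclass
class Node:
    lo: Vec3
    hi: Vec3
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    items: Optional[List[int]] = None  # set only on leaves: indices into the box list


def ray_hits_box(origin: Vec3, direction: Vec3, lo: Vec3, hi: Vec3) -> bool:
    """Slab test: intersect the ray's parameter interval with each axis slab."""
    t_near, t_far = 0.0, float("inf")
    for a in range(3):
        if direction[a] == 0.0:
            # Ray is parallel to this slab: it must already lie inside it.
            if not (lo[a] <= origin[a] <= hi[a]):
                return False
            continue
        t0 = (lo[a] - origin[a]) / direction[a]
        t1 = (hi[a] - origin[a]) / direction[a]
        if t0 > t1:
            t0, t1 = t1, t0
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
    return t_near <= t_far


def build(boxes: List[Box], ids: List[int], leaf_size: int = 2) -> Node:
    """Top-down build: split at the median along the longest axis."""
    lo = tuple(min(boxes[i][0][a] for i in ids) for a in range(3))
    hi = tuple(max(boxes[i][1][a] for i in ids) for a in range(3))
    if len(ids) <= leaf_size:
        return Node(lo, hi, items=list(ids))
    axis = max(range(3), key=lambda a: hi[a] - lo[a])
    ordered = sorted(ids, key=lambda i: boxes[i][0][axis] + boxes[i][1][axis])
    mid = len(ordered) // 2
    return Node(lo, hi,
                build(boxes, ordered[:mid], leaf_size),
                build(boxes, ordered[mid:], leaf_size))


class LazyBVH:
    """Defers reconstruction until a query actually needs an up-to-date tree."""

    def __init__(self, boxes: List[Box]):
        self.boxes = list(boxes)
        self.root: Optional[Node] = None
        self.dirty = True   # no tree has been built yet
        self.rebuilds = 0   # counter kept only to make the laziness observable

    def move(self, index: int, box: Box) -> None:
        # A scene edit only marks the tree stale; no construction work happens here.
        self.boxes[index] = box
        self.dirty = True

    def query(self, origin: Vec3, direction: Vec3) -> List[int]:
        if self.dirty:
            self.root = build(self.boxes, list(range(len(self.boxes))))
            self.rebuilds += 1
            self.dirty = False
        hits, stack = [], [self.root]
        while stack:
            node = stack.pop()
            if not ray_hits_box(origin, direction, node.lo, node.hi):
                continue  # prune the whole subtree
            if node.items is not None:
                hits += [i for i in node.items
                         if ray_hits_box(origin, direction, *self.boxes[i])]
            else:
                stack += [node.left, node.right]
        return sorted(hits)
```

In this sketch, any number of `move` calls between two queries costs a single rebuild, and queries against an unchanged scene cost none. A real-time system would likely refine this further, for example by refitting bounds instead of rebuilding; that refinement is an assumption and is not claimed by the text above.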