COMP 770 Program 3: Ray Tracing Part 2

Download Project Source/Scenes/Mac-Binaries: Program3.tar.gz


For the first part of the Ray Tracing project, I added quite a few extra features. One of those extra features was the recursive calculation of Specular Reflection, Dielectric Reflection, and Dielectric Transmission. I considered myself pretty lucky, since this is one of the two features that we were required to add for this second part. However, I wasn’t quite in the clear. Let’s just say that I had a looser understanding of ray tracing than I thought.

New Features

Many of the features of my Ray Tracer were implemented in the first part, and the full list can be seen on that project post. The following are new features:

KD Tree

  • Mid-Axis Partitioning
  • SAH Partitioning
  • Cost-Based Termination of leaf nodes for both
  • Recursive KD Tree Traversal
  • KD Tree Printing in Debug Builds


  • Fixed Ray Tracing bugs
  • Dramatically Improved Ray Tracing Performance
  • Interactive Ray Shooting in Debug Builds

Default Configuration

When you launch my Ray Tracer, the following are the defaults that are used unless you specify otherwise.

  • Dimensions: 500×500
  • Sampling: 4 x 4 Adaptive Jittered Supersampling
  • Ray Casting: Blinn-Phong Lighting with an Ambient Factor
  • Ray Tracing: Specular Reflections, Dielectric Reflections, and Dielectric Transmission supported through recursion terminated based on a contribution threshold
  • Multi-Processing: Uses multiple CPU cores through OpenMP
  • KD Tree: Built using SAH cost analysis to determine best split and when to terminate branches
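The contribution-threshold termination mentioned in the list above can be sketched roughly as follows. This is only an illustration, not the project's code: `traceChain`, its parameters, and the `maxDepth` safety cap are all hypothetical, and the "scene" is reduced to a single chain of reflections that each scale the ray's contribution by a fixed reflectivity.

```cpp
#include <cassert>

// Counts how many rays get traced along one reflective chain before the
// accumulated contribution falls below the threshold.
// contribution: weight this ray would have in the final pixel color
// reflectivity: factor applied at every bounce (hypothetical constant material)
// maxDepth:     assumed safety cap, so a perfect mirror cannot recurse forever
int traceChain(double contribution, double reflectivity,
               double threshold, int maxDepth = 64) {
    assert(reflectivity >= 0.0 && reflectivity <= 1.0);
    if (contribution < threshold || maxDepth == 0) {
        return 0;  // too faint to matter (or too deep): stop recursing
    }
    // A real tracer would shade the hit point here, then spawn the next ray.
    return 1 + traceChain(contribution * reflectivity, reflectivity,
                          threshold, maxDepth - 1);
}
```

With a reflectivity of 0.5 and a threshold of 0.01, the chain stops after 7 rays, because the eighth would only carry a weight of about 0.0078. The depth cap only comes into play for materials that never attenuate.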

Ray Tracing

The optics involved with refraction are not very intuitive to me. Initially, I thought that an image seen through a glass sphere would be reduced, but instead it’s actually magnified. I also had a rather serious bug when calculating refraction: I was computing the refracted ray using the dot product of the ray direction and the surface normal with the wrong sign. Correcting that made a tremendous difference.
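To make the sign issue concrete, here is a minimal sketch of a refraction routine based on the standard vector form of Snell's law. The `Vec3` type, the `refract` signature, and the convention that the normal faces the incoming ray are assumptions for illustration, not the project's actual code.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator*(double s, Vec3 v) { return {s * v.x, s * v.y, s * v.z}; }
inline double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// d:   unit incident direction, travelling toward the surface
// n:   unit normal on the side the ray arrives from, so dot(d, n) <= 0
// eta: n_incident / n_transmitted (about 1.0 / 1.5 when entering glass from air)
// Writes the refracted direction to `out`; returns false on total internal
// reflection, in which case `out` is left untouched.
bool refract(Vec3 d, Vec3 n, double eta, Vec3& out) {
    // The minus sign matters: d points into the surface and n points out of
    // it, so the cosine of the incident angle is -dot(d, n), not dot(d, n).
    double cosI = -dot(d, n);
    assert(cosI >= 0.0);  // flip n (and invert eta) first if the ray is inside
    double sin2T = eta * eta * (1.0 - cosI * cosI);
    if (sin2T > 1.0) return false;  // total internal reflection
    double cosT = std::sqrt(1.0 - sin2T);
    out = eta * d + (eta * cosI - cosT) * n;
    return true;
}
```

Under these conventions, a ray hitting the surface head-on passes straight through, and a ray leaving glass at a steep enough angle reports total internal reflection instead of producing a direction.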

Once I started rendering the scene with 16 spheres, I started to realize that I had some serious additive errors in my calculations of the transmissive component. There were two reasons for this. Firstly, I was calculating the reflective and transmissive components inside of the loop that calculated the Phong shading for each light source, so they were added once per light. Fixing this corrected several of the bright spots, and it led to a significant speed improvement.

Secondly, I was calculating the Phong shading and reflective components for illumination points inside of a sphere. This scenario arose whenever I was calculating the color for a refracted ray transmitted through a sphere. The refracted ray would intersect the other side of the sphere on the inside, and at that point I should have only been calculating the transmissive component. Making this change also led to a dramatic speedup.
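The structure that results from those two fixes can be sketched like this. It is a deliberately simplified illustration: colors are single scalars, and the `Material` struct, the `shade` function, and its parameters are all hypothetical names rather than the project's real interfaces. In an actual tracer the reflected and transmitted values would come from recursive ray casts that are only spawned when needed.

```cpp
#include <cassert>
#include <vector>

struct Material {
    double reflectivity;    // weight of the reflected ray's color
    double transmissivity;  // weight of the refracted ray's color
};

// phongPerLight: Blinn-Phong term already evaluated for each light source
// reflected / transmitted: colors returned by the recursive secondary rays
double shade(const Material& m, bool hitFromInside,
             const std::vector<double>& phongPerLight,
             double reflected, double transmitted) {
    assert(m.reflectivity >= 0.0 && m.transmissivity >= 0.0);
    double color = 0.0;
    if (!hitFromInside) {
        // Only the local lighting belongs inside the per-light loop.
        for (double p : phongPerLight) color += p;
        // The reflective term is added once, after the loop.
        color += m.reflectivity * reflected;
    }
    // Exit points inside a sphere contribute the transmissive term only.
    color += m.transmissivity * transmitted;
    return color;
}
```

With two lights, the reflected color is counted once rather than twice, and a hit on the inside of a sphere skips the lighting loop and the reflection entirely.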

KD Tree

My KD Tree was actually a deceleration structure for much of the project. I had several issues both when building the tree and when traversing it. I started by creating a KD Tree that simply divided the space at the midpoint of the split axis.

The first issue that I had was that I was creating a new bounding box around the primitives in each of the newly split subspaces. This is very problematic, because it created overlapping spaces whenever a single primitive was shared between both spaces. Once I drew a clear delineation between space partitioning and bounding volume hierarchies, I was able to clean up my KD Tree and I saw fewer artifacts.
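A minimal sketch of the space-partitioning approach, as opposed to refitting a box around each child's primitives: the child boxes are cut from the parent at the split plane, so they tile the parent exactly and never overlap. The `AABB` layout and the function names here are assumptions made for illustration.

```cpp
#include <cassert>

struct AABB { double min[3]; double max[3]; };

// Cut the parent box at `pos` along `axis`. The children share the split
// plane as a face and are NOT refitted to the primitives they contain.
void splitBox(const AABB& parent, int axis, double pos,
              AABB& left, AABB& right) {
    assert(axis >= 0 && axis < 3);
    assert(pos >= parent.min[axis] && pos <= parent.max[axis]);
    left = parent;
    right = parent;
    left.max[axis] = pos;
    right.min[axis] = pos;
}

// A primitive spanning [lo, hi] on the split axis is referenced by every
// child it touches. Primitives that straddle (or just touch) the plane are
// deliberately listed in both children.
bool goesLeft(double lo, double pos)  { return lo <= pos; }
bool goesRight(double hi, double pos) { return hi >= pos; }
```

A sphere centered at 2.5 with radius 1 and a split at 2.0 spans [1.5, 3.5] and lands in both children, yet the two child boxes still meet only at the plane.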

My next hurdle was understanding how to traverse the tree correctly and fixing the remaining artifacts. Initially, while I was trying to learn how KD Trees work, I was only considering an algorithm where the ray always started outside the outermost bounding box and entered it. This was fundamentally flawed for several reasons. First off, the ground sphere made the bounding box really large, but it didn’t create too many artifacts for me. The second major instance of rays originating inside of the outermost bounding box was the rays used when calculating shadows, reflections, and transmissions. Through some digging, I found a very helpful post that illustrated the different cases that have to be handled when traversing a KD Tree.

His diagrams were very helpful. They were so implanted in my brain that I ended up adopting his algorithm completely from the code that he posted. He still missed two cases that I initially had some trouble finding. I ended up implementing a special, interactive debug feature that let me use the mouse to point at a pixel in the viewing window and render only that single view ray. I would use it by rendering the entire scene, setting a breakpoint, then clicking on the pixel that I needed to test. This was invaluable in finding the remaining artifacts as well as some ray tracing issues.
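For reference, the per-node decision in a KD Tree traversal can be sketched as below. This follows the widely used interval-based formulation (the ray is clipped to a `[tMin, tMax]` range per node) and is not the code from the post mentioned above, nor a list of the two missed cases. The names `classifySplit`, `Visit`, and `SplitVisit` are hypothetical. Rays that start inside the outermost box, such as shadow and reflection rays, are covered by starting with `tMin = 0`.

```cpp
#include <cassert>

enum class Visit { NearOnly, FarOnly, Both };

struct SplitVisit {
    Visit which;      // which children the ray segment can touch
    bool nearIsLeft;  // true if the child below the split plane is the near one
    double tSplit;    // ray parameter at the split plane (0 if parallel)
};

// o, d:  ray origin and direction components along the node's split axis
// split: position of the split plane on that axis
// [tMin, tMax]: portion of the ray that lies inside this node's box
SplitVisit classifySplit(double o, double d, double split,
                         double tMin, double tMax) {
    assert(tMin <= tMax);
    // The near child is the one containing the ray origin's side of the plane.
    bool nearIsLeft = (o < split) || (o == split && d <= 0.0);
    if (d == 0.0) {
        return {Visit::NearOnly, nearIsLeft, 0.0};  // parallel to the plane
    }
    double tSplit = (split - o) / d;
    if (tSplit <= 0.0 || tSplit > tMax) {
        // Plane is behind the ray, or beyond where the ray leaves the node.
        return {Visit::NearOnly, nearIsLeft, tSplit};
    }
    if (tSplit < tMin) {
        // Plane is crossed before the ray enters the node.
        return {Visit::FarOnly, nearIsLeft, tSplit};
    }
    // Visit near over [tMin, tSplit], then far over [tSplit, tMax].
    return {Visit::Both, nearIsLeft, tSplit};
}
```

A recursive traversal would call this at each interior node, descend into the near child first, and only continue into the far child if no hit was found inside the near interval.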

At this point, my KD Tree still proved to be more of a deceleration structure. I fired up a profiler and found a number of slowdowns related to mallocs when operating on C++ vectors. I reduced my use of vectors and passed them by reference throughout the KD Tree traversal. This brought significant gains, but rendering with the KD Tree was still slower than rendering without it.

The next step was to implement a smarter space partitioning scheme based on comparing the Surface Area Heuristic (SAH) cost of the subspaces produced by each candidate split. This cost calculation was also critical to determining when to make a leaf node. I had a few bugs that led to excessive node duplication. Once I sorted those out, I finally got the gains I was hoping for. My resulting tree for the more complicated scene was 11 nodes deep, and contained several empty leaf nodes.
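A minimal sketch of the usual SAH cost comparison, assuming the common formulation in which the chance of a ray hitting a child is approximated by the ratio of the child's surface area to the parent's. The function names and the traversal and intersection cost constants are assumptions, not values taken from the project.

```cpp
#include <cassert>

struct AABB { double min[3]; double max[3]; };

double surfaceArea(const AABB& b) {
    double dx = b.max[0] - b.min[0];
    double dy = b.max[1] - b.min[1];
    double dz = b.max[2] - b.min[2];
    return 2.0 * (dx * dy + dy * dz + dz * dx);
}

// Estimated cost of splitting `parent` at `pos` along `axis`, given how many
// primitives end up referenced by each side (straddlers count on both).
double sahSplitCost(const AABB& parent, int axis, double pos,
                    int nLeft, int nRight,
                    double cTraverse, double cIntersect) {
    assert(axis >= 0 && axis < 3);
    AABB left = parent, right = parent;
    left.max[axis] = pos;
    right.min[axis] = pos;
    double invArea = 1.0 / surfaceArea(parent);
    double pLeft = surfaceArea(left) * invArea;    // approx. hit probability
    double pRight = surfaceArea(right) * invArea;
    return cTraverse + cIntersect * (pLeft * nLeft + pRight * nRight);
}

// Cost of not splitting: every primitive in the node is tested.
double leafCost(int n, double cIntersect) { return cIntersect * n; }
```

For a 2×1×1 box holding 4 primitives, a middle split that separates them 2 and 2 costs 3.4 (with both constants set to 1) against a leaf cost of 4, so the split is worthwhile. If all 4 straddle the plane and get duplicated into both sides, the split costs 5.8 and the node should become a leaf, which is how this comparison also serves as the termination criterion.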

16 Sphere Scene Without a KD Tree

Size: 500x500
Total Primitive Intersection Checks: 223084940
Total Node Traversals: 0
Total Render: 29.763917 seconds

16 Sphere Scene With KD Tree

Size: 500x500
Build KD Tree: 0.000580 seconds
Total Primitive Intersection Checks: 55731412
Total Node Traversals: 161224507
Total Render: 24.878577 seconds

I reduced the number of sphere intersection checks from 223m to 56m, at the cost of adding 161m node traversals. The time savings weren’t as dramatic as I had hoped (about 30 seconds down to 25), likely because my traversal still used recursive function calls instead of maintaining a small explicit stack inside of a loop. For my final project, I’m likely going to go stackless altogether since I’ll be using the GPU too.

However, I really started to notice gains when I upped the complexity of the scene. I created a model of 140 reflective spheres arranged into a tightly-packed pyramid.

140 Sphere Scene Without KD Tree

Size: 500x500
Total Primitive Intersection Checks: 1016735174
Total Node Traversals: 0
Total Render: 103.413575 seconds

140 Sphere Scene With KD Tree

Size: 500x500
Build KD Tree: 0.091852 seconds
Total Primitive Intersection Checks: 230486448
Total Node Traversals: 161218665
Total Render: 38.800175 seconds

Sample Images