Tag Archives: raytracing

COMP 770 Program 3: Ray Tracing Part 2

Download Project Source/Scenes/Mac-Binaries: Program3.tar.gz


For the first part of the Ray Tracing project, I added a quite a few extra features. One of those extra features was the recursive calculations of Specular Reflection, Dielectric Reflection, and Dielectric Transmission. I considered myself pretty lucky, considering that this feature is one of the two features that we were required to add for this second part. However, I wasn’t quite in the clear. Lets just say that I had a looser understanding of ray tracing than I thought.

New Features

Many of the features of my Ray Tracer were implemented in the first part. That list can be seen on that project post. The following are new features:

KD Tree

  • Mid-Axis Partitioning
  • SAH Partitioning
  • Cost-Based Termination of leaf nodes for both
  • Recursive KD Tree Traversal
  • KD Tree Printing in Debug Builds


  • Fixed Ray Tracing bugs
  • Dramatically Improved Ray Tracing Performance
  • Interactive Ray Shooting in Debug Builds

Default Configuration

When you launch my Ray Tracer, the following are the defaults that are used unless you specify otherwise.

  • Dimensions: 500×500
  • Sampling: 4 x 4 Adaptive Jittered Supersampling
  • Ray Casting: Blinn-Phong Lighting with an Ambient Factor
  • Ray Tracing: Specular Reflections, Dielectric Reflections, and Dielectric Transmission supported through recursion terminated based on a contribution threshold
  • Multi-Processing: Uses multiple CPU cores through OpenMP
  • KD Tree: Built using SAH cost analysis to determine best split and when to terminate branches

Ray Tracing

The optics involved with refraction are not very intuitive to me. Initially, I thought that an image seen through a glass sphere would be reduced, but instead it’s actually magnified. I had a rather serious bug when calculating refraction. I was refracting my rays with a dot product of the ray and surface normal with the wrong sign. Correcting that made a tremendous difference.

Once I started rendering the scene with 16 spheres, I started to realize that I had some serious additive errors on calculations of the transmissive component. There were two reasons for this. Firstly, I was calculating the reflective and transmissive components inside of the loop that calculated the phong shading for each light source. Fixing this corrected several of the bright spots, and it led to a signficant speed improvement.

Secondly, I was calculating the phong shading and reflective components for for illumination points inside of a sphere. This scenario arose whenever I was calculating the color for a refracted ray transmitted through a sphere. The refracted ray would intersect the other side of the sphere on the inside, and at that point I should have only been calculating the transmissive component. Making this change also led to a dramatic speedup.

KD Tree

My KD Tree was actually a deceleration structure for much of the project. I had several issues when creating the tree, as well as the traversal. I started by creating a KD Tree that simply divided the space at the midpoint of the split axis.

The first issue that I had, was that I was creating a new bounding box around the primitives in each of the newly split subspaces. This is very problematic, because it created overlapping spaces whenever I had the occurrence of a single primitive shared between both spaces. Once drew a clear delineation between space partitioning and bounding volume hierarchies, I was able to clean up my KD Tree and I saw fewer artifacts.

My next hurdle was understanding how to traverse the tree correctly and to fix the remaining artifacts. Initially, while I was trying to learn how KD Trees work, I was only considering an algorithm where the ray always intersected the outermost bounding box. This was fundamentally flawed for several reasons. First off, the ground sphere made the bounding box really large, but it didn’t create to many artifacts for me. The second major instance of rays originating inside of the outermost bounding box where theway used when calculating shadows, reflections, and transmissions. Through some digging, I found a very helpful post that illustrated the different cases that have to be handled when traversing a KD Tree.

His diagrams were very helpful. They were so implanted in my brain that I ended up adopting his algorithm completely from the code that he posted. He still missed two cases, that I initially had some trouble finding. I ended up implementing a special, interactive debug feature that would allow me to use the mouse to point at a pixel in the viewing window and result in the rendering of that single view ray. I would use it by rendering the entire scene, setting a breakpoint, then clicking on the pixel that I needed to test. This was invaluable in finding the remaining artifacts as well as some ray tracing issues.

At this point, my KD Tree still proved to be more of a deceleration structure. I fired up a profiler, and found a number of slowdowns related to mallocs when operating on C++ vectors. I reduced my use of vectors and passed them by reference throughout the KD Tree traversal. This brought significant gains, but my KD Tree was still slower.

The next step was to implement a smarter space partitioning scheme based on comparing the Surface Area Heuristic of each new subspace. This cost calculation was also critical to determining when to make a leaf node. I had a few bugs that led to excessive node duplication. Once I sorted those out, I finally got the gains I was hoping for. My resulting tree for the more complicated scene was 11 nodes deep, and contained several empty leaf nodes.

16 Sphere Scene Without a KD Tree

Size: 500x500
Total Primitive Intersection Checks: 223084940
Total Node Traversals: 0
Total Render: 29.763917 seconds

16 Sphere Scene With KD Tree

Size: 500x500
Build KD Tree: 0.000580 seconds
Total Primitive Intersection Checks: 55731412
Total Node Traversals: 161224507
Total Render: 24.878577 seconds

I reduced the number of sphere intersections from 225m to 161m while only adding 55m node traversals. The time savings wasn’t as dramatic as I had hoped, likely because my algorithm still used functional recursion instead of maintaining a smaller stack inside of a loop. For my final project, I’m likely going to go stackless altogether since I’ll be using the GPU too.

However, I really started to notice gains when I upped the complexity of the scene. I created a model of 140 reflective spheres arranged into a tightly-packed pyramid.

140 Sphere Scene Without KD Tree

Size: 500x500
Total Primitive Intersection Checks: 1016735174
Total Node Traversals: 0
Total Render: 103.413575 seconds

140 Sphere Scene With KD Tree

Size: 500x500
Build KD Tree: 0.091852 seconds
Total Primitive Intersection Checks: 230486448
Total Node Traversals: 161218665
Total Render: 38.800175 seconds

Sample Images

COMP 770 Program 2: Ray Tracing Part 1

Download Project Source/Scenes/Mac-Binaries: Program2.tar.gz


I’ve read about ray tracing a few times in the past, but this assignment gave me a dramatically new perspective on the topic. Two things really struck me about ray tracing. First, what I understood to be ray tracing was actually just ray casting. I didn’t know this while I was implementing the diffuse shaders (pure Lambertian, Blinn-Phong, and Blinn-Phong with Ambience), and so I was rather impressed with the results. However, as soon as I implemented specular reflection via recursion, I started to realize that ray tracing is indeed a much more significant step over ray casting in terms of realism.

My second dramatic realization was just how expensive ray tracing is. Every feature that I added would drive my render times up. And this was compounded by framebuffer resolution, sampling grid size, number of lights, and number of scene primitives. I found myself switching between implementing new features and then going back and implementing various optimizations just to make the render times tolerable.


For part one of this ray tracer program, I used the COMP 575 assignment as a guideline on features to add beyond the minimum in the COMP 770 assignment. I kept going, adding feature after feature, unaware of whether or not these “extra features” would actually be required for the second part of the assignment or not.


  • Resizable View Rect Dimensions
  • Fully Configurable Camera (position, rotate, FOV)
  • Multiple, Colored Light Support
  • Configurable Background Color
  • Output to PNG
  • Supports both Ray Casting and Ray Tracing
  • Specular Reflection
  • Dielectric Reflection and Transmission w/ Refraction


  • Lambertian
  • Blinn-Phong
  • Blinn-Phong with Ambient

Super Sampling

  • Configurable Sample Count for Regular, Jittered, and Adaptive
  • Basic
  • Regular
  • Jittered


  • Multi-Processing Support with OpenMP
  • Simplified Scene Intersection Calculations with Normalized Direction Vectors
  • Tracks recursive contribution of color calculations for early recursion termination
  • Adaptive [Jittered] Sampling


  • Timing Instrumentation
  • Can build without OpenGL, OpenMP, and libpng for benchmarking on embedded systems

Building and Usage

I’ve provided a Makefile with the following targets. It has been tested on Mac OSX, Linux (Ubuntu) and Android for ARM.

NOTICE: By default, I build on a system with OpenGL, OpenMP (libgomp), and libpng. If you don’t have those on your system, then use the NO_GL=1, NO_OMP=1, and NO_PNG=1 settings when running make.

Build Targets/Options

  • make – Builds release.
  • make debug – Builds debug.
  • make clean – Cleans src directory and removes objdir directories.
  • make NO_OMP=1 – Won’t attempt to compile using OpenMP. Handy if the system doesn’t have support. Can be used with other NO_*.
  • make NO_PNG=1 – Won’t attempt to compile using libpng. Handy if the system doesn’t have support. Can be used with other NO_*.
  • make NO_GL=1 – Won’t attempt to compile using OpenGL. Handy if the system doesn’t have support. Can be used with other NO_*.


Usage: raytracer [-shader ] [-sampling ] [-samples ] [-background <0xRRGGBBAA>] [-window  ] [-timing] [-noparallel] [-norecursion] [-nodisplay] [-output ] 

       Sets the shader used. Each one builds upon the previous shader. Default = reflective
       Chooses which sampling method to use. Default = adaptive
       Specifies n x n grid of samples to collect. Ignored for basic sampling. Default = 5
  -background <0xRRGGBBAA>
       Sets the background color. Default = 0x000000ff
       Sets the window size to the specified width and height. Default = 500 x 500
       This switch turns on timing output on the console. Default = off
       This switch turns off multiprocessing. Default = on
       This switch turns off recursive ray tracing resulting in simple ray casting. Default = on
       This switch turns the output to the display. Default = on
       This switch causes the ray tracer to output a png image of the renderer scene.

Image Results


Once I started working on support for dielectrics, I wanted to create a reasonably familiar scene so that I could interpret the results better. I placed a large, mostly transparent, smoke-gray sphere in front of the camera. The ground sphere is still somewhat reflective, but the two colored spheres in the background are non-reflective. When viewing this scene, it’s best to use a white background (-background 0xffffffff) in order to see the distortion at the perimeter of the sphere.

	<!-- camera at (0,2,-12) pointed towards the origin -->
	<camera x="0.0" y="2.0" z="-8.0" fov="90.0"
			lookAtX="0.0" lookAtY="2.0" lookAtZ="0.0"
			upX="0.0" upY="1.0" upZ="0.0"/>
	<!-- smoked sphere -->
	<sphere radius="2.0" x="0.0" y="1.75" z="-3.0">
		<color r="0.3" g="0.3" b="0.3" a="0.3"/>
		<material reflectance="0.0" refraction="1.5" phongExponent="0.0"/>
	<!-- blue sphere -->
	<sphere radius="1.25" x="-4.0" y="2.0" z="2.0">
		<color r="0.0" g="0.0" b="1.0" a="1.0"/>
		<material reflectance="0.0" refraction="1.0" phongExponent="16.0"/>
	<!-- green sphere -->
	<sphere radius="1.25" x="4.0" y="2.0" z="2.0">
		<color r="0.0" g="1.0" b="0.0" a="1.0"/>
		<material reflectance="0.0" refraction="1.0" phongExponent="16.0"/>
	<!-- white overhead light -->
	<light x="0.0" y="5.0" z="0.0" ambient="0.25">
		<color r="1.0" g="1.0" b="1.0" a="1.0"/>
	<!-- "ground" sphere -->
	<sphere radius="50.0" x="0.0" y="-50.0" z="0.0">
		<color r="0.75" g="0.75" b="0.75" a="1.0"/>
		<material reflectance="0.3" refraction="1.5" phongExponent="32.0"/>


Of the various optimizations that I implemented, none provided the immediate results that the parallelism through OpenMP. I was definitely embarrassed that I didn’t think of it before the COMP 575 professor mentioned it. I fully anticipated that I would have to refactor my code to be multi-threaded. I was pleasantly surprised to find OpenMP. I had used a similar compiler extension on some Cell Processor development years ago, but OpenMP is much further along in terms of ease-of-use and compiler support. I was so thrilled when I discovered it, that I blogged about it here. With one line in my Makefile and two lines of code, I gained a nearly 75% increase.

Development Challenges

  • Viewing Rect
  • Floating Point Error When Intersecting Light Ray
  • Transparency + Ray Casting = Does Not Compute
  • Floating Point Error Part Deux

Overall, development went really smooth on this project. I was definitely making a lot of hand-gestures while trying to visualize where my cross products would be aiming as I was trying to generate the viewing rect. I didn’t know how to correlate the up vector of the camera with the vector from the camera position to the lookAt point, especially when they weren’t perpendicular. Finally I decided to use the up vector and assume that the camera was looking straight forward, along it’s z axis. Without this assumption, I felt like I would have to be dealing with an oblique projection, which I wasn’t ready for.

I struggled a little when I was trying to calculate intersections with the scene for the light vector from the visible point back to the light source. Initially I tried to throw out intersections with the primitive that the visible point was on. Of course, that didn’t yield results, so I finally settled on throwing out all intersections that were closer to the visible point than a particular threshold, lambda. The next lecture, the same strategy was mentioned as a solution to that problem.

The next significant problem that I phased was how to deal with transparency. Again, I was unlucky enough to be a little early to implement this feature. Two lectures later, we discussed ray tracing vs. ray casting. Recursive ray tracing, makes reflection and transmission with refraction nearly trivial. For a while, I was a little confused between specular reflection and dielectric reflection, but I finally differentiated the two and accepted the fact that an object can be a dielectric and also have specular reflective material properties. The last part of ray tracing that was really challenging, was the calculation of the “a” constant when determining the filtering of light when it is transmitted through a dielectric. The textbook described how the Beer-Lambert Law determined how much light is transmitted, but they said that a constant for each color channel is chosen and that the natural logs from the formula are rolled into that. Furthermore, they mentioned that developer’s often tune this parameter by eye. I settled on a calculation for “a” that took each color channel of the intersected primitive and multiplied it with (1 – alpha) for that color. Visually, I found the results to be satisfactory.

The last hurdle that I faced was again related to detecting intersections that are too close to the visible point. This time, the rays in question were the transmitted/refracted rays. I was still using the threshold from before to eliminate intersections that were too close to the ray origin. However, the threshold value that I was using was very small. I found that several of the refraction calculations involved a lot of floating point math errors that had built up through the multiple calculations and amplified by the recursion. I just relaxed the threshold and the noise was eliminated.