Monthly Archives: September 2010

COMP 770 Program 2: Ray Tracing Part 1

Download Project Source/Scenes/Mac-Binaries: Program2.tar.gz


I’ve read about ray tracing a few times in the past, but this assignment gave me a dramatically new perspective on the topic. Two things really struck me about ray tracing. First, what I understood to be ray tracing was actually just ray casting. I didn’t know this while I was implementing the diffuse shaders (pure Lambertian, Blinn-Phong, and Blinn-Phong with Ambience), and so I was rather impressed with the results. However, as soon as I implemented specular reflection via recursion, I started to realize that ray tracing is indeed a much more significant step over ray casting in terms of realism.

My second dramatic realization was just how expensive ray tracing is. Every feature that I added would drive my render times up. And this was compounded by framebuffer resolution, sampling grid size, number of lights, and number of scene primitives. I found myself switching between implementing new features and then going back and implementing various optimizations just to make the render times tolerable.


For part one of this ray tracer program, I used the COMP 575 assignment as a guideline on features to add beyond the minimum in the COMP 770 assignment. I kept going, adding feature after feature, unaware of whether or not these “extra features” would actually be required for the second part of the assignment or not.


  • Resizable View Rect Dimensions
  • Fully Configurable Camera (position, rotate, FOV)
  • Multiple, Colored Light Support
  • Configurable Background Color
  • Output to PNG
  • Supports both Ray Casting and Ray Tracing
  • Specular Reflection
  • Dielectric Reflection and Transmission w/ Refraction


  • Lambertian
  • Blinn-Phong
  • Blinn-Phong with Ambient

Super Sampling

  • Configurable Sample Count for Regular, Jittered, and Adaptive
  • Basic
  • Regular
  • Jittered


  • Multi-Processing Support with OpenMP
  • Simplified Scene Intersection Calculations with Normalized Direction Vectors
  • Tracks recursive contribution of color calculations for early recursion termination
  • Adaptive [Jittered] Sampling


  • Timing Instrumentation
  • Can build without OpenGL, OpenMP, and libpng for benchmarking on embedded systems

Building and Usage

I’ve provided a Makefile with the following targets. It has been tested on Mac OSX, Linux (Ubuntu) and Android for ARM.

NOTICE: By default, I build on a system with OpenGL, OpenMP (libgomp), and libpng. If you don’t have those on your system, then use the NO_GL=1, NO_OMP=1, and NO_PNG=1 settings when running make.

Build Targets/Options

  • make – Builds release.
  • make debug – Builds debug.
  • make clean – Cleans src directory and removes objdir directories.
  • make NO_OMP=1 – Won’t attempt to compile using OpenMP. Handy if the system doesn’t have support. Can be used with other NO_*.
  • make NO_PNG=1 – Won’t attempt to compile using libpng. Handy if the system doesn’t have support. Can be used with other NO_*.
  • make NO_GL=1 – Won’t attempt to compile using OpenGL. Handy if the system doesn’t have support. Can be used with other NO_*.


Usage: raytracer [-shader ] [-sampling ] [-samples ] [-background <0xRRGGBBAA>] [-window  ] [-timing] [-noparallel] [-norecursion] [-nodisplay] [-output ] 

       Sets the shader used. Each one builds upon the previous shader. Default = reflective
       Chooses which sampling method to use. Default = adaptive
       Specifies n x n grid of samples to collect. Ignored for basic sampling. Default = 5
  -background <0xRRGGBBAA>
       Sets the background color. Default = 0x000000ff
       Sets the window size to the specified width and height. Default = 500 x 500
       This switch turns on timing output on the console. Default = off
       This switch turns off multiprocessing. Default = on
       This switch turns off recursive ray tracing resulting in simple ray casting. Default = on
       This switch turns the output to the display. Default = on
       This switch causes the ray tracer to output a png image of the renderer scene.

Image Results


Once I started working on support for dielectrics, I wanted to create a reasonably familiar scene so that I could interpret the results better. I placed a large, mostly transparent, smoke-gray sphere in front of the camera. The ground sphere is still somewhat reflective, but the two colored spheres in the background are non-reflective. When viewing this scene, it’s best to use a white background (-background 0xffffffff) in order to see the distortion at the perimeter of the sphere.

	<!-- camera at (0,2,-12) pointed towards the origin -->
	<camera x="0.0" y="2.0" z="-8.0" fov="90.0"
			lookAtX="0.0" lookAtY="2.0" lookAtZ="0.0"
			upX="0.0" upY="1.0" upZ="0.0"/>
	<!-- smoked sphere -->
	<sphere radius="2.0" x="0.0" y="1.75" z="-3.0">
		<color r="0.3" g="0.3" b="0.3" a="0.3"/>
		<material reflectance="0.0" refraction="1.5" phongExponent="0.0"/>
	<!-- blue sphere -->
	<sphere radius="1.25" x="-4.0" y="2.0" z="2.0">
		<color r="0.0" g="0.0" b="1.0" a="1.0"/>
		<material reflectance="0.0" refraction="1.0" phongExponent="16.0"/>
	<!-- green sphere -->
	<sphere radius="1.25" x="4.0" y="2.0" z="2.0">
		<color r="0.0" g="1.0" b="0.0" a="1.0"/>
		<material reflectance="0.0" refraction="1.0" phongExponent="16.0"/>
	<!-- white overhead light -->
	<light x="0.0" y="5.0" z="0.0" ambient="0.25">
		<color r="1.0" g="1.0" b="1.0" a="1.0"/>
	<!-- "ground" sphere -->
	<sphere radius="50.0" x="0.0" y="-50.0" z="0.0">
		<color r="0.75" g="0.75" b="0.75" a="1.0"/>
		<material reflectance="0.3" refraction="1.5" phongExponent="32.0"/>


Of the various optimizations that I implemented, none provided the immediate results that the parallelism through OpenMP. I was definitely embarrassed that I didn’t think of it before the COMP 575 professor mentioned it. I fully anticipated that I would have to refactor my code to be multi-threaded. I was pleasantly surprised to find OpenMP. I had used a similar compiler extension on some Cell Processor development years ago, but OpenMP is much further along in terms of ease-of-use and compiler support. I was so thrilled when I discovered it, that I blogged about it here. With one line in my Makefile and two lines of code, I gained a nearly 75% increase.

Development Challenges

  • Viewing Rect
  • Floating Point Error When Intersecting Light Ray
  • Transparency + Ray Casting = Does Not Compute
  • Floating Point Error Part Deux

Overall, development went really smooth on this project. I was definitely making a lot of hand-gestures while trying to visualize where my cross products would be aiming as I was trying to generate the viewing rect. I didn’t know how to correlate the up vector of the camera with the vector from the camera position to the lookAt point, especially when they weren’t perpendicular. Finally I decided to use the up vector and assume that the camera was looking straight forward, along it’s z axis. Without this assumption, I felt like I would have to be dealing with an oblique projection, which I wasn’t ready for.

I struggled a little when I was trying to calculate intersections with the scene for the light vector from the visible point back to the light source. Initially I tried to throw out intersections with the primitive that the visible point was on. Of course, that didn’t yield results, so I finally settled on throwing out all intersections that were closer to the visible point than a particular threshold, lambda. The next lecture, the same strategy was mentioned as a solution to that problem.

The next significant problem that I phased was how to deal with transparency. Again, I was unlucky enough to be a little early to implement this feature. Two lectures later, we discussed ray tracing vs. ray casting. Recursive ray tracing, makes reflection and transmission with refraction nearly trivial. For a while, I was a little confused between specular reflection and dielectric reflection, but I finally differentiated the two and accepted the fact that an object can be a dielectric and also have specular reflective material properties. The last part of ray tracing that was really challenging, was the calculation of the “a” constant when determining the filtering of light when it is transmitted through a dielectric. The textbook described how the Beer-Lambert Law determined how much light is transmitted, but they said that a constant for each color channel is chosen and that the natural logs from the formula are rolled into that. Furthermore, they mentioned that developer’s often tune this parameter by eye. I settled on a calculation for “a” that took each color channel of the intersected primitive and multiplied it with (1 – alpha) for that color. Visually, I found the results to be satisfactory.

The last hurdle that I faced was again related to detecting intersections that are too close to the visible point. This time, the rays in question were the transmitted/refracted rays. I was still using the threshold from before to eliminate intersections that were too close to the ray origin. However, the threshold value that I was using was very small. I found that several of the refraction calculations involved a lot of floating point math errors that had built up through the multiple calculations and amplified by the recursion. I just relaxed the threshold and the noise was eliminated.

Online Shopping Costs Carbon

Like many people, I find myself doing more of my shopping on Amazon and other e-tailers. I just can’t justify the brick-and-mortar tax at my local big-box retailers. Besides, who has time to go shopping anyhow?

Am I just being lazy? More importantly, what’s the environmental impact of my laziness?

Extra packaging

When goods are shipped to big-box retailers, their retail packages are boxed up on pallets. Hopefully they box them up better than this example. They are generally handled with care, so this seemingly crude boxing isn’t usually a problem. However, when that same retail packages is shipped to you through an e-tailer, it’s repackaged into a larger box filled with packing material. Hopefully you have a an option to recycle that corrugated cardboard and packing material, but regardless, it’s still rather wasteful.

An example of a pallet packed with electronics

Lots of Extra Fuel

These pallets are loaded onto trucks, freight trains, and cargo ships with finite storage volume. Therefore, it’s advantageous that the pallets are packed densely. They’re transported to your local big-box, at which point you typically make small trips to pick up your favorite items.

In the online space, e-tailers receive these same pallets at their distributions centers. There are far few distribution centers, so for the final leg on board semi-trucks, this saves on fuel. However, that’s where it ends. The e-tailer must open the pallets and put the individual retail packages into their bigger boxes and those big boxes are picked up by your favorite carrier services.

Each package is bundled tightly into something similar to a pallet, but at this point they are taking up way more space. More space means more trips by the carrier. Furthermore, a lot of these trips have long portions by air since we just can’t seem to wait a few extra days.

Study Confirms My Intuition

I’m really starting to thing that it’s more efficient for you to shop at a big-box. And it looks like I’m not the only one. This study seems to confirm my intuition.

Shouldn’t Shipping And Handling Cost More

Unfortunately, shipping and handling is way to cheap. Especially with services like Amazon Prime. These carriers are awfully efficient, but in a sense they’re being subsidized in the form of cheap fuel. I’m not saying that they pay less for fuel (although they likely do because they buy it in bulk). I’m actually saying that fuel in America is way to cheap for everyone. If it was more appropriately priced to account for the environmental damage of fossil fuels, then clearly e-shopping would be more expensive and we would all get our lazy selves to our favorite big-box.

Here's my favorite.

Finding OpenMP – Best Day EVAR, This Week

So I’ve been working on this Ray Tracer for class. It’s rather primitive, but it’s starting to slow to a crawl as I add features (especially multi-sampling). I’m still only rendering three spheres, but a full screen render at 1920×1066 was taking 13+ seconds. The professor gave us a few tips today on speeding up the algorithms involved. I felt pretty dumb for not thinking of the most obvious recommendation that he made. And wouldn’t you know it, it turned out stunning results.

Only spheres so far, but check out that Blinn-Phong w/ Ambient Shading

OpenMP – Multi-Processing Made Real Easy

Of course, Ray Tracing is ridiculously parallel. You have to do the same calculations on every ray. That’s at least one ray per pixel in your viewing plane. Ten minutes of searching the internet for multi-processing methodologies in C/C++ delivered me to the OpenMP Wikipedia Page. It’s a very friendly introduction. I’ve used something lightly similar years ago for IBM’s Cell processor, but it wasn’t nearly as easy as OpenMP.

I doubted that my gcc compiler on my MacBook Pro supported OpenMP, but digging through the OpenMP compiler page revealed that there has been OpenMP 2.5 support since gcc 4.2 and I’m running gcc 4.2.1.

There’s a note on the same OpenMP compiler page that says “Compile with -fopenmp”. At first I thought that I’d have to rebuild gcc with that flag. Dummy. All I need to do was to use that feature flag when compiling with my version of gcc.

So here it is. My three lines of code.

Adding -fopenmp to Makefile


Including omp.h in RayTracer.cpp


Putting a #pragma above my outermost for loop

#pragma omp parallel for



  • Size: 1920×1066 – Render time: 13.178630 seconds


  • Size: 1920×1066 – Render time: 7.544398 seconds

Professionally, I work on a performance and optimization team in the embedded space. Usually when I see such gains with just a handful of lines of code, it’s because I finally remembered that I was building DEBUG.

COMP 770 Program 1: Image Processing

Download Entire Project: Image_cpp.tar.gz
Download image.cpp only: image.cpp.tar.gz
Download extra sample image results: lena.tar.gz


This assignment was a blast. It basically required us to implement just about every worthwhile image processing filter out there. Here’s a quick list of them:

  • Brighten
  • Add Random Noise
  • Crop
  • Extract Channel
  • Adjust Contrast
  • Sharpen
  • Blur
  • Quantize
  • Dither (Random, Ordered, Floyd-Steinberg)
  • Scale
  • Rotate
  • Fun (Swirl)
  • Nonphotorealism (Toon/Comic Shading)

A skeleton of the project was provided with helpful WBMP code as well as classes for Image and Pixel. The Image::Image (const Image& src) was busted, but other than that everything worked perfectly. For reference, here’s my modifications to the copy contructor:

Image::Image (const Image& src)
  bmpImg = new BMP();
  width           = src.width;
  height          = src.height;
  num_pixels      = width * height;
  sampling_method = src.sampling_method;
  if (!bmpImg->SetSize(width, height)){
    printf("Error allocating image.");
  pixels = new Pixel[num_pixels];
  memcpy(pixels, src.pixels, num_pixels * sizeof(Pixel));

  assert(pixels != NULL);

Getting The Code

The various image filters were implemented as methods of the Image class and can be viewed in the image.cpp file. Further details of the implementation are explained in the comments.

Download Entire Project: Image_cpp.tar.gz
Download image.cpp only: image.cpp.tar.gz


The project should build on any system with c++ compiler, stdlibc++ library, and make. I did my development on Linux and Mac OSX. I don’t anticipate any problems building in windows, but please let me know if it does.

Untar the project it in a convenient directory and run make to build:

$ tar -xzf Image_cpp.tar.gz
$ cd Image_cpp
$ make

The debug and clean targets work as well:

$ make debug
$ make clean

Run the program with no parameters to get usage:

$ objdir-release/image
Usage: image -input  [-option [arg ...] ...] -output 

Sample Images
The following are screen shots of the results of the various image processing filters. The command lines that were used are displayed as captions beneath the images. You can read more details about how they work after the gallery.

Noise, Brightness, Contrast, Saturation
This group of image filters all use interpolation at their core. Specifically:

  • Noise – Interpolation with an image of random noise
  • Brightness – Interpolation with a black image
  • Contrast – Interpolation with constant gray image with luminance equal to the average luminance of the source image
  • Saturation – Interpolation with a grayscale version of the source image


Cropping is a simple blit of the specified pixel region no a new image with the region’s dimensions.

Extract Channel

The specified channel was extracted by setting the other channels to zero for each pixel in the source image.

Quantize, Random Dither, Ordered Dither, Floyd-Steinberg Dither

This groups of image filters reduces the bits per channel to the number specified by the input parameter. The Quantize algoritm is the simplest, as it just reduces each channel of each pixel. This produces dramatic Intensity Quantization errors.

Random Dithering breaks up the color boundaries in the image by adding noise to the source image first. It isn’t simply added though; instead the source image is interpolated with a random noise image. When the resulting image is Quantized the color boundaries look dramatically better, but the introduced noise negatively affects the image quality.

Ordered Dithering improves the color boundaries by moving the rounding errors when choosing one pixel over the next. Instead of simply rounding, the most significant of the dropped bits is compared to a value in a bayer matrix chosen based on the pixel’s coordinates. The pixel’s color is rounded up or down if the error is greater or lesser than that error. More details can be found in this Wikipedia article. This produces visual quality much higher than random dithering, but a noticeable repeating pattern emerges in the image based on the size of the matrix used.

Floyd-Steinberg Dithering deals with the rounding errors differently. Portions of the error are distributed, through additions/subtractions, to the pixels around the pixel being rounded. The resulting images shows dithering quality better than ordered dithering without any repeating patterns.

Scale, Rotate, and Fun (Swirl)

These image processing filters use transformation to move the pixels from the source image to new coordinates in the destination image. Instead of using calculations that transform the source pixels to the destination pixels, the reverse calculations are used. This is particularly important when scaling to a larger dimension, for instance.

The calculations in these reverse transformations often result in real numbers instead of discrete coordinates in the source image. Therefore the image needs to be resampled in order to approximate what the value should be at the given real coordinates. Point Sampling finds the nearest pixel and uses that value. It produces heavily aliasing in the destination image. Bilinear Sampling interpolates between the four nearest pixels based on the the real coordinates distance from those pixels. Gaussian Sampling uses a gaussian filter, producing a mild blurring effect that can be controlled with the right radius.

Blur/ Sharpen

Blur uses a Gaussian filter. I implemented it using a one-dimensional kernel, taking advantage of the separability of the Gaussian filter. The size of the kernel is passed as the parameter on the command line. The size is radius * 2 of the Gaussian curve used to generate the real values in the kernel. I calculated the optimal standard deviation for the equation so that the blurring effect would grow as the radius grew.

Sharpening removes the blurriness of a source image by extrapolating the source image from a blurred version of the same image.

Edge Detect

Edge Detect works similar to the Gaussian filter, but it uses a specific edge detection kernel. I implemented it with a two-dimensional kernel because I wasn’t sure if it was separable and it wasn’t too costly to implement due to the small size of the kernel (3×3).


Perhaps inspired by all of those “Comic book” yourself applications out there, I attempted to create a “Toon” filter. I built upon the image filters created already and added an alpha blender. I started the process by creating the “color layer”. First the image was saturated and brightened a little. Then it was quantized.

The “ink layer” is built by edge detecting the color layer and then desaturating completely. Then each pixel of the ink layer is analyzed. If its near white, then it is made completely white (for debugging) and completely transparent. The other pixels make up the ink, and so they are made completely black and opaque.

Finally the ink layer is composited over the color layer and the results is blurred a little for effect.