Examples for parallelism: ray tracer on the GPU

I tested the parallel ray tracer on every NVIDIA card I have bought. This page contains the results.

Chip    Resolution  Thread block  FPS (internal timer)  FPS (Fraps)
285     640x480     16x16         254.1                 --
285     1920x1028   16x16         57.3                  --
580     640x480     16x16         519                   --
580     1920x1148   16x16         98                    --
680     640x480     16x16         690                   --
680     1920x1028   16x16         118.4                 --
780 OC  640x480     16x8          795                   650
780 OC  1920x1080   16x8          155.8                 150
780 OC  2560x1440   16x8          89.6                  85
780 OC  3840x2160   16x8          26                    25
970     640x480     16x8          --                    680
970     1920x1080   16x8          --                    150
970     2560x1440   16x8          --                    85
970     3840x2160   16x8          --                    40
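
To make the relation between resolution and thread block size more concrete, here is a minimal C++ sketch of how many thread blocks are needed to cover an image. It assumes one thread per pixel, which this page does not state, and the names blocks_needed, Grid and grid_for are hypothetical helpers, not taken from the ray tracer itself.

```cpp
#include <cassert>

// Number of blocks of `block` threads needed to cover `pixels` pixels
// along one axis (integer ceiling division).
unsigned blocks_needed(unsigned pixels, unsigned block) {
    assert(block > 0);
    return (pixels + block - 1) / block;
}

// Grid dimensions in blocks. Hypothetical helper type.
struct Grid {
    unsigned x;
    unsigned y;
};

// Assumption: one thread per pixel, so the grid must cover the whole image.
Grid grid_for(unsigned width, unsigned height,
              unsigned block_x, unsigned block_y) {
    return { blocks_needed(width, block_x), blocks_needed(height, block_y) };
}
```

Under that assumption, grid_for(1920, 1080, 16, 8) gives 120 x 135 blocks. For 1920x1028 with 16x16 blocks the result is 120 x 65, and the last row of blocks is only partly filled, because 1028 = 64 * 16 + 4.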

Remark: I measured the FPS with two different tools:

  1. With the timer from the NVIDIA example code that ships with the CUDA distribution.
  2. With the program Fraps.

The two tools always gave different results, because Fraps measures the whole time from one image to the next, while the NVIDIA timer measures only the CUDA part and leaves out the OpenGL part. The FPS value measured by Fraps is therefore lower than the value measured by the NVIDIA timer.
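
If the explanation above is taken at face value, the difference between the two frame times gives a rough estimate of the time spent per frame outside the CUDA part. The following C++ sketch does that arithmetic. The function names are hypothetical, and the result is only an estimate, since the Fraps values in the table appear to be rounded.

```cpp
#include <cassert>

// Time per frame in milliseconds for a given frame rate.
double frame_time_ms(double fps) {
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// Estimated time per frame spent outside the CUDA part, assuming the
// Fraps value covers the whole frame and the internal timer covers
// only the CUDA part.
double non_cuda_overhead_ms(double fps_internal, double fps_fraps) {
    return frame_time_ms(fps_fraps) - frame_time_ms(fps_internal);
}
```

For the 780 OC at 640x480, non_cuda_overhead_ms(795.0, 650.0) comes to about 0.28 ms per frame.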

285 - Tesla - compute capability (CC) 1.3 (2010)

580 - Fermi - CC 2.0 (2011)

680 - Kepler - CC 3.0 (2012)

780 - Kepler - CC 3.5 (2013)

970 - Maxwell - CC 5.2 (2014)

 "Examples for parallelism: ray tracer" "Gute Einführung in C++ AMP"