Performance

The SPMD programming model that ispc provides makes it easy to harness the computational power available in SIMD vector units on modern CPUs, while its basis in C makes it easy for programmers to adopt and use productively. This page summarizes the performance of ispc on the workloads in the examples/ directory of the ispc distribution.

These results were measured on an Apple iMac with a 4-core 3.4GHz Intel® Core-i7 processor using the Intel® AVX instruction set. The basis for comparison is a reference C++ implementation compiled with gcc 4.2.1, the version distributed with OS X 10.7.2. (The reference implementation is also included in the examples/ directory.)

Performance of ispc on a variety of the workloads from the examples/ directory of the ispc distribution, compared to a reference C++ implementation compiled with gcc 4.2.1.
Workload                                ispc, 1 core    ispc, 4 cores
AOBench (512 x 512 resolution)          6.19x           28.06x
Binomial Options (128k options)         7.94x           33.43x
Black-Scholes Options (128k options)    8.45x           32.48x
Deferred Shading (1280p)                5.02x           23.06x
Mandelbrot Set                          6.21x           20.28x
Perlin Noise Function                   5.37x           n/a
Ray Tracer (Sponza dataset)             4.31x           20.29x
3D Stencil                              4.05x           15.53x
Volume Rendering                        3.60x           17.53x
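
The two columns above can be related with a little arithmetic: dividing the 4-core speedup by the 1-core speedup gives the extra gain that comes from using more cores, on top of the single-core SIMD gain. The following is only a minimal sketch of that calculation. The name `core_scaling` and the `SPEEDUPS` dictionary are hypothetical and not part of ispc; the numbers are copied from the table, and Perlin Noise is left out because it has no 4-core result.

```python
# Speedups copied from the table above: (ispc on 1 core, ispc on 4 cores),
# both relative to the serial C++ reference compiled with gcc 4.2.1.
# Perlin Noise is omitted because no 4-core figure is reported for it.
SPEEDUPS = {
    "AOBench": (6.19, 28.06),
    "Binomial Options": (7.94, 33.43),
    "Black-Scholes Options": (8.45, 32.48),
    "Deferred Shading": (5.02, 23.06),
    "Mandelbrot Set": (6.21, 20.28),
    "Ray Tracer": (4.31, 20.29),
    "3D Stencil": (4.05, 15.53),
    "Volume Rendering": (3.60, 17.53),
}


def core_scaling(one_core: float, four_cores: float) -> float:
    """Hypothetical helper: ratio of the 4-core speedup to the 1-core speedup."""
    return four_cores / one_core


if __name__ == "__main__":
    # Print one line per workload with the multi-core scaling factor.
    for name, (one, four) in SPEEDUPS.items():
        print(f"{name:<24}{core_scaling(one, four):5.2f}x")
```

A ratio close to the core count suggests that the workload scales well across cores. The ratio by itself does not explain why a given workload scales better or worse than another.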

The following table shows speedups for a number of the examples on a 2.40GHz, 40-core Intel® Xeon E7-8870 system with the Intel® SSE4 instruction set, running Microsoft Windows Server 2008 Enterprise. Here, the serial C/C++ baseline code was compiled with MSVC 2010.

Performance of ispc on a variety of the workloads from the examples/ directory of the ispc distribution, on a system with 40 CPU cores.
Workload                              ispc, 40 cores
AOBench (2048 x 2048 resolution)      182.36x
Binomial Options (2m options)         63.85x
Black-Scholes Options (2m options)    83.97x
Ray Tracer (Sponza dataset)           195.67x
Volume Rendering                      243.18x