An Evaluation of Current SIMD Programming Models for C++
Published at WPMVP '16
SIMD extensions were added to microprocessors in the mid '90s to speed-up data-parallel code by vectorization. Unfortunately, the SIMD programming model has barely evolved and the most efficient utilization is still obtained with elaborate intrinsics coding. As a consequence, several approaches to write efficient and portable SIMD code have been proposed. In this work, we evaluate current programming models for the C++ language, which claim to simplify SIMD programming while maintaining high performance.
The proposals were assessed by implementing two kernels: one standard floating-point benchmark and one real-world integer-based application, both highly data parallel. Results show that the proposed solutions perform well for the floating point kernel, achieving close to the maximum possible speed-up. For the real-world application, the programming models exhibit significant performance gaps due to data type issues, missing template support and other problems discussed in this paper.
Link to full paper