Optimizing compiler. Auto parallelization
C/C++ extended array notation
The C/C++ language extension for array notations is an Intel-specific extension that is part of the Intel® Cilk™ Plus feature set supported by the Intel® compiler.
The C/C++ extension provides data parallel array notations with the following major benefits:
- Allows you to use array notation to program parallel operations in a familiar language
- Achieves predictable performance based on mapping parallel constructs to the underlying multi-threaded and SIMD hardware
- Enables compiler parallelization and vectorization with less reliance on alias and dependence analysis
When you use the array notations, the Intel® compiler implements them using vector code.
Usage Recommendations
Use the array notations when your algorithm requires operations on arrays and where it does not require a specific order of operations among the elements of the array(s).
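To make the recommendation above concrete, here is a minimal sketch in plain C99 (no array notation, so it builds with any C compiler). The function names scale_add and prefix_sum are hypothetical and exist only for this illustration. The first loop is the kind of operation that array notation is meant for; the second carries a dependence from one element to the next, so it needs a fixed order.

```c
#include <assert.h>
#include <stddef.h>

/* Order-independent: each c[i] depends only on a[i] and b[i], so the
   iterations may run in any order or in SIMD lanes. In array notation
   this would be written as: c[0:n] = a[0:n] + k * b[0:n]; */
void scale_add(size_t n, double k, const double *a, const double *b, double *c)
{
    assert(a && b && c);
    for (size_t i = 0; i < n; i++)
        c[i] = a[i] + k * b[i];
}

/* Order-dependent: x[i] needs the already updated x[i-1], so the elements
   must be processed in increasing index order. */
void prefix_sum(size_t n, double *x)
{
    assert(x);
    for (size_t i = 1; i < n; i++)
        x[i] += x[i - 1];
}
```

A quick check: with a = {1, 2, 3, 4}, b = {1, 1, 1, 1} and k = 2, scale_add gives c = {3, 4, 5, 6} regardless of the order in which the elements are visited, while prefix_sum on a gives {1, 3, 6, 10} only when run front to back.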
To use the array notations in your application, keep the following sequence of steps in mind:
- Insert the array notation language extensions into your application source code.
- Compile the application at optimization level -O1 or above to enable vectorization. By default, the compiler generates SIMD vector instructions from the SSE2 instruction set. To generate SIMD vector instructions beyond SSE2, add target- or architecture-specific compiler options to the compile command.
By default, the Intel® compiler accepts the array notation language extensions and generates vector and multi-threaded code from the data-parallel constructs in the program.
CEAN (C/C++ Extensions for Array Notations Programming Model)
Declaration of the array sections
section_operator ::= [<lower bound>:<length>:<stride>]
a[0:3][0:4]   // rows 0..2, columns 0..3 (12 elements)
b[0:2:3]      // 2 elements with stride 3: b[0] and b[3]
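The section operator is easy to misread because its second field is a length, not an upper bound. The following plain C99 sketch spells out which elements a one-dimensional section selects. The helper gather_section is hypothetical (it is not part of Cilk Plus), and the sketch assumes a positive stride.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copies the elements selected by the section
   src[lower:length:stride] into dst[0..length-1] and returns length.
   Element i of the section is src[lower + i * stride]. */
size_t gather_section(const int *src, size_t lower, size_t length,
                      size_t stride, int *dst)
{
    assert(src && dst);
    assert(stride >= 1);   /* this sketch handles positive strides only */
    for (size_t i = 0; i < length; i++)
        dst[i] = src[lower + i * stride];
    return length;
}
```

For an array b holding {10, 11, 12, 13, 14, 15}, gather_section(b, 0, 2, 3, out) copies b[0] and b[3], matching the section b[0:2:3]. With stride 1 the call gather_section(b, 1, 3, 1, out) copies b[1], b[2] and b[3], which is what b[1:3] denotes.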
You must use the -std=c99 (Linux and Mac OS) or /Qstd=c99 (Windows) compiler option.
typedef int (*p2d)[128];
p2d p = (p2d) malloc(sizeof(int)*rows*128);
p[0:rows][:]
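In the declaration above, the second dimension can be written as [:] because its extent (128) is part of the type, while the first dimension needs the explicit 0:rows because p is only a pointer. As a rough illustration, the plain C99 sketch below shows the loops that a statement such as p[0:rows][:] = value stands for. The function name alloc_filled and the macro COLS are assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>

#define COLS 128

typedef int (*p2d)[COLS];   /* pointer to a row of COLS ints */

/* Allocates rows x COLS ints and sets every element to value.
   In array notation the fill would be: p[0:rows][:] = value;
   Returns NULL if the allocation fails; the caller frees the result. */
p2d alloc_filled(size_t rows, int value)
{
    assert(rows > 0);
    p2d p = malloc(sizeof(int) * rows * COLS);
    if (p == NULL)
        return NULL;
    for (size_t i = 0; i < rows; i++)        /* 0:rows */
        for (size_t j = 0; j < COLS; j++)    /* :  (whole row) */
            p[i][j] = value;
    return p;
}
```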
Most C/C++ operators are available for array sections.
a[:]*b[:]                     // element-wise multiplication
a[3:2][2:2] + b[5:2][5:2]     // matrix addition
a[0:4]+c                      // adds scalar to an array section
a[:][:] = b[:][1][:] + c      // array assignment
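Two of these expressions are worth expanding, because they mix sections with fixed indices and with different lower bounds. The sketch below is plain C99 and only approximates what the compiler would generate; the dimensions NR, NM, NC, the 8x8 array sizes and both function names are hypothetical choices for this illustration.

```c
#include <assert.h>

enum { NR = 2, NM = 3, NC = 4 };   /* hypothetical dimensions */

/* Plain-C equivalent of: a[:][:] = b[:][1][:] + c;
   The fixed index 1 selects one plane of b. The two sections of b line up
   with the two sections of a, and the scalar c is added to every element. */
void assign_plane_plus_scalar(int a[NR][NC], int b[NR][NM][NC], int c)
{
    assert(a && b);
    for (int i = 0; i < NR; i++)
        for (int j = 0; j < NC; j++)
            a[i][j] = b[i][1][j] + c;
}

/* Plain-C equivalent of: dst[0:2][0:2] = a[3:2][2:2] + b[5:2][5:2];
   The operands start at different offsets but have the same 2x2 shape,
   so they are combined position by position. */
void add_blocks(int a[8][8], int b[8][8], int dst[2][2])
{
    assert(a && b && dst);
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++)
            dst[i][j] = a[3 + i][2 + j] + b[5 + i][5 + j];
}
```

The point of the second function is that sections in one expression must agree in shape (here 2x2), not in position.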
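#include <stdio.h>
#include <stdlib.h>

#define N 2000

typedef double (*p2d)[];

/* Sets a to 1 and b to -1, then adds the product a*b to c. */
void matrix_mul(int n, double a[n][n], double b[n][n], double c[n][n])
{
    int i, j;
    a[:][:] = 1;
    b[:][:] = -1;
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            /* dot product of row i of a and column j of b */
            c[i][j] = c[i][j] + __sec_reduce_add(a[i][:] * b[:][j]);
    return;
}

int main()
{
    p2d a = (p2d)malloc(N*N*sizeof(double));
    p2d b = (p2d)malloc(N*N*sizeof(double));
    /* calloc: matrix_mul accumulates into c, so c must start at zero */
    p2d c = (p2d)calloc(N*N, sizeof(double));
    matrix_mul(N, a, a, a);   /* first call passes a for all three arguments */
    matrix_mul(N, a, b, c);
    free(a);
    free(b);
    free(c);
    return 0;
}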
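For readers without a Cilk Plus compiler, the matrix_mul routine can be approximated in plain C99 with explicit loops. This is a sketch under stated assumptions, not the code the Intel compiler generates: the names matrix_mul_plain and row_times_column are hypothetical, and the helper only stands in for the __sec_reduce_add call.

```c
#include <assert.h>

/* Plain-C stand-in for __sec_reduce_add(a[i][:] * b[:][j]):
   the dot product of row i of a and column j of b. */
static double row_times_column(int n, double a[n][n], double b[n][n],
                               int i, int j)
{
    double sum = 0.0;
    for (int k = 0; k < n; k++)
        sum += a[i][k] * b[k][j];
    return sum;
}

/* Same steps as matrix_mul, written with explicit loops. Unlike
   matrix_mul, c is cleared first so the result does not depend on its
   previous contents (a choice made for this sketch). */
void matrix_mul_plain(int n, double a[n][n], double b[n][n], double c[n][n])
{
    assert(n > 0);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            a[i][j] = 1.0;     /* a[:][:] = 1;  */
            b[i][j] = -1.0;    /* b[:][:] = -1; */
            c[i][j] = 0.0;
        }
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            c[i][j] += row_times_column(n, a, b, i, j);
}
```

With three distinct n x n arrays, every element of c ends up equal to -n, since each dot product sums n terms of 1 * (-1). That gives a simple check for either version of the routine.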