Published: 12.07.2012 | Access: free | Students: 355 / 24 | Rating: 4.00 / 4.20 | Duration: 11:07:00
Specialties: Programmer
Lecture 4:

Optimizing compiler. Loop optimizations


The fundamental dependency theorem:

Any transformation that preserves every dependence in a program (that is, does not change the relative order of dependent statements) produces an equivalent computation.

Accordingly, a transformation is valid for a given program if it preserves all of that program's dependences.
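As an illustration of the theorem, here is a minimal C sketch that applies loop reversal to two loops. The function names are hypothetical and chosen only for this example. In the first pair every iteration reads a value written by the previous iteration, so reversing the order breaks a dependence and changes the result. In the second pair the iterations are independent, so the reversed loop is equivalent.

```c
#include <assert.h>

/* Iteration i reads a[i], which iteration i-1 wrote: a loop-carried dependence. */
void chain_forward(int *a, int n, int b) {
    assert(n >= 1);
    for (int i = 0; i < n - 1; i++)
        a[i + 1] = a[i] + b;
}

/* Same statement, reversed order: each read now happens before the write
   it depended on, so the dependence is NOT preserved. */
void chain_reversed(int *a, int n, int b) {
    assert(n >= 1);
    for (int i = n - 2; i >= 0; i--)
        a[i + 1] = a[i] + b;
}

/* Each iteration touches only its own element: no dependence between iterations. */
void add_forward(int *a, int n, int b) {
    for (int i = 0; i < n; i++)
        a[i] = a[i] + b;
}

/* Reversal preserves all (zero) dependences, so the result is the same. */
void add_reversed(int *a, int n, int b) {
    for (int i = n - 1; i >= 0; i--)
        a[i] = a[i] + b;
}
```

Starting from {1, 0, 0, 0} with b = 1, chain_forward yields {1, 2, 3, 4} while chain_reversed yields {1, 2, 1, 1}; add_forward and add_reversed both yield {2, 1, 1, 1}.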

How to determine the dependences in the case of a single array?

Assume that we have a set of nested loops

DO i1 = 1, N1
  DO i2 = 1, N2
    ...
      DO in = 1, Nn
S1 A (f1 (i1, ..., in), ..., fm (i1, ..., in)) =
        A (g1 (i1, ..., in), ..., gm (i1, ..., in))
      END DO
    ...
  END DO
END DO 

A dependence exists if and only if there are iteration vectors I and J such that I < J and the system of equations

fk (I) = gk (J), k = 1, ..., m

has a solution.
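For a single loop and one-dimensional affine subscripts, the condition can be checked by brute force. The sketch below is only an illustration of the definition, not a method a compiler would use: the type affine and the function has_dependence are hypothetical names, and the loop is assumed to run from 1 to n.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper type: the affine subscript c1 * i + c0. */
typedef struct { int c1, c0; } affine;

static int eval(affine s, int i) { return s.c1 * i + s.c0; }

/* Returns true if some pair of iterations I < J in 1..n satisfies
   f(I) == g(J), where f is the subscript on the left-hand side
   and g is the subscript on the right-hand side. */
bool has_dependence(affine f, affine g, int n) {
    assert(n >= 0);
    for (int I = 1; I <= n; I++)
        for (int J = I + 1; J <= n; J++)
            if (eval(f, I) == eval(g, J))
                return true;
    return false;
}
```

With f(I) = I + 1 and g(J) = J the test succeeds (take J = I + 1). With f(I) = 2I and g(J) = 2J + 1 it fails, because an even subscript never equals an odd one.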

Example:

DO I = 1, N
  A (I +1) = A (I) + B
END DO
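The dependence equation here is I + 1 = I + x, where x is the distance between the iteration that writes an element and the iteration that reads it. Its solution is x = 1, so every iteration reads the element written by the previous one.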

Dependence analysis is a computationally hard task even when only a single array is involved.

An optimizing compiler uses a variety of methods to prove that a given permutation optimization is legal. When a loop contains many different arrays and pointers, this task can be very hard, so an estimate can be used instead of an exact calculation.
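The lecture does not say which estimates a compiler uses. One textbook example of such an estimate is the GCD test, sketched below under the assumption of a write to A(a*I + b) and a read of A(c*J + d); the function name gcd_may_depend is hypothetical. The test ignores the loop bounds and the order of I and J, so a "true" answer only means that a dependence cannot be ruled out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Greatest common divisor of |x| and |y| (Euclid's algorithm). */
static int gcd(int x, int y) {
    x = abs(x);
    y = abs(y);
    while (y != 0) {
        int t = x % y;
        x = y;
        y = t;
    }
    return x;
}

/* Write A(a*I + b), read A(c*J + d).
   The equation a*I - c*J = d - b has integer solutions only if
   gcd(a, c) divides d - b.
   false: a dependence is impossible.
   true:  a dependence may exist (bounds are not checked). */
bool gcd_may_depend(int a, int b, int c, int d) {
    int g = gcd(a, c);
    if (g == 0)              /* both subscripts are constants */
        return b == d;
    return (d - b) % g == 0;
}
```

For A(2*I) = A(2*J + 1) the test proves independence, since 2 does not divide 1. For A(I + 1) = A(I) it answers "may depend", which matches the example above.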

Alias analysis

Alias analysis is a technique used to determine if a storage location may be accessed in more than one way. Two pointers are said to be aliased if they point to the same location.

To find the dependences in a program correctly, it is important to remove any memory ambiguity, that is, to identify all objects that may overlap in memory.

An optimizing compiler uses all available information to make this decision:

  • features of the language
  • results of interprocedural analysis
  • local points-to analysis, etc.

If there are objects that the compiler cannot prove to occupy different locations, it must act conservatively and forbid the permutation optimization.
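A small C sketch shows why the conservative choice is necessary. The function add_one is a hypothetical example: nothing in its signature tells the compiler whether dst and src overlap, and the result differs between the two cases.

```c
#include <assert.h>

/* dst[i] = src[i] + 1 for i in [0, n).
   If dst and src overlap, iteration i may read a value that an earlier
   iteration wrote, so the iterations cannot be freely reordered. */
void add_one(int *dst, const int *src, int n) {
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        dst[i] = src[i] + 1;
}
```

Called on two separate arrays of zeros, add_one fills dst with ones. Called as add_one(buf + 1, buf, 3) on a zeroed array of four elements, it produces {0, 1, 2, 3}, because each iteration reads the element written by the previous one. Any reordering that is safe for the first call would give a different answer for the second.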

File sub.c

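int sub(int *a, float *b, int n) {
    int i;
    for (i = 0; i < n; i++) {    /* first loop: clears the int array */
        a[i] = 0;
    }
    for (i = 0; i < n; i++) {    /* second loop: clears the float array */
        b[i] = 0.0f;
    }
    return 0;
}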

File main.c

#include <stdio.h>
#include <stdlib.h>

extern int sub(int *a, float *b, int n);

int main(void) {
    int *a;
    float *b;
    a = (int *)malloc(100 * sizeof(int));
    b = (float *)malloc(100 * sizeof(float));
    if (a == NULL || b == NULL)
        return 1;

    sub(a, b, 100);
    printf("%d;%f\n", a[0], b[0]);
    free(a);
    free(b);
    return 0;
}

Fig. 4.20.
icc main.c 2.c -O3
2.c(3): (col. 1) remark: LOOP WAS VECTORIZED.
2.c(6): (col. 1) remark: LOOP WAS VECTORIZED.

icc main.c 2.c -O3 -ansi-alias
2.c(3): (col. 1) remark: FUSED LOOP WAS VECTORIZED.

What's the difference?

-[no-]ansi-alias enable/disable(DEFAULT) use of ANSI aliasing rules in optimizations; user asserts that the program adheres to these rules.

The ANSI aliasing rules require that a pointer refer only to objects of the same or a compatible type (character types are the standard exception). In practice this means that pointers of incompatible types cannot address the same memory.
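The effect on the two loops of sub.c can be sketched by hand. Once the compiler may assume that an int * and a float * never address the same memory, the stores to a and to b are independent, and the two loops can be merged into one. The function names below are hypothetical; the second function shows the shape of the fused loop, not the compiler's actual output.

```c
#include <assert.h>

/* Original shape: two separate loops, as in sub.c. */
void zero_separate(int *a, float *b, int n) {
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        a[i] = 0;
    for (int i = 0; i < n; i++)
        b[i] = 0.0f;
}

/* Fused shape: one pass over both arrays. The stores to a and b are
   interleaved, which is legal only if they are known to be independent. */
void zero_fused(int *a, float *b, int n) {
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        a[i] = 0;
        b[i] = 0.0f;
    }
}
```

For arrays that do not overlap, both functions leave a and b filled with zeros.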

Using the -ansi-alias option at compile time allows the compiler to perform more aggressive optimizations.

The language itself also provides a way to help the compiler with alias analysis. In C (since C99), a pointer can be declared with the restrict qualifier; C++ has no standard restrict, but many compilers offer an equivalent extension such as __restrict. The qualifier means that the memory associated with the pointer is accessed only through that pointer or through an expression based on it.

int sub (int * a, float * b, int n)
=>
int sub (int * restrict a, float * restrict b, int n)
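The qualifier matters most when type-based rules cannot help, for example when both pointers have the same type. The following sketch uses a hypothetical function axpy, which is not part of the lecture's example. The caller is responsible for keeping the promise: passing overlapping arrays to this function is undefined behavior.

```c
#include <assert.h>

/* y[i] += k * x[i] for i in [0, n).
   restrict promises that, inside this function, the elements reached
   through y are not accessed through x and vice versa, so the compiler
   does not have to assume that a store to y[i] can change some x[j]. */
void axpy(float *restrict y, const float *restrict x, float k, int n) {
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        y[i] += k * x[i];
}
```

With y = {1, 2, 3}, x = {1, 1, 1} and k = 2, the call axpy(y, x, 2.0f, 3) leaves y equal to {3, 4, 5}.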

Fortran has stricter rules for pointers. The language does have pointers to arrays, but every array that can be referenced through a pointer must be explicitly declared as TARGET. By default, function arguments cannot refer to the same location. These rules simplify alias analysis.

The compiler has a special option that reports the results of different optimizations:

/Qopt-report[:n]  generate an optimization report to stderr
    0  disable optimization report output
    1  minimum report output
    2  medium output (DEFAULT when enabled)
    3  maximum report output

Example:

LOOP INTERCHANGE in loops at line: 8 9 
Loopnest permutation ( 1 2 ) --> ( 2 1 )
Fusion loop partitions: (loop line numbers)
Fused Loops: ( 9 14 )
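The source code behind this report is not shown, so the following C sketch is only a hypothetical loop nest that illustrates what a permutation of loops 1 and 2 means. The names and the array size are assumptions made for the example.

```c
#include <assert.h>

enum { ROWS = 4, COLS = 3 };

/* Original order: outer loop over columns, inner loop over rows.
   Consecutive inner iterations are COLS elements apart in memory. */
long sum_original(int m[ROWS][COLS]) {
    long s = 0;
    for (int j = 0; j < COLS; j++)
        for (int i = 0; i < ROWS; i++)
            s += m[i][j];
    return s;
}

/* Interchanged order: the two loops swap places, so the inner loop
   now walks each row contiguously. Integer addition can be reordered
   freely, so the result is the same. */
long sum_interchanged(int m[ROWS][COLS]) {
    long s = 0;
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            s += m[i][j];
    return s;
}
```

Both functions return the same sum for any input; only the order in which the elements are visited differs.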