Most important Intel Compiler Options and Directives
Compiler Options
Option | Description |
---|---|
-O0 | Disables all optimizations. Recommended for program development and debugging |
-O1 | Enables optimization for speed while keeping code size in check (e.g. no loop unrolling) |
-O2 | Default optimization. Optimizations for speed, including global code scheduling, software pipelining, predication, and speculation. |
-O3 | -O2 optimizations plus more aggressive ones such as data prefetching, scalar replacement, and loop transformations. Intended for technical computing applications with loop-intensive code. |
-Oi | Inline expansion of intrinsic functions |
-xcode | SSE4.2: Generates SSE4.2 instructions (for the Westmere fat nodes of SuperMUC Phase 1). host: Tells the compiler to generate instructions for the highest instruction set available on the compilation host. |
-xcode | SANDYBRIDGE, HASWELL, KNL, SKYLAKE-AVX512: May generate instructions for processors that support the specified Intel® microarchitecture code name. These keywords are only available with Intel compilers version 18.0 and higher. |
-axcode1,code2 | This option tells the compiler to generate multiple, processor-specific auto-dispatch code paths for Intel processors if there is a performance benefit. It also generates a baseline code path which can run on non-AVX processors. The Intel processor-specific auto-dispatch path is usually more optimized than the baseline path. May generate Intel(R) Advanced Vector Extensions 2 (AVX2), AVX, SSE4.2, SSE4.1, SSE3, SSE2, SSE, and SSSE3 instructions for Intel(R) processors. |
-qopt-zmm-usage=[low|high] | low: Tells the compiler that the compiled program is unlikely to benefit from zmm register usage, so the compiler avoids zmm registers unless it can prove a gain from using them (default for CORE-AVX512). high: Tells the compiler to generate zmm code without restrictions (default for COMMON-AVX512). |
-fno-alias | Specifies that aliasing should not be assumed in the program. Allows the compiler to generate faster code, but is only safe if pointers and arguments really never overlap. |
-ftz | Flushes denormal results to zero (default with -O3) |
-ipo | Enables interprocedural (IP) optimizations, e.g. inline function expansion for calls to functions defined in separate files |
-p | Compiles and links for function profiling with gprof. |
-prof-use | Uses previously collected profiling information during optimization |
-g | Produces a symbol table, i.e. line numbers for profiling are available. |
-qopenmp | Enables the parallelizer to generate multithreaded code based on OpenMP directives. |
-parallel | Tells the auto-parallelizer to generate multithreaded code for loops that can be safely executed in parallel. To use this option, you must also specify -O2 or -O3. |
-qopt-report | Generates an optimization report. Older compiler versions write it to stderr; newer versions write a .optrpt file by default. |
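
To make the aliasing option above more concrete, here is a minimal C sketch. The function name `axpy`, the file name, and the build line in the comment are illustrative assumptions, not part of the original page. The `restrict` qualifiers make, for two individual pointers, the same promise that -fno-alias makes for the whole compilation unit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel: y[i] += alpha * x[i].
 *
 * One possible build line (file name is an assumption):
 *   icc -O2 -xhost -fno-alias -c axpy.c
 *
 * `restrict` tells the compiler that x and y never overlap. This is the
 * per-pointer, source-level counterpart of the global promise made by
 * -fno-alias, and it lets the loop be vectorized without runtime
 * overlap checks. */
void axpy(size_t n, double alpha, const double *restrict x, double *restrict y)
{
    /* Basic argument check; compiled out when NDEBUG is defined. */
    assert(n == 0 || (x != NULL && y != NULL));

    for (size_t i = 0; i < n; ++i) {
        y[i] += alpha * x[i];
    }
}
```

If the arrays can in fact overlap, neither `restrict` nor -fno-alias is safe, and the program may silently produce wrong results.
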
Compiler Directives for the Intel compiler
The following table shows the source code directives as supported by the Intel Fortran compiler to help with tuning or debugging applications. Note that for fixed source format the "!" comment symbol in the first column needs to be replaced with a "c" comment symbol.
Directive | Meaning |
---|---|
!DIR$ IVDEP | Ignore vector dependencies |
!DIR$ SWP | Software pipelining hint |
!DIR$ DISTRIBUTE POINT | Split large loop |
!DIR$ UNROLL(N) | Unroll inner loop N times. Compiler heuristics are used if N is omitted. |
!DIR$ NOUNROLL | Do not unroll loop |
!DIR$ PREFETCH A | Prefetch array A |
!DIR$ NOPREFETCH A | Do not prefetch array A |
!DIR$ VECTOR CLAUSE | Vectorize loop, CLAUSE = { ALWAYS [ASSERT]|ALIGNED|UNALIGNED|TEMPORAL|NONTEMPORAL [(var1 [, var2]...)] }. For further details please see the compiler documentation. |
!DIR$ NOVECTOR | Do not vectorize loop. |
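
The table above covers Fortran. As a rough illustration of how such hints are attached to loops, the following sketch uses the analogous pragmas of the Intel C/C++ compiler (`#pragma ivdep`, `#pragma unroll(n)`). The function names and the data layout are hypothetical, and compilers that do not know these pragmas will typically ignore them, possibly with a warning.

```c
#include <stddef.h>

/* Hypothetical kernel: a[i] = a[i + m] + 1.
 * The compiler does not know the sign of m, so it must assume a possible
 * loop-carried dependence. The caller guarantees m > 0, and the ivdep
 * hint passes that knowledge on to the vectorizer. */
void shift_add(double *a, size_t n, ptrdiff_t m)
{
#pragma ivdep
    for (size_t i = 0; i < n; ++i) {
        a[i] = a[(ptrdiff_t)i + m] + 1.0;
    }
}

/* Hypothetical reduction: asks for the loop body to be unrolled 4 times.
 * Omitting the count would leave the factor to compiler heuristics. */
double sum(const double *a, size_t n)
{
    double s = 0.0;
#pragma unroll(4)
    for (size_t i = 0; i < n; ++i) {
        s += a[i];
    }
    return s;
}
```

These hints do not change what the loop computes as long as the stated guarantee holds. If `m` were negative, the `ivdep` hint would be wrong and a vectorized build could give different results.
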