Compilers: Intel(R) C++ Compiler and Intel(R) Visual Fortran Compiler
for applications running on Intel(R) 64, Version 10.1
Operating system: Windows Vista Ultimate, x64 Version
Last updated: 02-Jun-2008
The text for many of the descriptions below was taken
from the documentation of the Intel Compilers.
This documentation is copyright © 2007 Intel Corporation. All Rights Reserved.
The original documentation is distributed with the Intel compilers.
HEADER for OPTIMIZATION
This option instruments a program for profiling as the first step in Profile Guided Optimization.
Profile Guided Optimization (PGO) consists of 3 phases:
Phase 1: Compile and generate instrumented code in preparation
to gather profiling information (compiler flag -Qprof_gen).
Phase 2: Execute the instrumented code and gather profiling information.
Phase 3: Recompile the code and use the profiling information
for improved optimization (compiler flag -Qprof_use).
The option -Qprof_gen instruments a program for profiling to get the execution count of each basic block. It also creates a new static profile information file (.spi). This flag is used in phase 1 of the Profile Guided Optimizer (PGO) to instruct the compiler to produce code in your object files in preparation for instrumented execution.
The instrumented code
This option enables the use of profiling information during optimization as the final step in Profile Guided Optimization.
The option -Qprof_use instructs the compiler to use the profiling information from phase 2 of PGO in order to produce a profile-optimized executable (phase 3 of PGO).
It also enables function splitting (option -Qfnsplit) and function grouping during optimization.
Note that there is no way to turn off function grouping if you enable it using this option.
The recompilation with -Qprof_use
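As an illustration of the kind of code the three phases target, the following C sketch shows a loop whose branch bias is only known at run time. It is a minimal example, not taken from the Intel documentation: the function name, the file name hot.c, and the driver name icl in the comment are assumptions, and only the flags -Qprof_gen and -Qprof_use come from the text above.

```c
#include <stddef.h>

/*
 * Counts the values below a threshold. If training runs show that the
 * "below" branch is taken almost always, a profile-guided build can use
 * that execution count when laying out the code.
 *
 * Assumed build sequence (file and executable names are hypothetical):
 *   phase 1:  icl -Qprof_gen hot.c
 *   phase 2:  hot.exe            (run with representative input)
 *   phase 3:  icl -Qprof_use hot.c
 */
size_t count_small_values(const int *values, size_t n, int threshold)
{
    size_t count = 0;
    size_t i;
    for (i = 0; i < n; ++i) {
        if (values[i] < threshold) {   /* branch whose bias the profile records */
            ++count;
        }
    }
    return count;
}
```

The benefit, if any, depends on how closely the phase 2 input resembles production input.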
Maximizes speed across the entire program.
On Windows, it sets the following options:
-O3 -Qipo -Qprec-div- -QxT
Note that programs compiled with the -QxT option
will detect non-compatible processors and generate
an error message during execution.
The -QxT option that is set by the -fast option
cannot be overridden by other command line options.
If you specify -fast and a different processor-specific option,
such as -QxN, the compiler will issue a warning that explains
that the -QxT option cannot be overridden.
Optimizes for speed. Enables high-level optimization. This level does not guarantee higher performance, and using it may increase compilation time. The impact on performance is application dependent; some applications may not see an improvement.
The optimizations include:
-Qprec-div improves precision of floating-point divides. It has a slight impact on speed. -Qprec-div- disables this option.
With some optimizations, such as -QxN and -QxB, the compiler may change floating-point division computations into multiplication by the reciprocal of the denominator. For example, A/B is computed as A * (1/B) to improve the speed of the computation.
However, sometimes the value produced by this transformation is not as accurate as full IEEE division. When it is important to have fully precise IEEE division, use this option to disable the floating-point division-to-multiplication optimization. The result is more accurate, with some loss of performance.
If you specify -Qprec-div-, it enables optimizations that give slightly less precise results than full IEEE division.
Default is -Qprec-div
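The difference between A/B and A * (1/B) can be observed directly. The C sketch below writes the reciprocal transformation out by hand; it does not show what the compiler emits, and the function names are hypothetical. It assumes IEEE double arithmetic with the default rounding mode.

```c
/* Full IEEE division: the quotient is correctly rounded. */
double divide_exact(double a, double b)
{
    return a / b;
}

/* Division rewritten by hand as multiplication by the reciprocal.
 * The volatile store keeps the compiler from folding the two steps
 * back into a single division. */
double divide_by_reciprocal(double a, double b)
{
    volatile double reciprocal = 1.0 / b;
    return a * reciprocal;
}

/* Counts the integers d in [1, limit] for which d/d and d*(1/d) differ. */
int count_mismatches(int limit)
{
    int d;
    int mismatches = 0;
    for (d = 1; d <= limit; ++d) {
        double x = (double)d;
        if (divide_exact(x, x) != divide_by_reciprocal(x, x)) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

A denominator such as 49 is a commonly cited case: 49.0/49.0 is exactly 1, while the two-step form rounds twice and can land one unit in the last place below 1.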
-Qxprocessor This option directs the compiler to generate specialized and optimized code for the Intel processor that executes your program. It lets you target your program to run on a specific Intel processor.
processor Is the processor
for which you want to target your program.
Here: T Generates SSSE3, SSE3, SSE2, and SSE instructions for Intel processors
and can optimize for the Intel Core 2 Duo processor family.
The resulting code may contain unconditional use of features
that are not supported on other processors.
In addition to Intel processor-specific optimizations, this option enables
new optimizations, including advanced data layout and code restructuring
optimizations that improve memory accesses for Intel processors.
Programs compiled with -QxT will display a fatal run-time error if they are executed on unsupported processors.
-Qipo[n]
This option enables interprocedural optimizations between files. This is also called multifile interprocedural optimization (multifile IPO) or Whole Program Optimization (WPO).
When you specify this option, the compiler performs inline function expansion for calls to functions defined in separate files.
You cannot specify the names for the object files that are created.
n Is an optional integer that specifies
the number of object files the compiler should create.
The integer must be greater than or equal to 0.
If you do not specify n, the default is 0.
If n is 0, the compiler decides whether to create one or more object files based on an estimate of the size of the application. It generates one object file for small applications, and two or more object files for large applications.
If n is greater than 0, the compiler generates n object files, unless n exceeds the number of source files (m), in which case the compiler generates only m object files.
Optimizes for speed.
The -O2 option includes the following options:
This option defaults to ON.
This option also enables:
Enables single-file interprocedural optimizations within a file.
This option tells the auto-parallelizer to generate multithreaded code for loops that can be safely executed in parallel.
To use this option, you must also specify -O2 or -O3.
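What "can be safely executed in parallel" means is easiest to see by contrasting two loops. The C sketch below is illustrative only; the function names are hypothetical, and whether a given compiler actually parallelizes the first loop depends on its own analysis.

```c
#include <stddef.h>

/* Each iteration writes only out[i] and reads only in[i]. As long as the
 * two arrays do not overlap, the iterations are independent and may run
 * in any order, which is what an auto-parallelizer looks for. */
void scale_array(double *out, const double *in, size_t n, double factor)
{
    size_t i;
    for (i = 0; i < n; ++i) {
        out[i] = in[i] * factor;
    }
}

/* Iteration i reads the value written by iteration i-1 (a loop-carried
 * dependence), so the iterations cannot simply be split across threads. */
void prefix_sum(double *data, size_t n)
{
    size_t i;
    for (i = 1; i < n; ++i) {
        data[i] += data[i - 1];
    }
}
```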
Enables cache/bandwidth optimization for stores under conditionals (within vector loops). This option tells the compiler to perform a conditional check in a vectorized loop. This checking avoids unnecessary stores and may improve performance by conserving bandwidth.
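At source level, a store under a conditional looks like the loop below. This is a hand-written C sketch with a hypothetical function name; it shows the pattern the option is concerned with, not the vector code the compiler generates for it.

```c
#include <stddef.h>

/* Clamps negative entries to zero and returns the number of stores made.
 * The store happens only when the condition holds, so elements that are
 * already non-negative cause no write traffic. */
size_t clamp_negatives(double *data, size_t n)
{
    size_t i;
    size_t stores = 0;
    for (i = 0; i < n; ++i) {
        if (data[i] < 0.0) {    /* guard: skip the store when nothing changes */
            data[i] = 0.0;
            ++stores;
        }
    }
    return stores;
}
```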
Enables the compiler to generate run-time control code for effective automatic parallelization. This option generates code to perform run-time checks for loops that have symbolic loop bounds. If the granularity of a loop is greater than the parallelization threshold, the loop will be executed in parallel. If you do not specify this option, the compiler may not parallelize loops with symbolic loop bounds when the compile-time granularity estimate cannot ensure that parallelizing the loop is beneficial.
Enables or disables (default) the use of ANSI aliasing rules in optimizations. By enabling it, the user asserts that the program adheres to these rules.
Enables function splitting.
This option enables function splitting if -Qprof-use is also specified. Otherwise, this option has no effect.
It is enabled automatically if you specify -Qprof-use. If you do not specify one of those options, the default is -Qfnsplit-, which disables function splitting but leaves function grouping enabled.
To disable function splitting when you use -Qprof-use, specify -Qfnsplit-.
Selects the method that the register allocator uses to partition each routine into regions.
Multi-versioning is used for generating different versions of the loop based on run time dependence testing, alignment and checking for short/long trip counts. If this option is turned on, it will trigger more versioning at the expense of creating more overhead to check for pointer aliasing and scalar replacement.
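The idea of choosing a loop version from a run-time dependence test can be sketched by hand. The C code below is only an analogy under stated assumptions: the function names are hypothetical, the overlap test compares addresses as integers via uintptr_t, and a compiler's own versioning may test other properties (alignment, trip count) as the text describes.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 when the n-element ranges starting at a and b share any bytes. */
static int ranges_overlap(const double *a, const double *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;
    uintptr_t bytes = (uintptr_t)(n * sizeof(double));
    return pa < pb + bytes && pb < pa + bytes;
}

/* Computes dst[i] = src[i] + 1 with two hand-written loop versions.
 * Returns 1 if the "no overlap" version ran, 0 if the conservative one ran. */
int add_one_versioned(double *dst, const double *src, size_t n)
{
    size_t i;
    if (!ranges_overlap(dst, src, n)) {
        /* Version A: iterations are independent and free to be vectorized. */
        for (i = 0; i < n; ++i) {
            dst[i] = src[i] + 1.0;
        }
        return 1;
    }
    /* Version B: possible overlap, so strict element order is kept. */
    for (i = 0; i < n; ++i) {
        dst[i] = src[i] + 1.0;
    }
    return 0;
}
```

The run-time check is the overhead the text mentions: it is paid on every call, whichever version is selected.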
Specifies whether streaming stores are generated:
Enables global optimizations.
Enables/disables inline expansion of intrinsic functions.
Default enabled
This option enables most speed optimizations, but disables some that increase code size for a small speed benefit.
Default enabled
Enables [disables] the use of the EBP register in optimizations. When you disable with -Oy-, the EBP register is used as frame pointer. -Oy- has the effect of reducing the number of available general-purpose registers by 1, and can produce slightly less efficient code.
Default enabled
n = 0
Disables inlining of user-defined functions.
However, statement functions are always inlined.
n = 1
Enables inlining of functions declared with the __inline keyword.
Also enables inlining according to the C++ language.
n = 2
Enables inlining of any function.
However, the compiler decides which functions are inlined.
This option enables interprocedural optimizations and has the same
effect as specifying option -Qip.
Default enabled with n = 2.
This option enables read-only string-pooling optimization.
Disables stack-checking for routines with n or more bytes of local variables and compiler temporaries.
Default enabled with n = 4096.
Tells the compiler to assume [not to assume] that there is no aliasing.
Default disabled
Enables all speed optimizations.
Overrides -Os
Tells the compiler to assume [not to assume] that there is no cross-function aliasing.
Enables string-pooling optimization.
Packages functions to enable linker optimization.
Default enabled
Generates specialized code for the processor-specific codes K, W, N, P while also generating generic IA-32 code.
Enables [disables] fast floating-point to integer conversions. This option does not guarantee that any particular rounding mode will be used.
for C and C++
If your program satisfies the above conditions, setting the -Qansi_alias flag will help the compiler better optimize the program. However, if your program does not satisfy one of the above conditions, the -Qansi_alias flag may lead the compiler to generate incorrect code.
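One widely known requirement of the ANSI aliasing rules is that an object is not read through a pointer to an unrelated type. The C sketch below shows a conforming way to inspect the bits of a float; it is an illustration rather than a restatement of the conditions referred to above. The function name is hypothetical, and the code assumes that float and unsigned int are both 4 bytes wide, as on Intel 64.

```c
#include <string.h>

/* Returns the bit pattern of a float without breaking the aliasing rules:
 * the bytes are copied with memcpy instead of being read through a cast
 * such as *(unsigned int *)&value, which accesses a float object through
 * an unsigned int lvalue. */
unsigned int float_bits(float value)
{
    unsigned int bits = 0;
    memcpy(&bits, &value, sizeof bits);   /* assumes sizeof(float) == sizeof(unsigned int) */
    return bits;
}
```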
for Fortran
Tells the compiler to assume (default) or not to assume that the program adheres to the ANSI Fortran type aliasability rules.
For example, an object of type real cannot be accessed as an integer.
You should see the ANSI Standard for the complete set of rules.
Rounds floating-point results at assignments and casts (some speed impact).
This option flushes denormal results to zero when the application is in the gradual underflow mode. It may improve performance if the denormal values are not critical to your application's behavior.
This option only has an effect when the main program is being compiled. It sets the ftz mode for the process.
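To make the term concrete, the C sketch below identifies a denormal value and models flushing it to zero in software. It does not change the processor's ftz mode; the function names are hypothetical, and the comments assume IEEE double arithmetic in the default gradual underflow mode.

```c
#include <float.h>

/* Halves x. The volatile copy keeps the division from being folded away. */
double halve(double x)
{
    volatile double v = x;
    return v / 2.0;
}

/* Returns 1 if x is nonzero but smaller in magnitude than DBL_MIN, the
 * smallest normalized double, i.e. if x is a denormal. */
int is_denormal(double x)
{
    double magnitude = x < 0.0 ? -x : x;
    return magnitude > 0.0 && magnitude < DBL_MIN;
}

/* Software model of flush-to-zero: denormals become 0, all other values
 * pass through unchanged. */
double flush_denormal(double x)
{
    return is_denormal(x) ? 0.0 : x;
}
```

With gradual underflow, halve(DBL_MIN) is a denormal; after flushing, it is exactly zero, which is the behavior change an application must be able to tolerate.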
This option enables prefetch insertion optimization. The goal of prefetching is to reduce cache misses by providing hints to the processor about when data should be loaded into the cache.
Default is -Qprefetch- which disables this kind of optimization.
-Qunrolln tells the compiler the maximum number
of times to unroll loops.
If n is not specified, the optimizer determines
how many times loops can be unrolled.
If n is 0, loop unrolling is disabled.
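The transformation itself can be written out by hand. The following C sketch unrolls a summation loop four times; it only illustrates what "unrolling n times" means and is not the code the compiler produces. Function names are hypothetical.

```c
#include <stddef.h>

/* Straightforward loop: one addition and one loop test per element. */
double sum_rolled(const double *x, size_t n)
{
    double total = 0.0;
    size_t i;
    for (i = 0; i < n; ++i) {
        total += x[i];
    }
    return total;
}

/* The same loop unrolled four times by hand, with a remainder loop for
 * the last n % 4 elements. The additions keep their original order, so
 * the result matches sum_rolled. */
double sum_unrolled4(const double *x, size_t n)
{
    double total = 0.0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        total += x[i];
        total += x[i + 1];
        total += x[i + 2];
        total += x[i + 3];
    }
    for (; i < n; ++i) {
        total += x[i];
    }
    return total;
}
```

The unrolled version executes one loop test per four elements at the cost of larger code, which is the trade-off the limit n controls.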
Enables more aggressive unrolling heuristics
This option places local variables, except those declared as SAVE, on the run-time stack. It is as if the variables were declared with the AUTOMATIC attribute.
It does not affect variables that have the SAVE attribute or ALLOCATABLE attribute, or variables that appear in an EQUIVALENCE statement or in a common block.
This option may provide a performance gain for your program, but if your program depends on variables having the same value as the last time the routine was invoked, your program may not function properly.
If you want to cause variables to be placed in static memory, specify /Qsave (Windows).
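The hazard described above has a close analogue in C, where static locals behave roughly like Fortran variables with the SAVE attribute and ordinary locals are automatic. The sketch below is that analogy only, with hypothetical function names; it is not Fortran and does not involve the option itself.

```c
/* Static storage: the counter keeps its value between calls, comparable
 * to a Fortran local variable with the SAVE attribute. */
int next_id_static(void)
{
    static int counter = 0;
    return ++counter;
}

/* Automatic storage: the counter lives on the run-time stack and is
 * re-created on every call, so earlier calls are not remembered. */
int next_id_automatic(void)
{
    int counter = 0;
    return ++counter;
}
```

A routine written like the first function but compiled so that its variable becomes automatic would start behaving like the second, which is the failure mode the text warns about.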
Specifies the strictest alignment constraint for structure and union types as 1, 2, 4, 8, or 16 bytes.
Default is 16.
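The effect of an alignment limit on structure layout can be seen for a single type with #pragma pack, which is used here as a stand-in for the command-line option. This is a sketch under assumptions: the structure and function names are hypothetical, and the sizes in the comments assume a 4-byte int with 4-byte natural alignment.

```c
#include <stddef.h>

/* Natural alignment: the int member is aligned to its own size, so
 * padding follows the char member (typically 8 bytes in total). */
struct record_natural {
    char tag;
    int  value;
};

/* Alignment limited to 1 byte for this one type: no padding is inserted
 * (typically 5 bytes in total). */
#pragma pack(push, 1)
struct record_packed {
    char tag;
    int  value;
};
#pragma pack(pop)

/* Number of padding bytes removed by the 1-byte limit. */
size_t padding_saved(void)
{
    return sizeof(struct record_natural) - sizeof(struct record_packed);
}
```

Tighter packing saves memory but can make member accesses unaligned, so it is a trade-off rather than a pure optimization.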
Enables the compiler to use SSE instructions.
Enables the compiler to use SSE2 instructions.
Enables floating-point significand precision control. The value is used to round the significand to the correct number of bits. The value must be either 32, 64 or 80.
Default enabled
Determines whether local variables are put on the run-time stack.
Enables [disables] scalar replacement performed during loop transformations
(requires /O3).
This option enables standard C++ features without disabling Microsoft features within the bounds of what is provided in the Microsoft headers and libraries.
This option has the same effect as specifying -GX -GR.
-GX Enables C++ exception handling.
-GR Enables C++ Run Time Type Information (RTTI).
Specifies the stack reserve amount for the program.
-F<n>
<n> is the stack reserve amount.
It can be specified as a decimal integer or by using a C-style convention
for constants (for example, -F0x1000).
Default: The stack size default is chosen by the operating system.
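The "C-style convention for constants" means that 0x1000 and 4096 name the same amount. The C sketch below shows how the standard library interprets such text; it says nothing about how the compiler driver itself parses the option, and the function name is hypothetical.

```c
#include <stdlib.h>

/* Parses a size written as a decimal integer or as a C-style constant
 * (0x... for hexadecimal, a leading 0 for octal). Returns 0 if the text
 * is not a complete number. */
unsigned long parse_reserve_amount(const char *text)
{
    char *end = NULL;
    unsigned long value = strtoul(text, &end, 0);   /* base 0: infer from prefix */
    if (end == text || *end != '\0') {
        return 0;
    }
    return value;
}
```

For example, parse_reserve_amount("0x1000") and parse_reserve_amount("4096") both yield 4096.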
Forces linking even if multiple entry names are found.
Link with MicroQuill SmartHeap Library.
Available from
http://www.microquill.com/
Link with MicroQuill SmartHeap Library (64-bit version).
Available from
http://www.microquill.com/
The use of -Qparallel to generate auto-parallelized code requires support libraries that are dynamically linked by default. Specifying libguide40.lib on the link line statically links that library, which allows auto-parallelized binaries to work on systems that do not have the dynamic version of the library installed.
HEADER for PORTABILITY
-TP tells the compiler to process all source or unrecognized file types
as C++ source files.
Default: The compiler assumes that files with the extension .c or .C
are C source files.
To handle them as C++ source files, the compiler flag -TP is needed.
-Qlowercase causes the compiler to ignore case differences in identifiers
and to convert external names to lowercase.
It is needed to specify the naming convention for mixing C and Fortran codes.
-assume:[no]underscore
Determines whether the compiler appends an underscore character
to external user-defined names.
-assume:underscore is needed to specify the naming convention
for mixing C and Fortran codes.
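On the C side, a naming convention of lowercase names with a trailing underscore means the C function must be defined under exactly that external name. The sketch below assumes this convention and a hypothetical Fortran caller that invokes a double precision function SCALED_SUM(X, N, FACTOR); the routine name and argument list are invented for illustration.

```c
/* C routine meant to be callable from Fortran as SCALED_SUM(X, N, FACTOR),
 * assuming external names are lowercased and given a trailing underscore.
 * Fortran passes arguments by reference, so every parameter is a pointer. */
double scaled_sum_(const double *x, const int *n, const double *factor)
{
    double total = 0.0;
    int i;
    for (i = 0; i < *n; ++i) {
        total += x[i];
    }
    return total * (*factor);
}
```

If the two compilers disagree on case or on the underscore, the linker reports an unresolved external even though both sides look correct in source.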
Unsets a buggy macro in the current version of Microsoft Visual Studio 2005.
-Qoption,string,options This option passes options to a specified tool.
string Is the name of the tool.
Here: cpp indicates the C++ preprocessor.
options Are one or more comma-separated,
valid options for the designated tool.
Here: --no_wchar_t_keyword is passed to the C++ preprocessor to indicate
that there is no wchar_t keyword.
This flag must be used with Microsoft Visual Studio 2005.
It avoids syntax errors coming from the use of wchar_t in 483.xalancbmk.
HEADER for COMPILER
Invokes the Intel C/C++ compiler.
Also used to invoke the linker for C/C++ programs.
Invokes the Intel Fortran compiler.
Also used to invoke the linker for Fortran programs
and C/Fortran mixtures.
This option enables/disables C99 support for C programs.
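As a reminder of what C99 support covers, the short C function below uses several features that C99 added to the earlier standard. It is a generic illustration with a hypothetical function name, not a list of what this particular compiler version accepts.

```c
/* C99 features used here: // comments, the long long type, a declaration
 * after a statement, and a loop variable declared inside the for statement. */
long long sum_of_squares(int n)
{
    long long total = 0;
    if (n < 0) {
        n = 0;
    }
    // declaration after a statement (not allowed before C99)
    long long limit = n;
    for (long long i = 1; i <= limit; ++i) {
        total += i * i;
    }
    return total;
}
```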
Specifies compatibility with Microsoft Visual Studio .NET 2003.
Specifies compatibility with Microsoft Visual Studio 2005.