AIX 5L/6 with IBM XL Compilers SPEC CPU Flags

Optimization Flags

-O5
- -O4
  - -O3
    - -O2
      - -O
    - -qhot=level=0
  - -qipa=level=1
  - -qarch=auto
  - -qtune=auto
- -qipa=level=2
- -qarch=auto
- -qtune=auto
-O4
- -O3
  - -O2
    - -O
  - -qhot=level=0
- -qipa=level=1
- -qarch=auto
- -qtune=auto
-O3
- -O2
  - -O
- -qhot=level=0
-O2
- -O
  - -O2
-O
- -O2
  - -O
-qarch
-qtune
-qinlglue
-qhot
-qipa=level
-qpdf1
-qpdf2
-qfdpr
-qxlf90=nosignedzero
-q64
-bmaxdata:0x80000000
-bdatapsize:64K
-bstackpsize:64K
-btextpsize:64K
-blpdata
-qlargepage
-qalloca
-qsmallstack=dynlenonheap
-qenablevmx
-qvecnvol
-D__IBM_FAST_VECTOR
-D_ILS_MACROS
-qrtti=all
-qalias=noansi
-qalign
-qstrict

- -O5
- -O5\b
- Perform optimizations for maximum performance. This includes maximum interprocedural analysis on all of the objects presented on the "link" step. This level of optimization will increase the compiler's memory usage and compile time requirements. -O5 provides all of the functionality of the -O4 option, plus the functionality of the -qipa=level=2 option.
  
  -O5 is equivalent to the following flags
  - -O4
  - -qipa=level=2
  - -qarch=auto
  - -qtune=auto
- Includes:
  - -O4
    - -O3
      - -O2
        - -O
      - -qhot=level=0
    - -qipa=level=1
    - -qarch=auto
    - -qtune=auto
  - -qipa=level=2
  - -qarch=auto
  - -qtune=auto
- -O4
- -O4\b
- Perform optimizations for maximum performance. This includes interprocedural analysis on all of the objects presented on the "link" step.
  
  -O4 is equivalent to the following flags
  - -O3
  - -qipa=level=1
  - -qarch=auto
  - -qtune=auto
- Includes:
  - -O3
    - -O2
      - -O
    - -qhot=level=0
  - -qipa=level=1
  - -qarch=auto
  - -qtune=auto
- -O3
- -O3\b
- -O3 Performs additional optimizations that are memory intensive, compile-time intensive, and may change the semantics of the program slightly, unless -qstrict is specified. We recommend these optimizations when the desire for run-time speed improvements outweighs the concern for limiting compile-time resources. The optimizations provided include:
  - In-depth memory access analysis
  - Better loop scheduling
  - High-order loop analysis and transformations (-qhot=level=0)
  - Inlining of small procedures within a compilation unit by default
  - Eliminating implicit compile-time memory usage limits
  - Widening, which merges adjacent load/stores and other operations
  - Pointer aliasing improvements to enhance other optimizations
  
  -O3 is equivalent to the following flags
  - -O2
  - -qhot=level=0
- Includes:
  - -O2
    - -O
      - -O2
  - -qhot=level=0
- -O2
- -O2\b
- -O2 performs a set of optimizations intended to improve performance without an unreasonable increase in the time or storage required for compilation, including:
  - Elimination of redundant code
  - Basic loop optimization
  - Structuring code to take advantage of -qarch and -qtune settings
- Includes:
  - -O
    - -O2
      - -O
- -O
- -O\b
- -O enables the level of optimization that represents the best tradeoff between compilation speed and run-time performance. If you need a specific level of optimization, specify the appropriate numeric value. Currently, -O is equivalent to -O2.
- Includes:
  - -O2
    - -O
      - -O2
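The "is equivalent to" lists above can be followed mechanically. The sketch below is illustrative only: the table is transcribed from the -O5, -O4, -O3 and -O entries, while the `EQUIVALENT_TO` and `expand` names are hypothetical and not part of any SPEC or IBM tool.

```python
# Equivalences transcribed from the -O5, -O4, -O3 and -O entries above.
EQUIVALENT_TO = {
    "-O5": ["-O4", "-qipa=level=2", "-qarch=auto", "-qtune=auto"],
    "-O4": ["-O3", "-qipa=level=1", "-qarch=auto", "-qtune=auto"],
    "-O3": ["-O2", "-qhot=level=0"],
    "-O": ["-O2"],
}


def expand(flag, seen=None):
    """Return every flag implied by `flag`, in first-seen order."""
    seen = [] if seen is None else seen
    for implied in EQUIVALENT_TO.get(flag, []):
        if implied not in seen:
            seen.append(implied)
            expand(implied, seen)  # follow nested equivalences
    return seen


print(expand("-O5"))
```

For -O5 the result lists both -qipa=level=1 (reached through -O4) and -qipa=level=2. The sketch only records what is implied and makes no attempt to model which setting the compiler finally applies.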
- -qarch
- -qarch=(\S+)\b
- Produces object code containing instructions that will run on the specified processors. "auto" selects the processor on which the compile is being done. "pwr5x" is the POWER5+ processor.
  
  Supported values for this flag are
  - auto
  - pwr6e
  - pwr6
  - pwr5x
  - pwr5
  - pwr4
  - ppc970
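The second line of each entry appears to be a regular expression that recognizes the flag on a command line. As a rough sketch of how the -qarch pattern behaves, the snippet below applies it with Python's `re` module. The `arch_of` helper and the command line are hypothetical, and the tool that actually consumes these patterns may use a different regex engine.

```python
import re

# Pattern copied from the -qarch entry above.
QARCH = re.compile(r"-qarch=(\S+)\b")


def arch_of(command_line):
    """Return the -qarch value found in a command line, or None."""
    match = QARCH.search(command_line)
    return match.group(1) if match else None


# Hypothetical command line, for illustration only.
print(arch_of("xlc -O3 -qarch=pwr5x -qtune=pwr5x -c example.c"))
```

The capture group picks up the suboption, so the same pattern covers every supported value in the list.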
- -qtune
- -qtune=(\S+)\b
- Specifies the system architecture for which the executable program is optimized. This includes instruction scheduling and cache settings. The supported values for the suboption are:
  - auto
  - pwr6e
  - pwr6
  - pwr5x
  - pwr5
  - pwr4
  - ppc970
- -qinlglue
- -qinlglue\b
- This option inlines glue code that optimizes external function calls when compiling.
- -qhot
- -qhot=level=[012]\b
- Performs high-order transformations on loops during optimization.
- -qipa=level
- -qipa=level=[012]\b
- Enhances optimization by doing detailed analysis across procedures (interprocedural analysis or IPA). The level determines the amount of interprocedural analysis and optimization that is performed.
  
  level=0 does only minimal interprocedural analysis and optimization.
  
  level=1 turns on inlining, limited alias analysis, and limited call-site tailoring.
  
  level=2 turns on full interprocedural data flow and alias analysis.
- -qpdf1
- -qpdf1\b
- The option used in the first pass of a profile-directed feedback (PDF) compile; it causes PDF information to be generated. Profile-directed feedback optimization gathers data on both execution paths and data values. It does not use hardware counters, and it gathers no data other than the path and data values used for PDF-specific optimizations.
- -qpdf2
- -qpdf2\b
- The option used in the second pass of a profile-directed feedback (PDF) compile; it causes the PDF information to be used during optimization.
- -qfdpr
- -qfdpr\b
- The compiler generates additional symbol information for use by the AIX "fdpr" binary optimization tool.

-qxlf90=nosignedzero
-qxlf90=nosignedzero\b

         -qxlf90=suboption
                Determines whether the compiler provides the
                Fortran 90 or the Fortran 95 level of support for
                certain aspects of the language. suboption can be
                one of the following:

                signedzero | nosignedzero
                     Determines how the SIGN(A,B) function handles
                     signed real 0.0. In addition, determines
                     whether negative internal values will be
                     prefixed with a minus when formatted output
                     would produce a negative sign zero.
                autodealloc | noautodealloc
                     Determines whether the compiler deallocates
                     allocatable arrays that are declared locally
                     without either the SAVE or the STATIC
                     attribute and have a status of currently
                     allocated when the subprogram terminates.
                oldpad | nooldpad
                     When the PAD=specifier is present in the
                     INQUIRE statement, specifying -qxlf90=nooldpad
                     returns UNDEFINED when there is no connection,
                     or when the connection is for unformatted I/O.
                     This behavior conforms with the Fortran 95
                     standard and above. Specifying -qxlf90=oldpad
                     preserves the Fortran 90 behavior.

                Default:
                     o signedzero, autodealloc and nooldpad for the
                     xlf95, xlf95_r, xlf95_r7 and f95 invocation
                     commands.
                     o nosignedzero, noautodealloc and oldpad for
                     all other invocation commands.
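To make the signedzero distinction concrete, the sketch below emulates the two behaviours of SIGN(A,B) for a negative zero B. It is written in Python rather than Fortran and rests on an assumption about the intended semantics: with signedzero the sign bit of B is transferred, while with nosignedzero B is only compared against zero. Both function names are hypothetical.

```python
import math


def sign_signedzero(a, b):
    """SIGN(A,B) when the sign bit of a zero B is honoured."""
    return math.copysign(abs(a), b)


def sign_nosignedzero(a, b):
    """SIGN(A,B) when B is only compared with zero, so -0.0 counts as non-negative."""
    return abs(a) if b >= 0 else -abs(a)


print(sign_signedzero(1.0, -0.0), sign_nosignedzero(1.0, -0.0))
```

The two functions agree for every nonzero B and differ only when B is negative zero.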

- -q64
- -q64\b
- Generates 64 bit ABI binaries. The default is to generate 32 bit ABI binaries.
- -bmaxdata:0x80000000
- -bmaxdata:(\S+)\b
- Causes the system loader to put the heap in its own segment of the size specified. This is only required for 32-bit applications, because their segments are 256 MB. If the last digit of the value is "C", then it also turns off the malloc pool option for that executable.
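As a piece of arithmetic only, the value 0x80000000 used in this flag can be related to the 256 MB segment size mentioned above. The `segments_needed` helper below is hypothetical and says nothing about how the loader actually lays out memory.

```python
SEGMENT_BYTES = 256 * 1024 * 1024  # 256 MB segment size from the text above


def segments_needed(maxdata):
    """Number of whole 256 MB segments needed to hold `maxdata` bytes."""
    size = int(maxdata, 16)
    return -(-size // SEGMENT_BYTES)  # ceiling division


print(segments_needed("0x80000000"))
```

0x80000000 bytes is 2 GB, which corresponds to eight 256 MB segments.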
- -bdatapsize:64K
- -bdatapsize:64K\b
- Specifies a page size of 64K for the program data segment.
- -bstackpsize:64K
- -bstackpsize:64K\b
- Specifies a page size of 64K for the program stack segment.
- -btextpsize:64K
- -btextpsize:64K\b
- Specifies a page size of 64K for the program text segment.
- -blpdata
- -blpdata\b
- Sets the bit in the file's XCOFF header indicating that this executable will request the use of large pages when they are available on the system and when the user has the appropriate privilege.
- -qlargepage
- -qlargepage\b
- Indicates that a program, designed to execute in a large page memory environment, can take advantage of large 16 MB pages provided on POWER4 and higher based systems.
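The page sizes named in the entries above (64K and 16 MB) differ greatly in how many pages it takes to map the same amount of memory. The sketch below counts pages for a 256 MB region, a size chosen here purely for illustration; `pages_needed` is a hypothetical helper.

```python
KB = 1024
MB = 1024 * KB


def pages_needed(region_bytes, page_bytes):
    """Number of pages of size `page_bytes` needed to cover `region_bytes`."""
    return -(-region_bytes // page_bytes)  # ceiling division


region = 256 * MB  # illustrative region size
for label, page in (("64K", 64 * KB), ("16 MB", 16 * MB)):
    print(label, pages_needed(region, page))
```

Fewer, larger pages generally mean fewer address translations for the hardware to track, which is the usual motivation for these options.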
- -qalloca
- -qalloca\b
- Indicates that the compiler has built-in support for alloca().
- -qsmallstack=dynlenonheap
- -qsmallstack=dynlenonheap\b
- Causes the Fortran compiler to allocate dynamic arrays on the heap instead of the stack.
- -qenablevmx
- -qenablevmx\b
- Enables the generation of vector instructions for processors that support them.
- -qvecnvol
- -qvecnvol\b
- Specifies whether to use volatile or non-volatile vector registers. Volatile vector registers are registers whose value is not preserved across function calls so the compiler will not depend on values in them across function calls.
- -D__IBM_FAST_VECTOR
- -D__IBM_FAST_VECTOR\b
- The __IBM_FAST_VECTOR macro defines a different iterator for the std::vector template class. This iterator results in faster code, but is not compatible with code using the default iterator for a std::vector template class. All uses of std::vector for a data type must use the same iterator. Add -D__IBM_FAST_VECTOR to the compile line, or "#define __IBM_FAST_VECTOR 1" to your source code to use the faster iterator for std::vector template class. You must compile all sources with this macro.
- -D_ILS_MACROS
- -D_ILS_MACROS\b
- Causes AIX to define "ischar()" (and friends) as macros and not as subroutines.
- -qrtti=all
- -qrtti=all\b
- Causes the C++ compiler to generate Run Time Type Identification code.

-qalias=noansi
-qalias=(noansi|nostd)\b

 -qalias=ansi | noansi
   If ansi is specified, type-based aliasing is
   used during optimization, which restricts the
   lvalues that can be safely used to access a
   data object. The default is ansi for the xlc,
   xlC, and c89 commands. This option has no
   effect unless you also specify the -O option.

 -qalias=std | nostd
   Indicates whether the compilation units contain
   any non-standard aliasing (see Compiler Reference
   for more information). If so, specify nostd.

-qalign
-qalign=natural\b

                Specifies what aggregate alignment rules the
                compiler uses for file compilation, where the
                alignment options are:

                bit_packed
                     The compiler uses the bit_packed alignment
                     rules.
                full
                     The compiler uses the RISC System/6000
                     alignment rules. This is the same as power.
                mac68k
                     The compiler uses the Macintosh alignment
                     rules.  This suboption is valid only for 32-
                     bit compilations.
                natural
                     The compiler maps structure members to their
                     natural boundaries.
                packed
                     The compiler uses the packed alignment rules.
                power
                     The compiler uses the RISC System/6000
                     alignment rules.
                twobyte
                     The compiler uses the Macintosh alignment
                     rules.  This suboption is valid only for 32-
                     bit compilations.  The mac68k option is the
                     same as twobyte.

                The default is -qalign=full.
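To illustrate what mapping members "to their natural boundaries" can mean, the sketch below computes member offsets under two simplified rules. It assumes that natural alignment places each scalar member at a multiple of its own size and that packed layout uses no padding at all; the real XL alignment rules are more detailed, and the function names and the sample aggregate are hypothetical.

```python
def natural_offsets(member_sizes):
    """Offsets when each member is aligned to a multiple of its own size."""
    offsets, position = [], 0
    for size in member_sizes:
        position = -(-position // size) * size  # round up to the member's alignment
        offsets.append(position)
        position += size
    return offsets


def packed_offsets(member_sizes):
    """Offsets when members are laid out back to back with no padding."""
    offsets, position = [], 0
    for size in member_sizes:
        offsets.append(position)
        position += size
    return offsets


# Hypothetical aggregate: a 1-byte char, an 8-byte double, a 4-byte int.
sizes = [1, 8, 4]
print(natural_offsets(sizes))
print(packed_offsets(sizes))
```

Under these assumptions the double moves from offset 1 to offset 8, trading seven bytes of padding for an aligned access.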

-qstrict
-q(no)?strict\b

                Turns off aggressive optimizations which have the
                potential to alter the semantics of your program.
                -qstrict sets -qfloat=nofltint:norsqrt. -qnostrict
                sets -qfloat=rsqrt. This option is only valid with
                -O2 or higher optimization levels.

                Default:
                     o -qnostrict at -O3 or higher.
                     o -qstrict otherwise.
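One general reason an optimization can "alter the semantics" of a program is that floating-point addition is not associative, so regrouping a sum can change its value. The sketch below shows this in plain Python with hypothetical function names. Whether the XL compilers perform this particular regrouping under -qnostrict is an assumption here, not something stated above.

```python
def left_to_right(a, b, c):
    """Evaluate the sum in source order: (a + b) + c."""
    return (a + b) + c


def reassociated(a, b, c):
    """Evaluate the same sum regrouped: a + (b + c)."""
    return a + (b + c)


a, b, c = 1e20, -1e20, 1.0
print(left_to_right(a, b, c), reassociated(a, b, c))
```

In the regrouped form the 1.0 is absorbed by the much larger -1e20 before the cancellation happens, so the two results differ.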
