Note: The GNU Compiler Collection provides a wide array of compiler options, which are described in detail in the documentation at http://gcc.gnu.org/onlinedocs/gcc-4.4.0/gcc/Option-Index.html#Option-Index and http://gcc.gnu.org/onlinedocs/gcc-4.4.0/gfortran/. The descriptions contained in this SPEC flags file are not intended to replace those documents.
The experienced reader of SPEC flags files may notice that this file is somewhat less detailed than usual. The GCC documentation is (a) very detailed; (b) readily available; and (c) subject to specific requirements if it is re-formatted, modified, or re-packaged, per the terms of the GNU Free Documentation License. Out of an abundance of caution, and an abundance of respect, the author of this flags file prefers to provide briefer summaries here, along with links to the more detailed versions.
Enables a range of optimizations that provide faster, though sometimes less precise, mathematical operations.
More details are available.
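As a hedged illustration of why such optimizations can be less precise: floating-point addition is not associative, and options of this kind allow the compiler to regroup operations. The sketch below is not taken from the GCC documentation, and the function names are hypothetical. It assumes IEEE 754 double precision.

```c
/* Add three values grouped as (a + b) + c. */
double sum_left(double a, double b, double c)
{
    volatile double t = a + b;   /* volatile pins the grouping in place */
    return t + c;
}

/* Add the same three values grouped as a + (b + c). */
double sum_right(double a, double b, double c)
{
    volatile double t = b + c;   /* a small c can be lost in this rounding */
    return a + t;
}
```

With the inputs 1e16, -1e16, and 1.0, the first grouping keeps the 1.0 while the second rounds it away, so a compiler that is permitted to regroup the additions may change the result.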
Enables prefetching of arrays used in loops.
More details are available.
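For readers unfamiliar with prefetching, the following is a minimal hand-written sketch of the idea, using GCC's `__builtin_prefetch` builtin. The function name and the lookahead distance are assumptions made for illustration; the compiler chooses its own strategy when it inserts prefetches automatically.

```c
#include <stddef.h>

/* Hypothetical lookahead distance in elements; a real value needs tuning. */
#define PREFETCH_AHEAD 16

/* Sum an array while hinting that elements further ahead will be read soon. */
double sum_with_prefetch(const double *a, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_AHEAD < n) {
            /* Arguments: address, rw (0 = read), temporal locality (0..3). */
            __builtin_prefetch(&a[i + PREFETCH_AHEAD], 0, 3);
        }
        total += a[i];
    }
    return total;
}
```

The prefetch is only a hint: it does not change the computed sum, and whether it helps depends on the memory system and the access pattern.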
Instruments code to collect information for profile-driven feedback.
Information is collected regarding both code paths and data values.
More details are available.
Applies information from a profile run in order to improve optimization.
Several optimizations are improved when profile data is available, including branch probabilities, loop peeling, and loop
unrolling.
More details are available.
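To give a rough sense of what branch probability information means, the sketch below supplies such a hint by hand with GCC's `__builtin_expect` builtin. This is only an analogy, not how profile feedback is implemented: with a profile, the compiler derives comparable probabilities from the training run. The macro and function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: tell the compiler the condition is usually false. */
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

/* Count negative entries, assuming they are rare in typical input. */
long count_negative(const int *a, size_t n)
{
    long count = 0;
    for (size_t i = 0; i < n; i++) {
        if (UNLIKELY(a[i] < 0)) {
            count++;
        }
    }
    return count;
}
```

The hint affects only code layout and scheduling; the function returns the same count whether or not the guess is right.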
Attempts to decompose loops in order to run them on multiple processors.
More details are available.
Tells the optimizer to unroll all loops.
More details are available.
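As an illustration of the transformation, the following sketch shows a loop and a hand-unrolled equivalent. The unroll factor of 4 and the function names are assumptions for the example; the factor the compiler actually chooses is not stated here.

```c
#include <stddef.h>

/* Original form: one addition and one loop test per element. */
long sum_rolled(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += a[i];
    }
    return total;
}

/* Unrolled by 4: fewer loop tests, plus a remainder loop for leftovers. */
long sum_unrolled4(const int *a, size_t n)
{
    long total = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        total += a[i];
        total += a[i + 1];
        total += a[i + 2];
        total += a[i + 3];
    }
    for (; i < n; i++) {   /* handle the final n % 4 elements */
        total += a[i];
    }
    return total;
}
```

Both functions return the same result; the unrolled version trades larger code for fewer branch instructions.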
Compiles for a 64-bit (LP64) data model.
More details are available.
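In the LP64 data model, `int` is 32 bits while `long` and pointers are 64 bits. A small sketch for checking this from C follows; the function name is hypothetical.

```c
#include <limits.h>

/* Return 1 if int is 32 bits and long and pointers are 64 bits (LP64). */
int is_lp64(void)
{
    return sizeof(int) * CHAR_BIT == 32
        && sizeof(long) * CHAR_BIT == 64
        && sizeof(void *) * CHAR_BIT == 64;
}
```

A program built for a 32-bit (ILP32) target would return 0 here, because `long` and pointers are 32 bits in that model.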
Allows use of instructions that require the Intel Core2 architecture.
More details are available.
Tunes code based on the Intel Core2 architecture.
More details are available.
Allows use of instructions that require SSE 4.2 hardware.
More details are available.
Increases optimization levels: the higher the number, the more optimization is done. Higher levels of optimization may
require additional compilation time, in the hopes of reducing execution time. At -O, basic optimizations are performed,
such as constant merging and elimination of dead code. At -O2, additional optimizations are added, such as common
subexpression elimination and strict aliasing. At -O3, even more optimizations are performed, such as function inlining and
vectorization.
More details are available.
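Because strict aliasing is among the optimizations mentioned above, code that reinterprets an object through a pointer of an unrelated type may behave differently at higher levels. The sketch below shows one commonly used safe alternative, copying the bytes with `memcpy`. The function name is hypothetical, and the example assumes a 32-bit IEEE 754 `float`.

```c
#include <stdint.h>
#include <string.h>

/* Return the bit pattern of a float without breaking the aliasing rules. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    /* Well-defined, unlike the cast *(uint32_t *)&f. */
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Compilers typically reduce this `memcpy` to a single register move, so the safe form usually carries no run-time cost.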
Adds the linker flag that requests use of big (2 MB) pages at run time.
Invokes the GNU C++ compiler.
More details are available.
Invokes the GNU C compiler.
More details are available.
Invokes the GNU Fortran compiler.
More details are available.
Allows source code in traditional (fixed-column) Fortran layout.
More details are available.
Enables warnings.
More details are available.
numactl
It is advantageous to bind a process to a particular core. Otherwise, the OS may arbitrarily move the
process from one core to another. To help, SPEC allows the use of a "submit" config file option, in which users can specify a
utility to use to bind processes. Here, that option is used with the Linux 'numactl' command, which runs processes with a specific
NUMA scheduling or memory placement policy. The policy is set for a command and inherited by all of its children.
Large Pages
Large pages were created using this recipe, which is from the hugetlb HOWTO:
    sysctl vm.nr_hugepages=512
    HUGETLB_MORECORE=yes
    export LD_PRELOAD=/usr/lib64/libhugetlbfs.so
Setting the sysctl vm.nr_hugepages specifies how many large pages should be reserved. As described in man libhugetlbfs, the environment variables cause large pages to be allocated for application memory.