Siemens Nixdorf pyrC compiler flags (as of November 1997)
=========================================================
The following is a list of short explanations of compiler/linker flags
used for SPEC CINT95 result submissions for Siemens Nixdorf / Pyramid
RM systems, using the pyrC 6.0 compiler.
This flag description supersedes the earlier description given for version 5.0
of the same compiler (some flags are new, and the new description covers
more flags).
It is likely that future result submissions, if they use new compilers
or new compiler versions, will have different flags; then this flag
description will be superseded by a new one.
---------------------------------------------------------------------
1. Compiler Flags
[Syntax note: For most flags that have a numeric parameter
(e.g., inlining control), this parameter can be separated from
the flag by either a comma "," or a colon ":".]
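As an illustration of the separator rule, the following Python sketch
splits a flag of the -Wx,-option form into the pass prefix, the option
name, and its parameters. The helper name split_flag is hypothetical
and is not part of the compiler; the sketch only assumes that "," and
":" are interchangeable between the option and its parameters.

```python
import re

def split_flag(flag):
    """Split '-Wx,-option<sep>p1<sep>p2' into (prefix, option, params).

    Hypothetical helper: <sep> may be ',' or ':' as in the syntax note.
    """
    prefix, rest = flag.split(",", 1)   # e.g. '-Wo' and '-loopunroll:8'
    parts = re.split(r"[,:]", rest)     # option name, then parameters
    return prefix, parts[0], parts[1:]

# Both spellings of the same setting parse identically.
print(split_flag("-Wo,-loopunroll:8"))
print(split_flag("-Wo,-loopunroll,8"))
```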
-qfeedback
Standard (1-pass) feedback optimization:
Produce code that collects call graph and flow graph
information suitable for feedback directed optimization.
-WM,-profdir
Specifies the directory (given as the flag's parameter) to which
profiling information is written and from which it is read.
Default is ./PROF.
-qfeedback2
Additional (2-pass) feedback optimization:
Produce code that collects information from an executable
optimized in a first pass of feedback optimization
(i.e. one compiled with -qfeedback / -WM,-U, or -WM,-O4,
or -WM,-O5, or -WM,-Omips4).
-WM,-profdir2
Specifies the directory (given as the flag's parameter) to which
profiling information from 2-pass feedback compilation is written
and from which it is read. Default is ./PROF2.
-WM,-use_fb2
Specifies that profiling information from 2-pass feedback
compilation should be used in the generation of the (final)
executable. Must be used together with -WM,-O4, or
-WM,-O5, or -WM,-Omips4.
-WM,-Omips4
Performs all safe and generally applicable optimizations
including interprocedural optimizations, register
allocation across function calls and feedback directed
optimizations (function inlining, procedure positioning,
branch elimination, procedure splitting, register
allocation and cross basic block scheduling). This flag
also directs the compiler to produce nonposition-
independent code, to generate code using the instruction
set of the MIPS4 ISA, to inline alloca, printf, memcpy,
memset, memcmp, and memmove and to use U-code system
libraries. These libraries represent the same system
services as their regular counterparts, but in a form
more suitable for interprocedural optimization.
The flag also includes -Wb,-fast_int_mul (see below).
-WM,-G
Specifies that data items smaller than the given number of bytes
should be placed in the global data area and accessed
using a faster addressing mode. Default is 0.
-WM,-pre_opt
Adds an additional phase of optimization that may find
additional optimization possibilities at the expense of more
compile time.
-WM,-no_positioning
Disables procedure positioning feedback optimization.
-Wb,-br_likely_cntl,,
Controls the branch likely optimization, which sets the
likely bit in a conditional branch. The flag takes two numeric
parameters. If feedback indicates that a conditional branch is
probably taken and the branch cannot be reversed, the branch's
likely bit is set if both of the following criteria are met:
1) the branch is taken at least the percentage of the time given
by the first parameter, and
2) the second parameter equals 0, or the number of times the
branch is taken, relative to the number of times the branch's
function is called, reaches the second parameter.
Both parameters are expressed as percentages; the defaults
are 90 and 0, respectively.
-Wb,-prefetch,,
Inserts prefetch instructions in loops that appear to access
memory in a serial fashion. The flag takes two numeric
parameters: a minimum iteration count (only loops which have at
least that many iterations are considered) and the expected
latency for fetches from memory, in units of machine instruction
cycle times.
Off by default; -WM,-Omips4 turns it on and sets the values to
40 and 400, respectively.
-Wb,-fast_int_mul
Directs the optimizer to use the floating-point unit
to perform 32-bit integer multiplications wherever
doing so would result in correct, faster code. Because this
flag changes the behavior of multiplications that overflow,
programs that depend on the truncation to 32 bits of
two's-complement multiplication (the default behavior) should
not use this flag.
Because the difference to the default behavior appears in
overflow cases only (not in legal C programs), and because rule
2.2.5 of the CPU95 Run Rules exempts numerical accuracy
flags from baseline restrictions anyway, this flag is not
an assertion flag in the sense of the CPU95 Run Rules.
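The overflow case can be made concrete with a small Python sketch. It
models only the default behavior (truncation of the product to 32-bit
two's complement) and shows that a floating-point product agrees with
it whenever the result fits in 32 bits. What the floating-point path
actually yields on overflow is not specified in this description; the
function names wrap32, mul_int32, and mul_via_double are hypothetical.

```python
def wrap32(x):
    """Truncate an integer to 32-bit two's complement."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x

def mul_int32(a, b):
    """Default behavior: the product is truncated to 32 bits."""
    return wrap32(a * b)

def mul_via_double(a, b):
    """Assumed model of a floating-point multiply: no wraparound."""
    return float(a) * float(b)

# In range: both paths agree.
print(mul_int32(3, 4), mul_via_double(3, 4))
# Overflow: the truncated product wraps, the floating-point one does not.
print(mul_int32(65536, 65536), mul_via_double(65536, 65536))
```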
-Wb,-no_self_copy
Asserts that the optimizer may assume that the difference
between any two pointers referencing the same data item is
greater than seven bytes.
-Wc,-xjp_mh_opt,num1,num2
Controls the hot switch optimization which uses conditional
branches instead of indirect jumps at C switch statements.
For a switch label to be considered for this optimization,
the label's relative frequency of execution must be greater
than num1 percent. The parameter num2 limits the maximum number
of conditional branches. -WM,-Omips4 sets the values to 3 and 10,
respectively.
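The selection rule for hot switch labels can be sketched in Python as
follows. The function name hot_labels and the per-label execution
counts are hypothetical; the sketch also assumes one conditional
branch per selected label and that the most frequent labels are
tested first, neither of which is stated in this description.

```python
def hot_labels(counts, num1, num2):
    """Pick switch labels to test with conditional branches.

    counts: mapping of label -> execution count (from feedback).
    num1:   minimum relative frequency in percent (exclusive).
    num2:   maximum number of conditional branches.
    """
    total = sum(counts.values())
    if total == 0:
        return []
    hot = [lab for lab, c in counts.items() if 100.0 * c / total > num1]
    # Assumption: test the most frequently executed labels first.
    hot.sort(key=lambda lab: counts[lab], reverse=True)
    return hot[:num2]

# With the -WM,-Omips4 values (3 and 10), labels A and B qualify.
print(hot_labels({"A": 90, "B": 6, "C": 3, "D": 1}, 3, 10))
```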
-WG,-xxx / -Wg,-xxx / -Wn,-xxx
Flags that have one of these forms control either the
"inliner" pass of the compiler (-Wg,-xxx), or the "cloner"
pass of the compiler (-Wn,-xxx), or both (-WG,-xxx).
A setting with a more specific value (lower case letter g or n)
overrides the more general setting (uppercase letter G).
Although the following description uses the "-WG,-xxx" form,
it holds for the other forms also.
Some flags exist for the "cloner" only (the pass that
optimizes for specific call locations of subroutines); they
provide finer control over the cloning process. They can be
written in the form -WG,-xxx or -Wn,-xxx; the following
description uses the form -Wn,-xxx.
-WG,-inline_limit:
Sets a size threshold for inlining/cloning. A call will not be
inlined/cloned if the resulting function (after inlining/cloning)
exceeds the given number of basic blocks. Default is 500.
-WG,-space_time:n
Tells the "umerge" phase to consider only those functions for
inlining/cloning whose estimated ratio of code expansion to time
savings is less than n. Default is 3.0.
-WG,-boc:n
Tells the "umerge" phase to consider only those functions for
inlining/cloning whose estimated ratio of run-time cycle savings
to I-cache cost of doing inlining/cloning is greater than or
equal to n. Default is 1.0.
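The two ratio tests used by "umerge" (-WG,-space_time and -WG,-boc)
can be written as simple predicates. This is a sketch only: the
function and argument names are hypothetical, the estimates themselves
are computed inside the compiler, and the denominators are assumed to
be positive.

```python
def passes_space_time(code_expansion, time_savings, n=3.0):
    """-WG,-space_time: expansion/savings ratio must be below n."""
    return code_expansion / time_savings < n

def passes_boc(cycles_saved, icache_cost, n=1.0):
    """-WG,-boc: savings/I-cache-cost ratio must be at least n."""
    return cycles_saved / icache_cost >= n

# A candidate with ratio 2.0 passes the default space_time threshold,
# and a candidate with ratio exactly 1.0 passes the default boc one.
print(passes_space_time(2.0, 1.0), passes_boc(1.0, 1.0))
```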
-Wg,-dont_prune_zero_edges
Directs the inliner to inline function calls with zero
execution count in the feedback information. The default is
to not inline these edges.
-Wn,-clone_expansion:
Directs the cloner to limit the maximum relative growth of the
program to the given factor. The default is 1.3.
-Wn,-recursion_depth:
Sets the maximum number of function calls through which the
cloner will search to identify recursive functions. For
example, -Wn,-recursion_depth:1 means that functions that call
themselves are considered recursive functions.
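One way to picture the search depth is a bounded walk over the call
graph, sketched below in Python. The function name is_recursive and
the dictionary representation of the call graph are hypothetical; the
cloner's actual algorithm is not described here.

```python
def is_recursive(call_graph, func, max_depth):
    """Return True if func can reach itself within max_depth calls.

    call_graph: mapping of function name -> list of callees.
    """
    frontier = {func}
    for _ in range(max_depth):
        # Follow one more level of calls from everything reached so far.
        frontier = {callee for f in frontier
                    for callee in call_graph.get(f, ())}
        if func in frontier:
            return True
    return False

graph = {"a": ["a"], "b": ["c"], "c": ["b"], "d": ["e"], "e": []}
# Depth 1 finds only direct recursion; depth 2 also finds b -> c -> b.
print(is_recursive(graph, "a", 1), is_recursive(graph, "b", 1),
      is_recursive(graph, "b", 2))
```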
-Wn,-only_clone_recursion
Directs the cloner to only clone recursive functions.
-Wn,-recursion_limit:
Directs the cloner to limit the maximum number of basic blocks
in a recursive function to the given number. If
-Wn,-recursion_limit isn't given, then the limit is set by the
-WG,-inline_limit flag.
If neither of these flags is given, the default is 500.
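The fallback order for the recursion limit can be summarized in a few
lines of Python. The function name effective_recursion_limit is
hypothetical; None stands for "flag not given".

```python
def effective_recursion_limit(recursion_limit=None, inline_limit=None):
    """Resolve the basic-block limit for cloning recursive functions."""
    if recursion_limit is not None:   # -Wn,-recursion_limit given
        return recursion_limit
    if inline_limit is not None:      # fall back to -WG,-inline_limit
        return inline_limit
    return 500                        # neither flag given

print(effective_recursion_limit())
print(effective_recursion_limit(inline_limit=800))
print(effective_recursion_limit(recursion_limit=200, inline_limit=800))
```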
-Wo,-loopunroll:
Tells the optimizer to unroll loops the given number of times.
Default is 4; -WM,-Omips4 sets it to 8.
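For readers unfamiliar with the transformation, the following Python
sketch shows by hand what unrolling a loop four times means. It is
only an illustration of the idea: the compiler performs the
transformation on its own intermediate code, and the function names
are hypothetical.

```python
def sum_rolled(values):
    """Original loop: one element per iteration."""
    total = 0
    for v in values:
        total += v
    return total

def sum_unrolled4(values):
    """Same loop unrolled four times, plus a remainder loop."""
    total = 0
    n = len(values)
    i = 0
    while i + 4 <= n:          # four copies of the loop body
        total += values[i]
        total += values[i + 1]
        total += values[i + 2]
        total += values[i + 3]
        i += 4
    while i < n:               # leftover iterations
        total += values[i]
        i += 1
    return total

data = list(range(10))
print(sum_rolled(data) == sum_unrolled4(data))
```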
-Wo,-unrolllimit:
The parameter is the limit on the number of instructions within
a loop unrolled by the optimizer.
Default is 500; -WM,-Omips4 sets it to 2000.
-Wo,-no_const_in_reg
Tells the optimizer not to put constants in registers.
-Wo,splitedges,num
Controls the edge splitting algorithm in "uopt" which inserts
an empty basic block on infrequently executed control flow
edges to increase optimization opportunities. This
optimization uses feedback information to limit the number
of split edges and avoid excessive compilation time.
"uopt" will split an edge if its execution frequency
multiplied by num is less than the smaller of the execution
frequencies of the edge's head and tail basic blocks.
Setting num to zero disables edge splitting.
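The splitting criterion stated above can be expressed as a small
Python predicate. The function and argument names are hypothetical;
the frequencies would come from feedback information.

```python
def should_split(edge_freq, head_freq, tail_freq, num):
    """Decide whether "uopt" would split a control flow edge.

    The edge is split if its frequency times num is less than the
    smaller of the head and tail block frequencies; num == 0 disables
    edge splitting.
    """
    if num == 0:
        return False
    return edge_freq * num < min(head_freq, tail_freq)

# A rarely executed edge between two hot blocks is split.
print(should_split(1, 100, 50, 10))
# A frequently executed edge is left alone.
print(should_split(10, 100, 50, 10))
```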
-Wo,-recursive_calls
Directs uopt to use different heuristics that result in better
performance if there are recursive function calls in the
source code. Only effective in -WM,-O3 and -WM,-O5 modes.
-KOlimit:num
Changes the threshold size for optimizing very large
programs. The argument specifies the maximum size in
basic blocks of a function that will be optimized by the
global optimizer. The default value of the argument is
1000. The optimization phase of the compiler warns you if
this flag is needed to optimize a particular program. There
can be no space around the colon (:).
-WM,-Omips4 sets num to 3000.
2. Linker Flags
-dn This option is passed to ld. It specifies static linking
in the link editor.
3. Portability Flags
-DI_TIME
-DI_SYS_TIME
Enables certain (SPEC-approved) source code parts via conditional
compilation.
Questions?
More details can be found in the compiler documentation. SPEC-specific
questions should be sent to the SPEC OSG representative
Reinhold Weicker, weicker.pad@sni.de