(To check for possible updates to this document, please see http://www.spec.org/cpu2000/docs/ )
Overview
Click one of the following in order to go to the detailed contents about that item:
I. | Introduction |
II. | Config file options for runspec |
III. | Config file options for specmake |
IV. | Config file options for the shell |
V. | Config file options for the reader |
VI. | Files output during a build |
VII. | Troubleshooting |
Note: links to SPEC CPU2000 documents on this web page
assume that you are reading the page from a directory that
also contains the other SPEC CPU2000 documents. If by
some chance you are reading this web page from a location
where the links do not work, try accessing the referenced
documents at one of the following locations:
A key decision that must be made by designers of a benchmark suite is whether to allow the benchmark source code to be changed when the suite is used.
If source code changes are allowed:
+ | The benchmark can be adapted to the system under test. |
+ | Portability may be easier. |
- | But it may be hard to compare results between systems, unless some formal audit is done to ensure that comparable work is done. |
If source code changes are not allowed:
+ | Results may be easier to compare. |
- | It may take more time and effort to develop the benchmark, because portability will have to be built in ahead of time. |
- | Portability may be hard to achieve, at least for real applications. (Simple loops of 15 lines can port with little effort, and such benchmarks have their uses. But real applications are more complex.) |
SPEC has chosen not to allow source code changes for the CPU2000 suite, except under very limited circumstances described below. By restricting source code changes, we separate the activity of porting benchmarks (which has a goal of being performance neutral), from the activity of using the benchmarks (where the goal is not neutrality, it's Get The Best Score You Can.) Prior to the first use of CPU2000, SPEC therefore invested substantial effort to port the suite to as many platforms as practical. Testing was done on 7 different hardware architectures, 11 versions of Unix (including 4 versions of Linux) and 2 versions of NT.
Are source code changes ever allowed? Normally, no. But if you discover a reason why you believe such a change is required, SPEC wants to hear about it, and will consider such requests for a future revision of the suite. SPEC will normally not approve publication of CPU2000 results using modified source code, unless such modifications are unavoidable for the target environment, are submitted to SPEC, are made available to all users of the suite, and are formally approved by a vote.
So, if source code changes are not allowed, but the benchmarks must be compiled in a wide variety of environments, can the users at least write their own Makefiles, and choose -D options to select different environments? The answers to these two questions are "no" and "yes", respectively:
The config file contains places where you can specify the characteristics of both your compile time and run time environments. It allows the advanced user to perform detailed manipulation of Makefile options, but retains all the changes in one place so that they can be examined and reproduced.
The config file is one of the key ingredients in making results reproducible. For example, if a customer would like to run the CPU2000 suite on her own SuperHero Model 4 and discover how close results are in her environment to the environment used when the vendor published a CPU2000 result, she should be able to do that using only 3 ingredients:
For example, if the header section of the config file michael.cfg contains the lines:
output_format = asc,ps
tune = base
reportable = 1
runlist = fp
then the defaults for the runspec command would change as specified. A user who types either of the following two commands would get precisely the same effect:
runspec --config=michael
runspec --config=michael --output=asc,ps --tune=base --reportable fp
Config file lines such as:
CC = cc
CPORTABILITY = -DSPEC_CPU2000_LP64
OPTIMIZE = -O4
are written to the Makefile set that is ultimately used to build the benchmark, and these lines are interpreted by specmake.
Other config file lines are carried out by the shell. Consider this example:
fdo_pre0 = mkdir /tmp/ahmad; rm -f /tmp/ahmad/*
Runspec passes the above command to the shell prior to running a training run for feedback directed optimization, and the shell actually carries out the requested commands. It is therefore important to peruse a config file you are given before using it.
Config file lines such as the following provide information that is reported to the reader of your results:
test_date = Nov-1999
hw_avail = Apr-1999
sw_avail = May-1999
notes015 = Note: Feedback directed optimization was not used
In addition, the config file itself is available to the reader at http://www.spec.org/. The config file is presented in its entirety, with one exception described below under Comments and whitespace. The config file is made available because it is so important to reproducing results, as described in the Introduction. The config file is saved on every run, as a compressed portion of the rawfile.
Comments begin with "#". The exception mentioned above is a comment that begins with "#>": such a comment is not retained in the copy of the config file that is saved with your results, so it can be used for remarks you do not wish to publish. For example:
#> I didn't use the C++ beta test because of Bob's big back-end bug.
Blank lines can be placed anywhere in a config file.
Spaces within a line are almost always ignored. Of course, you wouldn't want to spell OPTIMIZE as OPT I MIZE, but you are perfectly welcome to do either of the following:
OPTIMIZE=-O2
OPTIMIZE = -O2
The one place where spaces are considered significant is in notes, where the tools assume you are trying to line up your comments in the full disclosure reports.
Most options that address runspec itself must be set in the header section. For example, if you want to set reportable=1, you must do so before any section markers occur.
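For example, here is a minimal sketch of that ordering (the particular option values are arbitrary):
# header section: options that control runspec go here
reportable = 1
output_format = asc
# the first section marker ends the header section
default=default=default=default:
OPTIMIZE = -O2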
The next several pages are devoted to sections that you name; the MD5 section is explained after that.
benchmark=tuning=extension=machine:
These are referred to below as the 4 "section specifiers". The allowed values for the section specifiers are:
benchmark: | default, int, fp, or any individual benchmark, such as 252.eon |
tuning: | default, base, or peak |
extension: | default, or any arbitrary string, such as "cloyce-genius" |
machine: | default (As of the date of this documentation, this feature is only partly implemented. Leave it at default unless you feel particularly courageous.) |
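Combining these specifiers, you might write section markers such as the following (a sketch only; the extension name "myext" is made up):
int=peak=default=default:
COPTIMIZE = -O3
252.eon=base=myext=default:
CXXOPTIMIZE = -O2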
Trailing default sections may be omitted. Thus all of the following are equivalent:
252.eon=base=default=default:
252.eon=base=default:
252.eon=base:
Section markers can be entered in any order. Section markers can be repeated; material from identical section markers will automatically be consolidated. That is, you are welcome to start one section, start a different one, then go back and add more material to the first section. But please note that since there is no marker for the header section, you cannot go back to it.
% cat tmp.cfg
runlist = swim
size = test
iterations = 1
tune = base
output_format = asc
teeout = 1
default=default=default=default:
OPTIMIZE = -O2
% runspec --config=tmp | grep swim.f
f90 -c -o swim.o -O2 swim.f
%
The config file above is designed for quick, simple testing: it runs only one benchmark, namely 171.swim, using the smallest (test) workload, runs it only once, uses only base tuning, outputs only the text-format (ASCII) report, and displays the build commands to the screen. To use it, we issue a runspec command, and pipe the output to grep to search for the actual generated compile command. (Alternatively, on NT, we could use findstr on the generated log file.) And, indeed, the tuning applied was the expected -O2.
Now consider this example, where a section marker is added that has the first specifier set to fp, for the floating point suite:
% cat tmp.cfg
runlist = swim
size = test
iterations = 1
tune = base
output_format = asc
teeout = 1
default=default=default=default:
OPTIMIZE = -O2
fp=default=default=default:
OPTIMIZE = -O3
% runspec --config=tmp | grep swim.f
f90 -c -o swim.o -O3 swim.f
%
The second OPTIMIZE command is used above because the reference to the floating point suite is considered to be more specific than the overall default.
Furthermore, we can add a specifier that mentions swim by name:
% cat tmp.cfg
runlist = swim
size = test
iterations = 1
tune = base
output_format = asc
teeout = 1
default=default=default=default:
OPTIMIZE = -O2
fp=default=default=default:
OPTIMIZE = -O3
171.swim=default=default=default:
OPTIMIZE = -O4
% runspec --config=tmp | grep swim.f
f90 -c -o swim.o -O4 swim.f
%
The third OPTIMIZE command wins above, because it is included in the section that is considered to be the most specific. But what if we had said these in a different order?
% cat tmp.cfg
runlist = swim
size = test
iterations = 1
tune = base
output_format = asc
teeout = 1
171.swim=default=default=default:
OPTIMIZE = -O4
default=default=default=default:
OPTIMIZE = -O2
fp=default=default=default:
OPTIMIZE = -O3
% runspec --config=tmp | grep swim.f
f90 -c -o swim.o -O4 swim.f
%
Notice above that the order of entry is not significant; what matters is the order of precedence, from least specific to most specific.
Note: when a specifier is listed more than once at the same descriptive level, the last instance of the specifier is used. Consider this case:
171.swim=default=default=default:
OPTIMIZE = -O4
171.swim=default=default=default:
OPTIMIZE = -O3
The ending value of OPTIMIZE for 171.swim is -O3, not -O4.
% cat tmp.cfg
runlist = swim
size = test
iterations = 1
tune = base,peak
output_format = asc
teeout = 1
default=default=default=default:
FC = f90
F77 = f77
default=base=default=default:
FC = kf90
default=peak=default=default:
F77 = kf77
% runspec --config=tmp | grep swim.f | grep swim.o
kf90 -c -o swim.o swim.f
kf77 -c -o swim.o swim.f
%
In the above example, we compile swim twice: once for base tuning, and once for peak. Notice that in both cases the compilers defined by the more specific section markers have been used, namely kf90 and kf77, rather than the f90 and f77 from default=default=default=default.
(The alert reader may ask here "Why does FC apply to base and F77 apply to peak?" The reason for this is that in base, the CPU2000 Run Rules allow only one Fortran compiler, which must be used for both the benchmarks that use the older Fortran-77 features, and for the benchmarks that use the newer Fortran-90 features. But for peak tuning your definition of FC is used for source files *.f90, and your definition of F77 is used for source files *.f.)
% cat tmp.cfg
runlist = swim
size = test
iterations = 1
tune = base
output_format = asc
teeout = 1
default=default=default=default:
LIBS = -lm
default=default=ev6=default:
LIBS = -lm_ev6
% runspec --config=tmp | grep swim.o
f90 -c -o swim.o swim.f
f90 swim.o -lm -o swim
% runspec --config=tmp --extension=ev6 | grep swim.o
f90 -c -o swim.o swim.f
f90 swim.o -lm_ev6 -o swim
%
% cd $SPEC/benchspec/CFP2000/171.swim/exe
% ls -lt | head -3
total 3400
-rwxr-xr-x 1 john users 65536 Dec 10 02:47 swim_base.ev6
-rwxr-xr-x 1 john users 49152 Dec 10 02:47 swim_base.none
%
Notice above that two different versions of swim were built from the same config file, and both executables are present in the exe directory for swim.
% cat tmp.cfg
runlist = swim
size = test
iterations = 1
tune = peak
output_format = asc
teeout = 1
default=default=default=default:
OPTIMIZE = -O2
F77 = f77
LIBS = -lm
171.swim=default=default=default:
OPTIMIZE = -O4
default=peak=default=default:
F77 = kf77
default=default=ev6=default:
LIBS = -lm_ev6
% runspec --config=tmp --extension=ev6 | grep swim.o
kf77 -c -o swim.o -O4 swim.f
kf77 -O4 swim.o -lm_ev6 -o swim
%
Notice above that all three sections applied: the section specifier for 171.swim, the specifier for peak tuning, and the specifier for extension ev6.
The precedence order, from highest to lowest, is:
highest:  benchmark
          suite
          tuning
lowest:   extension
And this order can be demonstrated as follows:
% cat tmp.cfg
size = test
iterations = 1
output_format = asc
teeout = 1
default=default=default=default:
OPTIMIZE = -O0
fp=default=default=default:
OPTIMIZE = -O1
171.swim=default=default=default:
OPTIMIZE = -O3
default=peak=default=default:
OPTIMIZE = -O4
default=default=ev6=default:
OPTIMIZE = -O5
% runspec --conf=tmp swim | grep swim.f
[1] f90 -c -o swim.o -O3 swim.f
% runspec --conf=tmp --tune=peak swim | grep swim.f
[2] f77 -c -o swim.o -O3 swim.f
% runspec --conf=tmp --extension=ev6 swim | grep swim.f
[3] f90 -c -o swim.o -O3 swim.f
%
% runspec --conf=tmp --tune=base mgrid | grep mgrid.f
[4] f90 -c -o mgrid.o -O1 mgrid.f
% runspec --conf=tmp --tune=peak mgrid | grep mgrid.f
[5] f77 -c -o mgrid.o -O1 mgrid.f
% runspec --conf=tmp --extension=ev6 mgrid | grep mgrid.f
[6] f90 -c -o mgrid.o -O1 mgrid.f
%
% runspec --conf=tmp --tune=base gzip | grep gzip.c
[7] cc -c -o gzip.o -O0 gzip.c
% runspec --conf=tmp --tune=peak gzip | grep gzip.c
[8] cc -c -o gzip.o -O4 gzip.c
% runspec --conf=tmp --extension=ev6 gzip | grep gzip.c
[9] cc -c -o gzip.o -O5 gzip.c
% runspec --conf=tmp --tune=peak --extension=ev6 gzip | grep gzip.c
[10] cc -c -o gzip.o -O4 gzip.c
Notice above that the named benchmark always wins: lines [1], [2], and [3]. If there is no section specifier that names a benchmark, but there is a section specifier that names a suite, then the suite wins: lines [4], [5], and [6]. If there are no applicable benchmark or suite specifiers, then tuning or extension can be applied: lines [8] and [9]. But if both tuning and extension are applied, tuning wins [10].
__MD5__
168.wupwise=peak=nov14a=default:
# Last updated Sun Nov 14 04:49:15 1999
optmd5=6cfebb629cf395d958755f68a6c9e05b
exemd5=3d51c75d1c97918bf212c4fb47a30003
183.equake=peak=nov14a=default:
# Last updated Sun Nov 14 04:49:22 1999
optmd5=f3f0d2022f2be4a9530c000b1e883008
exemd5=923a7bfd6f68ccf0d6e49261fa1c2030
The "MD5" is a checksum that ensures that the binaries referenced in the config file are in fact built using the options described therein. For example, if you edit the config file to change the optimization level for 255.vortex, the next time the file is used the tools will notice the change and will recompile vortex.
You can optionally disable this behavior, but doing so is strongly discouraged. See the sarcastic remarks in the description of check_md5, below.
If you would like to see which portions of your config file are used in computing the MD5 hash, run runspec with --debug=30 or higher, and examine the log file.
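For example (the config file name and the choice of benchmark are just placeholders):
% runspec --config=tmp --debug=30 --size=test --iterations=1 swim
Then examine the log file from that run for the list of options included in the hash.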
% cat tmp.cfg
size = test
iterations = 1
output_format = asc
teeout = 1
expand_notes = 1
runlist = swim
tune = base
hw_avail = May-2000
notes001 = This run is from log.$lognum with hw_avail $hw_avail
% runspec -c tmp | grep asc
Identifying output formats...asc...config...html...pdf...ps...raw...
format: ASCII -> /greg/result/CFP2000.101.asc
% grep with /greg/result/CFP2000.101.asc
This run is from log.101 with hw_avail May-2000
As another example, let's say you want to go for minimal performance, using the 'nice' command. You can say:
submit=nice 20 '$command'
and the $command gets expanded to whatever would normally be executed, but with 'nice 20' stuck in front of it.
If you'd like a complete list of the variables that you can use in your commands (relative to the config file you're using), set runspec's verbosity to 80 or higher (-v 80) and either do a run that causes a command substitution to happen, or run with expand_notes set to 1.
To put text immediately after a variable, you need to make it possible for the parser to see the variable that you want, by using braces:
% tail -2 tmp.cfg
notes001 =You have done ${lognum}x runs tonight, aren't you tired yet?
% runspec -c tmp | grep asc
Identifying output formats...asc...config...html...pdf...ps...raw...
format: ASCII -> /john/result/CFP2000.103.asc
% grep done /john/result/CFP2000.103.asc
You have done 103x runs tonight, aren't you tired yet?
fdo_pre0 = mkdir /tmp/pb; rm -f /tmp/pb/${baseexe}*
NOTICE in this example that although the commands are carried out by the shell, the variable substitution is done by runspec.
submit= echo "$command" > dobmk; prun -n 1 sh dobmk
command_add_redirect=1
In this example, the command that actually runs the benchmark is written to a small file, "dobmk", which is then submitted to a remote node selected by "prun". (The parallel run command, prun, can execute multiple copies of a process, but in this case we have requested just one copy by saying "-n 1". The SPEC tools will create as many copies as required.)
The "command_add_redirect" is crucial. What happens without it?
% head -1 $SPEC/config/tmp1.cfg
submit= echo "$command" > dobmk; prun -n 1 sh dobmk
% runspec --config=tmp1 --size=test -n 1 --rate --users=1 swim
...
% cat $SPEC/benchspec/CFP2000/171.swim/run/00000006/dobmk
../00000006/swim_base.none
Now let's use command_add_redirect, and see how dobmk changes:
% diff $SPEC/config/tmp1.cfg $SPEC/config/tmp2.cfg
1a2
> command_add_redirect=1
% runspec --config=tmp2 --size=test -n 1 --rate --users=1 swim
...
% cat $SPEC/benchspec/CFP2000/171.swim/run/00000006/dobmk
../00000006/swim_base.none < swim.in > swim.out 2> swim.err
Notice that with command_add_redirect=1, the substitution for $command includes both the name of the executable and the file assignments for standard in, standard out, and standard error. This is needed because otherwise the files would not be connected to swim on the remote node. That is, the former generates [*]:
echo "swim_base.none" > dobmk; prun -n 1 sh dobmk < swim.in > swim.out 2> swim.err
And the latter generates [*]:
echo "swim_base.none < swim.in > swim.out 2> swim.err" > dobmk; prun -n 1 sh dobmk
[*] | The picky reader may wish to know that the examples were edited for readability: a line wrap was added, and the directory string ../00000006/ was omitted. The advanced reader may wonder how the above lines were discovered: for the former, we found out what runspec would generate by going to the run directory and typing specinvoke -n, where -n means dry run; in the latter, we typed specinvoke -nr, where -r means that $command already has device redirection. For more information on specinvoke, see utility.html. |
% cat tmp.cfg
size = test
iterations = 1
output_format = asc
teeout = 1
expand_notes = 1
runlist = swim
tune = base,peak
use_submit_for_speed = 1
default=peak=default=default:
submit = echo "home=$HOME; spec=$SPEC;" >/tmp/chan; $command
default=base=default=default:
submit = echo "home=\$HOME; spec=\$SPEC;" > /tmp/nui; $command
% runspec --config=tmp > /dev/null
% cat /tmp/chan
home=; spec=;
% cat /tmp/nui
home=/usr/users/chris; spec=/cpu2000;
%
Please note that, by default, 'submit' is only applied to rate runs. In this example, we used it for a speed run as well, by setting
use_submit_for_speed = 1
in the configuration file.
submit= let "MYNUM=$SPECUSERNUM" ; let "NODE=\$MYNUM/2"; export NODE=/hw/nodenum/\$NODE; let "CPU=2*((\$MYNUM+1)/2)-\$MYNUM "; export CPU; /usr/sbin/dplace -place \$SPEC/submit.pf -mustrun $command
Runspec substitutes the current user number for $SPECUSERNUM, and then passes the above command to the shell, which does the substitutions for CPU and NODE. This example originated on an SGI Origin 2000 system, where there are two CPUs per node. Suppose that runspec is about to submit the copy for user number 17. In that case:
submit= let "MYNUM=$SPECUSERNUM" ;      # i.e. 17
let "NODE=\$MYNUM/2";                   # So, NODE=8
export NODE=/hw/nodenum/\$NODE;         # Now NODE=/hw/nodenum/8
let "CPU=2*((\$MYNUM+1)/2)-\$MYNUM ";   # and CPU=1
export CPU; /usr/sbin/dplace            # So we execute dplace
-place \$SPEC/submit.pf                 # using $SPEC/submit.pf
-mustrun $command                       # and the expected command
The desired $command is run under the control of submit.pf, with $NODE=/hw/nodenum/8 and $CPU=1. Here are the contents of the placement file, submit.pf:
memories 1 in topology physical near $NODE
threads 1
run thread 0 on memory 0 using cpu $CPU
Two important notes about the above submit command:
$SPEC/docs/example-advanced.cfg
%SPEC%\docs.nt\example-advanced.cfg
Search that file for LIBS, and note the long comment which provides a walk-through of a complex substitution handled by specmake.
wrong:
MYDIR = /usr/paul/compilers
F90 = $(MYDIR)/f90
notes001 = compiler: $(F90)
$ cat tmp2.cfg
expand_notes=1
foo = <<EOT
This +
is a +
test +
EOT
bar = \
and +\
so +\
is +\
this+
notes01 = $foo
notes02 = $bar
$ runspec --config=tmp2 --size=test --iterat=1 swim | grep asc
Identifying output formats...asc...config...html...pdf...ps...raw...
format: ASCII -> /cpu2000/result/CFP2000.024.asc
$ grep + ../result/*24.asc
This + is a + test +
and + so + is + this+
Note: although continued lines are supported, they are rarely used. The more common method of continuation is by appending a number to a field, as described in the section "Field scoping and continuation".
% cat tmp.cfg
sw_compiler = GoFast C/C++ V4.2
sw_avail = Mar-2000
include: SUT.inc
default=default=default=default:
OPTIMIZE = -O4
% cat SUT.inc
hw_model = SuperHero IV
hw_avail = Feb-2000
% runspec --config=tmp --iterations=1 --size=test swim
After the above command is completed, we end up with a report that mentions both hardware and software dates:
...
Hardware availability: Feb-2000
...
Software availability: Mar-2000
All of these should be specified in your config file header section, unless the description states that the item can be applied to individual benchmarks.
option | default | Meaning |
action | validate | What to do. |
deletework | 0 | If set to 1, always delete existing benchmark working directories. An extra-careful person might want to set this to ensure no unwanted leftovers from previous benchmark runs, but the tools are already trying to enforce that property. |
ext | none | Extension for executables created |
ignore_errors | 0 | Ignore certain errors. Very useful when debugging a new compiler and new set of options. |
iterations | 3 | Number of iterations to run |
mach | default | Default machine id |
make_no_clobber | 0 | Don't delete directory when building executables. This option should only be used for troubleshooting a problematic compile. It is against the run rules to use it when building binaries for an actual submission. |
max_active_compares | # users | Max number of parallel compares. Useful when doing large SPECrate runs on a system with lots of CPUs. |
output_format | all | Format for reports. Valid options are asc (ASCII text), html, pdf, and ps. You might prefer to set this to asc if you're going to be doing lots of runs, and only create the pretty reports at the end of the series. See also runspec.html on --output_format and --rawformat. |
reportable | 0 | Strictly follow reporting rules. You must set reportable, in order to generate a valid run suitable for submission to SPEC. |
rate | 0 | Rate vs Speed (1=rate, 0=speed) |
rebuild | 0 | Rebuild binaries even if they exist |
runlist | none | What benchmarks to run |
setprocgroup | 1 | Set the process group. On Unix-like systems, improves the chances that ^C gets the whole run, not just one of the children |
size | ref | Size of input set. If you are in the early stages of testing a new compiler or new set of options, you might set this to test or train. |
table | 1 | In ASCII reports, include information about each execution of the benchmark. |
tune | base | default tuning level. In a reportable run, must be either all or base. |
unbuffer | 0 | Unbuffer STDOUT. For CPU95, there were occasional complaints about redundant output when a child process would flush its buffer. If similar problems occur for CPU2000, try setting this to 1. |
users | 1 | Number of users. This option can also be used for specific benchmarks - e.g. you could decide to run 64 copies of all benchmarks except gcc, which would run only 63. For base, the number of copies must be the same for all benchmarks, but for peak it is allowed to vary. |
verbose | 5 | Verbosity level. Select level 1 through 99 to control how much debugging info runspec prints out. For more information, see the section on the log file, below. |
All of these should be specified in your config file header section, unless the description states that the item can be applied to individual benchmarks.
option | default | Meaning |
backup_config | 1 | When updating the MD5 hashes in the config file, make a backup copy first. Highly recommended to defend against full-file-system errors, system crashes, or other unfortunate events. |
basepeak | 0 | Use base binary and/or base result for peak. If applied to the whole suite (in the header section), then only base is run, and its results are reported for both the base and peak metrics. If applied to a single benchmark (see the sketch after this table), the same binary will be used for both base and peak runs, and the lower of the two medians will be reported for both. |
check_md5 | 1 | Runspec uses MD5 hashes to verify that executables match the config file that invokes them, and if they do not, runspec forces a recompile. You can turn that feature off by setting check_md5=0. WARNING: If you turn this feature off, you effectively say that you are willing to run a benchmark even if you don't know what you did or how you did it -- that is, you lack information as to how it was built! Since SPEC requires that you disclose how you built it, such a run would not be submittable to SPEC. |
command_add_redirect | 0 | If set, let the shell do I/O redirection. Otherwise, specinvoke will do it itself. |
difflines | 10 | Number of lines of differences to print when comparing results. |
env_vars | 0 | Allow environment to be overridden by ENV_* For example: env_vars=1 164.gzip=default=default=default: ENV_FOO=367 will cause the tools to spawn gzip (or anything related, like monitor_prebench) with FOO=367 |
expand_notes | 0 | If set, will expand variables in notes. This capability is limited because notes are NOT processed by specmake, so you cannot do repeated substitutions. |
feedback | 1 | Normally, feedback is controlled by the presence of one or more PASSn options (see the documentation of make variables). An additional control is provided by this config file option, which can be used to selectively turn feedback off for individual benchmarks (see the sketch after this table). Note: in a base run, all benchmarks in a given language must use the same settings for feedback. |
ignore_sigint | 0 | Ignore SIGINT. If this is set, runspec will attempt to ignore you when you press ^C. (It is not clear why one might want to set this feature.) |
line_width | 0 | line wrap width for screen |
locking | 1 | Try to lock files |
log_line_width | 0 | line wrap width for logfiles. If your editor complains about lines being too long when you look at logfiles, try setting this to some reasonable value, such as 80 or 132. |
make | specmake | Name of make executable. Note that the run rules require use of specmake for reportable results. |
makeflags | '' | Extra flags for make (such as -j). Use only if you are familiar with gnu make. |
mean_anyway | 0 | Calculate mean even if invalid. DANGER this will write a mean to all reports even if no valid mean can be computed (e.g. half the benchmarks failed!) DANGER |
minimize_rundirs | 0 | Try to keep working disk size down. Cannot be used in a reportable run. |
minimize_builddirs | 0 | Try to keep working disk size down during builds |
srcalt | '' | Name of subdirectory under |
teeout | 0 | Run output [from build] through tee so you can see it on the screen |
use_submit_for_speed | 0 | If set, use submit commands for speed runs as well as rate runs. Handy for running the benchmarks on a simulator, etc. |
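As an illustration, basepeak and feedback (which the table above notes can be applied to individual benchmarks) might be set like this (a sketch only; the choice of benchmarks is arbitrary):
# header section
tune = base,peak
# per-benchmark settings
176.gcc=default=default=default:
basepeak = 1
186.crafty=peak=default=default:
feedback = 0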
Here are the commonly used variables:
CC | How to invoke your C Compiler |
CXX | How to invoke your C++ Compiler |
FC | How to invoke Fortran (the one that handles both .f and .f90) |
F77 | How to invoke your Fortran-77 compiler (for .f files only) |
CLD | How to invoke the Linker when compiling C programs |
CXXLD | How to invoke the Linker when compiling C++ programs |
FLD | How to invoke the Linker when compiling Fortran programs |
F77LD | How to invoke the Linker when compiling Fortran-77 programs |
ONESTEP | If set, build from sources directly to final binary. See the discussion in rule 2.2.6.13 of runrules.html |
OPTIMIZE | Optimization flags to be applied for all compilers |
COPTIMIZE | Optimization flags to be applied when using your C compiler |
CXXOPTIMIZE | Ditto, for C++ |
FOPTIMIZE | Ditto, for Fortran |
F77OPTIMIZE | Ditto, for F77 |
PORTABILITY | Portability flags to be applied no matter what the compiler |
CPORTABILITY | Portability flags to be applied when using your C compiler |
CXXPORTABILITY | Ditto, for C++ |
FPORTABILITY | Ditto, for Fortran |
F77PORTABILITY | Ditto, for Fortran-77 |
RM_SOURCES | Remove a source file. Should only be used for library substitutions that comply with run rule 2.1.2 |
PASSn_CFLAGS | Flags for pass "n" C compilation when doing profile-directed feedback. Typically n is either 1 or 2, for the compile done before the training run and the compile done after the training run. Search for the word "feedback" in runrules.html for more info; especially rule 2.2.3. See the feedback examples in example-medium.cfg and example-advanced.cfg from $SPEC/docs/ or %SPEC%\docs.nt\ . |
PASSn_CXXFLAGS | Ditto, for C++ |
PASSn_FFLAGS | Ditto, for Fortran |
PASSn_F77FLAGS | Ditto, for Fortran-77 |
fdo_pre0 | commands to be executed before starting a feedback directed compilation series |
fdo_preN | commands to be executed before pass N |
fdo_make_cleanN | commands to be executed for cleanup at pass N |
fdo_pre_makeN | commands to be done prior to Nth compile |
fdo_make_passN | commands to actually do the Nth compile |
fdo_post_makeN | commands to be done after the Nth compile |
fdo_runN | commands to be used for Nth training run |
fdo_postN | commands to be done at the end of pass N |
submit | commands to be used for distributing jobs across a multiprocessor system. See the detailed example in the section on Variable Substitution, above. |
company_name | The company performing the tests. Will not be printed in reports unless it differs from the field hw_vendor. |
hw_avail | Date hardware first shipped. If more than one date applies, use the LATEST one. |
hw_cpu | CPU type |
hw_cpu_mhz | Speed of the CPUs, in MHz |
hw_disk | Disk subsystem |
hw_fpu | Floating point unit |
hw_memory | Size of main memory |
hw_model | Model name |
hw_ncpu | Number of CPUs configured. Note that if your system has the ability to turn CPUs off, for example through a firmware setting, then it is acceptable to report "1" here if only 1 CPU was enabled on an SMP system. But beware -- you need to ensure that your method is effective and that you are not silently getting help from the allegedly turned-off CPUs. |
hw_ncpuorder | Valid number of CPUs orderable for this model. For example, "2 to 16". |
hw_ocache | 4th level or other form of cache |
hw_other | Any other performance-relevant hardware |
hw_parallel | Were multiple CPUs employed by a parallelizing compiler? Note that a speed run that uses a parallelizing compiler causes a single instance of a benchmark to run using multiple CPUs; this is different from a rate run, which typically distributes N instances over N CPUs. |
hw_pcache | 1st level (primary) cache |
hw_scache | 2nd level cache |
hw_tcache | 3rd level cache |
hw_vendor | Name of manufacturer for hardware |
license_num | Your SPEC license number |
machine_name | Machine name: not currently used, leave blank |
prepared_by | Is never output. If you wish, you could set this to your own name, so that the rawfile will be tagged with your name but not the formal reports. |
sw_avail | Availability date for the software used. If more than one date, use the LATEST one. |
sw_compiler | Name and version of compiler |
sw_file | File system (nfs, ufs, etc) |
sw_os | Operating system name and version |
sw_state | Multi-user, single-user, default, etc |
test_date | When you ran the tests |
tester_name | Your employer. Printed in reports. |
default=default=default=default:
sw_compiler = Compaq C X6.2-259-449AT
CC = cc -v
int=default=default=default:
sw_compiler2 = DIGITAL C++ V6.1-029-408B6
CXX = cxx -v
fp=default=default=default:
sw_compiler2 = Compaq Fortran V5.3
sw_compiler3 = KAP Fortran V4.2
FC = kf90 -v
Notice in the above example that the information about the C compiler will be printed for both integer and floating point runs. The information about the C++ compiler will be printed only for integer runs; the information about Fortran compilers will be printed only for floating point runs.
% cat tmp.cfg
size = test
iterations = 1
output_format = asc
teeout = 1
runlist = swim
tune = base
notes01_1 = ++ how
notes02 = ++ you?
notes01_2 = ++ are
notes01 = ++ Alex,
notes000 = ++ hi
default=base=default=default:
% runspec --config=tmp > /nev/dull
% ls -t ../result/*asc | head -1
../result/CFP2000.111.asc
% grep ++ ../result/CFP2000.111.asc
++ hi
++ Alex,
++ how
++ are
++ you?
%
You can also use notes to describe software or hardware information with more detail beyond the predefined fields. For an example of where this might be useful, see $SPEC/docs/example-medium.cfg and search for "patch".
% ls /home/jim/CPU2000/config/tmp.c*
tmp.cfg               tmp.cfg.19991210aq    tmp.cfg.19991210j
tmp.cfg.19991210      tmp.cfg.19991210ar    tmp.cfg.19991210k
tmp.cfg.19991210a     tmp.cfg.19991210as    tmp.cfg.19991210l
tmp.cfg.19991210aa    tmp.cfg.19991210at    tmp.cfg.19991210m
tmp.cfg.19991210ab    tmp.cfg.19991210au    tmp.cfg.19991210n
tmp.cfg.19991210ac    tmp.cfg.19991210av    tmp.cfg.19991210o
tmp.cfg.19991210ad    tmp.cfg.19991210aw    tmp.cfg.19991210p
tmp.cfg.19991210ae    tmp.cfg.19991210ax    tmp.cfg.19991210q
tmp.cfg.19991210af    tmp.cfg.19991210ay    tmp.cfg.19991210r
tmp.cfg.19991210ag    tmp.cfg.19991210az    tmp.cfg.19991210s
tmp.cfg.19991210ah    tmp.cfg.19991210b     tmp.cfg.19991210t
tmp.cfg.19991210ai    tmp.cfg.19991210ba    tmp.cfg.19991210u
tmp.cfg.19991210aj    tmp.cfg.19991210c     tmp.cfg.19991210v
tmp.cfg.19991210ak    tmp.cfg.19991210d     tmp.cfg.19991210w
tmp.cfg.19991210al    tmp.cfg.19991210e     tmp.cfg.19991210x
tmp.cfg.19991210am    tmp.cfg.19991210f     tmp.cfg.19991210y
tmp.cfg.19991210an    tmp.cfg.19991210g     tmp.cfg.19991210z
tmp.cfg.19991210ao    tmp.cfg.19991210h
tmp.cfg.19991210ap    tmp.cfg.19991210i
If this feels like too much clutter, you can disable the backup mechanism, as described under backup_config. Note that doing so may leave you with a risk of losing the config file in case of a filesystem overflow or system crash. A better idea may be to periodically remove the clutter, for example by typing:
rm *.cfg.199912*
The CPU2000 tool suite provides for varying amounts of output about its actions during a run. These levels range from the bare minimum of output (level 0) to copious streams of information almost certainly worthless to anyone not developing the tools themselves (level 99). Note: selecting one output level gives you the output from all lower levels, which may cause you to wade through more output than you might like.
The 'level' referred to in the table below is selected either in the config file verbose option or in the runspec command as in 'runspec --verbose n'.
Levels higher than 99 are special; they are always output to your log file. You can also see them on the screen if you set verbosity to the specified level minus 100. For example, if the verbosity level is set to 3, then on your screen you will get messages at levels 0 through 3, and 100 through 103. Additionally, in your log file, you'll find messages at levels 104 through 199.
Level | What you get |
0 | Basic status information, and most errors. These messages can not be turned off. |
1 | List of the benchmarks which will be acted upon. |
2 | A list of possible output formats, as well as notification when beginning and ending each phase of operation (build, setup, run, reporting). |
3 (default) | A list of each action performed during each phase of operation (e.g. "Building 176.gcc", "Setting up 253.perlbmk") |
4 | Notification of benchmarks excluded |
10 | Information on basepeak operation. |
12 | Errors during discovery of benchmarks and output formats. |
24 | Notification of additions to and replacements in the list of benchmarks. |
30 | A list of options included in the MD5 hash of options used to determine whether or not a given binary needs to be recompiled. |
35 | A list of key=value pairs that can be used in command and notes substitutions. |
40 | A list of 'submit' commands for each benchmark. |
70 | Information on selection of median results. |
99 | Gruesome detail of comparing MD5 hashes on files being copied during run directory setup. |
--- Messages at the following levels will always appear in your log files --- | |
100 | Error message if the 'pdflib' module can't be loaded. |
103 | A tally of successes and failures during the run broken down by benchmark. |
106 | A list of runtime and calculated ratio for each benchmark run. |
107 | Dividers to visually block each phase of the run. |
110 | Error messages about writing temporary files in the config file output format. |
120 | Messages about which commands are being issued for which benchmarks. |
125 | A listing of each individual child process's start, end, and elapsed times. |
130 | A nice header with the time of the runspec invocation and the command line used. |
140 | General information about the settings for the current run. |
145 | Messages about file comparisons. |
150 | List of commands that will be run, and details about the settings used for comparing output files. Also the contents of the Makefile written. |
155 | Start, end, and elapsed times for benchmark run. |
160 | Start, end, and elapsed times for benchmark compilation. |
180 | stdout and stderr from commands run |
191 | Notification of command line used to run specinvoke. |
Feedback-directed optimization typically means that we want to compile a program twice: the first compile creates an image with certain instrumentation. Then, we run the program, and data about that run is collected (a profile). When the program is re-compiled, the collected data is used to improve the optimization.
The log file tells us what is written to the file Makefile.spec, along with lots of state information. It tells us that our config file set certain options:
-----------------------------------
Building 197.parser ref base nov14a default
Wrote to makefile '/cpu2000/benchspec/CINT2000/197.parser/run/00000002/Makefile.spec':
PASS1_CFLAGS = -prof_gen_noopt -prof_dir /tmp/pb
PASS2_CFLAGS = -prof_use_feedback -prof_dir /tmp/pb
baseexe = parser
fdo_pre0 = mkdir /tmp/pb; rm -f /tmp/pb/${baseexe}*
feedback = 1
To tell the tools that we want to use FDO, we simply set PASSn_CFLAGS. If the tools see any use of PASSn_xxxxx, they will perform multiple compiles. The particular compiler used in this example expects to be invoked twice: once with -prof_gen_noopt and then again with -prof_use_feedback. Note that we have also requested special processing before we start, in the fdo_pre0 step. The variable ${baseexe} is substituted with the name of the generated executable, minus any extensions or directories:
Issuing fdo_pre0 command 'mkdir /tmp/pb; rm -f /tmp/pb/parser*'
Below, the first compile is done. Notice that specmake is invoked with FDO=PASS1, which causes the switches from PASS1_CFLAGS to be used. (If you want to understand exactly how this affects the build, read $SPEC/benchspec/Makefile.defaults, along with the document $SPEC/docs/makevars.txt.)
Output from fdo_make_pass1 'specmake FDO=PASS1 build > fdo_make_pass1.out 2> fdo_make_pass1.err':
cc -v -prof_gen_noopt -prof_dir /tmp/pb -DSPEC_CPU2000 -v -arch ev6 -fast analyze-linkage.c and.c build-disjuncts.c extract-links.c ...
The next section shows how specinvoke runs the benchmark for the training run, according to the commands in speccmds.cmd. Specinvoke executes the instrumented parser using the training data set as input. (For more information on specinvoke, see utility.html.)
Training 197.parser
Commands to run:
-u /cpu2000/benchspec/CINT2000/197.parser/run/00000002 -i train.in -o train.out -e train.err ../00000002/parser 2.1.dict -batch
Specinvoke: /cpu2000/bin/specinvoke -d /cpu2000/benchspec/CINT2000/197.parser/run/00000002 -e speccmds.err -o speccmds.out -f speccmds.cmd
Finally, the compiler is run a second time, this time to use the profile feedback and build a new executable. Notice that this time, specmake is invoked with FDO=PASS2, which is why the compile picks up the PASS2_CFLAGS:
Output from fdo_make_pass2 'specmake FDO=PASS2 build > fdo_make_pass2.out 2> fdo_make_pass2.err':
cc -v -prof_use_feedback -prof_dir /tmp/pb -DSPEC_CPU2000 -v -arch ev6 -fast analyze-linkage.c and.c build-disjuncts.c extract-links.c ...
Compile for '197.parser' ended at: Thu Dec 16 23:35:48 1999 (945405348)
Elapsed compile for '197.parser': 00:02:45 (165)
Build Complete
And that's it. The tools did most of the work; the user simply set PASS1_CFLAGS, PASS2_CFLAGS, and fdo_pre0 in the config file.
On Unix, you could say:
cd $SPEC
mv result result_old
mkdir result
On NT, you could say:
cd %SPEC%
rename result result_old
mkdir result
To find the build directory, look for the word "build" in run/list. If more than one build directory is present, you can pipe the output through a second search for the specific extension that you want, as shown below.
For example:
F:\cpu2000> cd %SPEC%\benchspec\CINT2000\164.gzip\run
F:\cpu2000\benchspec\CINT2000\164.gzip\run> findstr build list
00000001 dir=F:/cpu2000/benchspec/CINT2000/164.gzip/run/00000001 ext=oct14a lock=0 type=build username=Administrator
F:\cpu2000\benchspec\CINT2000\164.gzip\run> cd *01
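If more than one build directory were listed, you could narrow the search by piping through a second filter on the extension you want (here, the extension oct14a from the example above; on Unix, grep can be used the same way):
F:\cpu2000\benchspec\CINT2000\164.gzip\run> findstr build list | findstr oct14a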
Makefile.spec | The components for make that were generated for the current config file with the current set of runspec options |
options.out | For 1 pass compile: build options summary |
options1.out | For N pass compile: summary of first pass |
options2.out | For N pass compile: summary of second pass |
make.out | For 1 pass compile: detailed commands generated |
fdo_make_pass1.out | For N pass compile: detailed commands generated for 1st pass |
fdo_make_pass2.out | For N pass compile: detailed commands generated for 2nd pass |
*.err | The output from standard error corresponding to the above files. |
Are there any obvious clues in the log file? Search for the word "Building". Keep searching until you hit the next benchmark AFTER the one that you are interested in. Now scroll backward one screenful.
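For example, on Unix you might jump to the relevant portion with grep (the log file name below is only a placeholder; substitute the log file from your own run):
% grep -n Building logfile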
Did your desired switches get applied? Go to the build directory, and look at options*out.
Did the tools or your compilers report any errors? Look in the build directory at *err.
What happens if you try the build by hand? See the section on specmake in utility.html.
If an actual run fails, what happens if you invoke the run by hand? See the information about "specinvoke -n" in utility.html.