Guidelines for Using SPEC HPG Benchmarks in Research Publications
SPEC encourages the use of its benchmark suites in academic research.
The following guidelines are intended to help produce research results
that are comparable and of practical relevance. SPEC recognizes the
need for flexibility in research publications and therefore allows such
publications to report benchmark results that deviate from the otherwise
mandatory run rules. Note, however, that research publications that
have co-authors with industrial affiliations and that make competitive
comparisons of computer systems from different vendors must adhere
to the run rules of the respective SPEC benchmarks. SPEC HPG
will judge borderline cases.
Authors of research reports that make use of SPEC benchmarks are strongly
encouraged to observe the following guidelines:
- Use the most recent set of benchmarks. SPEC continuously updates
its benchmark suites to reflect new computing practices and system
resources.
- Use the unmodified benchmark codes. If source code modifications
are unavoidable for your research objective, describe all modifications.
- Use all benchmarks in the suite. If this is not feasible, e.g., because
some are too large or too complicated to execute on a simulator,
explain how the subset was selected, to make clear that the selection
is not arbitrary or biased toward programs for which the new idea
happens to give good results.
- The correct input data sets are called ref (reference set) for
the SPEC OMP2001 suite and small, medium, large, or x-large for the
SPEChpc96 suite. Use these data sets if at all possible. If that is not
feasible, clearly describe all changes you have made (e.g., "used the
train data set", "changed parameter xy in the ref data set to n", or
"only simulated the first 1 billion instructions of each code"). Note
that the test data sets are not meant to be realistic; they are intended
only for verifying the benchmark installation.
- Carefully describe the execution environment, including the compiler
and the compilation flags that were used. For example, the benefits
of a specific cache design can be radically different for code compiled
with or without the prefetch instructions that some modern compilers
are able to generate. (A small script such as the sketch following this
list illustrates one way to record such environment details.)
- Discuss the properties of the benchmark that are relevant to the feature
that was studied. It should become clear from your explanation whether
the observed characteristics hold only for the studied benchmark(s)
or more generally. This is especially important if you did not use
the full benchmark suite.
- If possible, also show the effects on other program collections.
A thorough discussion of the differences between the "SPEC benchmark
set" and the "non-SPEC program set" can provide interesting
insights and valuable guidance to SPEC in selecting the next
benchmark suite.
- Note that the SPEC metrics (such as SPECompMbase2001) are reserved
terms. Their use is restricted to reports that strictly adhere to the
benchmark suite's run rules and are approved by SPEC. SPEC urges
that results that do not adhere to these rules be clearly distinguished,
for example by reporting execution times instead or by marking SPEC
metrics as "estimates".
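
To illustrate the environment-description guideline above, the following
minimal Python sketch shows one way to record the compiler version,
compilation flags, and host details alongside benchmark results. It is a
sketch only: the compiler command (gcc), the flag string, and the output
file name (environment.json) are placeholder assumptions to be replaced by
whatever your study actually uses, and the script is not part of the SPEC
tool set.

    # Illustrative only: records the execution environment of a benchmark
    # run so it can be reported alongside the results. The compiler command
    # and the flag string are placeholders; substitute your own settings.
    import json
    import platform
    import subprocess

    def capture_environment(compiler="gcc", flags="-O3 -fopenmp"):
        """Collect compiler, flag, and host information into a dictionary."""
        try:
            version = subprocess.run([compiler, "--version"],
                                     capture_output=True, text=True,
                                     check=True).stdout.splitlines()[0]
        except (OSError, subprocess.CalledProcessError):
            version = "unknown"
        return {
            "compiler": compiler,
            "compiler_version": version,
            "compilation_flags": flags,
            "machine": platform.machine(),
            "processor": platform.processor(),
            "os": platform.platform(),
        }

    if __name__ == "__main__":
        # Write the record next to the benchmark results for later reporting.
        with open("environment.json", "w") as f:
            json.dump(capture_environment(), f, indent=2)

Storing such a record with each set of results makes it straightforward to
describe the execution environment accurately in the publication.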
Submitting research papers that use SPEC benchmarks
Authors are encouraged to submit bibliographic entries of publications
that use SPEC HPG benchmarks to info@spec.org.
Your entry will be added to the SPEC/HPG Bibliography page.
If possible, please include a URL to the full publication.