Guidelines for Using SPEC HPG Benchmarks in Research Publications
SPEC encourages the use of its benchmark suites in academic research.
The following guidelines are intended to help produce research results
that are comparable and of practical relevance. SPEC recognizes the
need for flexibility in research publications and therefore allows such
publications to report benchmark results that deviate from the otherwise
mandatory run rules. Note, however, that research publications that
have co-authors with industrial affiliations and that make competitive
comparisons of computer systems from different vendors must adhere
to the run rules of the respective SPEC benchmarks. SPEC HPG
will judge borderline cases.
Authors of research reports that make use of SPEC benchmarks are strongly
encouraged to observe the following guidelines:
- Use the most recent set of benchmarks. SPEC continuously updates
its benchmark suites to reflect new computing practices and system
resources.
- Use the unmodified benchmark codes. If source code modifications
are unavoidable for your research objective, describe all modifications.
- Use all benchmarks in the suite. If this is not feasible, e.g., because
some are too large or too complicated to execute on a simulator,
explain how the subset was selected, to make clear that the selection
is not arbitrary or biased toward programs for which the new idea
happens to give good results.
- The correct input data sets are called ref (reference set) for
the SPEC OMP2001 suite and small, medium, large, or x-large for the
SPEChpc96 suite. Use these data sets if at all possible. If that is not
feasible, clearly describe all changes you have made (e.g., "used the
train data set", "changed parameter xy in the ref data set to n", or
"only simulated the first 1 billion instructions of each code"). Note
that the test data sets are not meant to be realistic; they are intended
only for verifying the benchmark installation.
- Carefully describe the execution environment, including the compiler
and the compilation flags that were used. For example, the benefits
of a specific cache design can be radically different for code compiled
with or without the prefetch instructions that some modern compilers
are able to generate. (A small script such as the sketch following this
list illustrates one way to record such environment details.)
- Discuss the properties of the benchmark that are relevant to the feature
that was studied. It should become clear from your explanation whether
the observed characteristics hold only for the studied benchmark(s)
or more generally. This is especially important if you did not use
the full benchmark suite.
- If possible, also show the effects on other program collections.
A thorough discussion of the differences between the "SPEC benchmark
set" and the "non-SPEC program set" can provide interesting
insights and valuable guidance to SPEC in selecting the next
benchmark suite.
- Note that the SPEC metrics (such as SPECompMbase2001) are reserved
terms. Their use is restricted to reports that strictly adhere to the
benchmark suite's run rules and are approved by SPEC. SPEC urges
that results that do not adhere to these rules be clearly distinguished,
for example by reporting execution times instead or by marking SPEC
metrics as "estimates".
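
To illustrate the environment-description guideline above, the following
minimal Python sketch shows one way to record the compiler version,
compilation flags, and host details alongside benchmark results. It is a
sketch only: the compiler command (gcc), the flag string, and the output
file name (environment.json) are placeholder assumptions to be replaced by
whatever your study actually uses, and the script is not part of the SPEC
tool set.

    # Illustrative only: records the execution environment of a benchmark
    # run so it can be reported alongside the results. The compiler command
    # and the flag string are placeholders; substitute your own settings.
    import json
    import platform
    import subprocess

    def capture_environment(compiler="gcc", flags="-O3 -fopenmp"):
        """Collect compiler, flag, and host information into a dictionary."""
        try:
            version = subprocess.run([compiler, "--version"],
                                     capture_output=True, text=True,
                                     check=True).stdout.splitlines()[0]
        except (OSError, subprocess.CalledProcessError):
            version = "unknown"
        return {
            "compiler": compiler,
            "compiler_version": version,
            "compilation_flags": flags,
            "machine": platform.machine(),
            "processor": platform.processor(),
            "os": platform.platform(),
        }

    if __name__ == "__main__":
        # Write the record next to the benchmark results for later reporting.
        with open("environment.json", "w") as f:
            json.dump(capture_environment(), f, indent=2)

Storing such a record with each set of results makes it straightforward to
describe the execution environment accurately in the publication.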
Submitting research papers that use SPEC benchmarks
Authors are encouraged to submit bibliographic entries of publications
that use SPEC HPG benchmarks to info@spec.org.
Your entry will be added to the SPEC/HPG Bibliography page.
If possible, please include a URL to the full publication.