Standard Performance Evaluation Corporation
SPEC Discusses the History & Reasoning Behind SPEC95
Jeff Reilly
Published September, 1995; see disclaimer.

Introduction & History
The Standard Performance Evaluation Corporation (SPEC) established itself in 1988 as a non-profit corporation devoted to "establishing, maintaining and endorsing a standardized set of relevant benchmarks that can be applied to the newest generation of high-performance computers." SPEC's first product was the SPEC Benchmark Release 1 suite, later identified as SPEC89, which provided a standardized measure of compute-intensive microprocessor performance. This product replaced the vague and confusing MIPS and MFLOPS ratings then used in the computer industry.

SPEC had developed benchmarks from real applications and provided them as source code to allow compilation on various UNIX workstations. While this did include the memory subsystem and the compiler in the performance measurement, it did ensure that all platforms would perform precisely the same task, providing comparability across different architectures.

Time progressed. The computer industry adopted the SPEC89 benchmark results because they provided a fair, standard way to compare compute-intensive performance across various hardware architectures. But time also brought change. Technology drastically improved processor, system and compiler performance. And because of this, benchmarks were required to evolve and adapt to continue to be useful.

SPEC obsoleted SPEC89 with the release of SPEC92 in January 1992. Now in the summer of 1995, SPEC is replacing SPEC92 with an improved SPEC95, with the subcomponents CINT95 (focusing on integer/non-floating point compute-intensive activity) and CFP95 (focusing on floating-point compute-intensive activity).

Intent of SPEC95
SPEC95 was designed to provide a comparable measure of performance of a system executing a known compute-intensive workload. In order to do this across the widest range of platforms, SPEC chose to continue to provide benchmarks in source code form.
Thus, despite the fact that SPEC benchmarks are often discussed as just processor benchmarks, they actually emphasize three components of a system: the processor, the memory subsystem, and the compiler.
All of these components should be kept in mind when considering SPEC95 results. Note that SPEC95 is not designed to measure graphics, networking, I/O or operating system features. While it may be possible to intentionally configure a degenerate system such that those components affect SPEC95 performance, this is neither the intent nor the focus of SPEC95. With SPEC95, SPEC provides a means of measuring processor performance that should remain applicable for the next several years.

Improvements Over SPEC92
The same intentions existed during the creation of SPEC92; however, times have changed. Areas of improvement, and issues that needed to be resolved, that motivated the move from SPEC92 to SPEC95 include:

Runtime
Several of the SPEC92 benchmarks were running in less than a minute on the leading-edge processors/systems. Given the SPEC measurement tools, small changes or fluctuations in the measurements were having significant impacts on the percentage improvements being seen. For illustration, a one-second fluctuation amounts to 2% of a 50-second run but only 0.2% of a 500-second run. SPEC chose to make the benchmarks longer to take into account future performance and prevent this from being an issue over the expected lifetime of the suite.

Application size
Many comments received by SPEC indicated that applications had grown in complexity and size and that the SPEC92 suites were becoming less representative of what was being run on current systems. One of the criteria used in selecting benchmarks was to seek some programs with larger resource requirements to provide a mix with some of the smaller programs.

Application type
SPEC felt that there were additional application areas that should be included to increase the variety within the suites. Areas such as imaging and database were added.

Portability
SPEC found that compute-intensive performance was important beyond the UNIX workstation arena where SPEC was founded. Thus, it was important that the benchmarks and the tools running the benchmarks be as independent of the operating system as possible.
While the first release of SPEC95 will be geared toward UNIX, SPEC has consciously chosen programs and tools that depend on POSIX or ANSI standard development environments. SPEC will produce additional releases for other operating systems (e.g., WIN/NT) based on demand.

Moving target
The initial hope for source code benchmarks is that improvements in test performance will be generally applicable to other situations. However, as the competition develops, it is feared that improvements in test performance become specific to that test only. By frequently updating the benchmarks, it is hoped that optimizations become less test-specific and that general improvements will be encouraged.

Education
As the computer industry grows, benchmark results are being quoted more often. The release of a new suite is a new opportunity to discuss and clarify how and why the suite was developed.

Conclusion
SPEC95 is a step forward for compute-intensive benchmarking. SPEC has made efforts to improve the quality of the applications used, to improve the portability of the applications and to improve the ease of use and the portability of the tools.

The challenge for SPEC now is education. Benchmarks provide a useful approximation to reality, and SPEC needs to improve the guidance that accompanies these tools to help the users of the benchmark results understand how and when to use the results.

Sidebar
Jeff Reilly is the Release Manager for SPEC95 and is a Project Lead for Intel Corporation in Santa Clara, Calif.