ABSTRACT
This document describes the contents and arrangement of a SPEC MPI2007 result disclosure.
This document refers to the arrangement of fields in the HTML
format of a SPEC MPI2007 result disclosure, since the arrangement of
result fields may differ among the text, HTML, and
other report formats (CSV, PDF, PS, etc.).
While the reports are formatted in a way that is intended to be self-explanatory,
readers may still want a formal statement of what a field means, or have technical questions
about the information provided in the fields.
Further, the SPEC website contains links from the fields of the published reports
to their descriptions in this document.
The contents of the result reports are generated by the runs of the benchmarks, extracted from the configuration file that controls the building and running of the benchmarks, or provided in descriptive fields filled in by the tester. These follow conventions that are specified in the separate documents on the Run Rules, Config Files, and XML Flags Files. Reports published on the SPEC website have been peer-reviewed by the members of the SPEC/HPG committee and are expected to be correct in every detail.
(To check for possible updates to this document, please see http://www.spec.org/mpi2007/Docs/)
Selecting one of the following will take you to the detailed table of contents for that section or subsection:
1. SPEC MPI2007 Benchmarks
1.1 Benchmarks by suite
1.2 Benchmarks by language
2. Result and Configuration Summary
2.1 Top bar
2.2 Results table
2.3 System Summary
2.4 Benchmark Configuration
3. Hardware Description
3.1 Node Description(s)
3.2 Interconnect Description(s)
4. Compilation Description
4.1 Base & Peak Unknown Flags
4.2 Base & Peak Forbidden Flags
4.3 Base & Peak Compiler Invocation
4.4 Base & Peak Portability Flags
4.5 Base & Peak Optimization Flags
4.6 Base & Peak Other Flags
5. General Notes
6. Errors
1. SPEC MPI2007 Benchmarks
1.1 Benchmarks by suite
1.1.1 Benchmarks in the medium suite
1.1.2 Benchmarks in the large suite
1.2 Benchmarks by language
1.2.1 C Benchmarks
1.2.2 C++ Benchmarks
1.2.3 Fortran Benchmarks
1.2.4 Benchmarks using both Fortran and C
2. Result and Configuration Summary
2.1 Top bar
2.1.1 MPIM2007 Result
2.1.2 System Vendor
2.1.3 System Name
2.1.4 SPECmpiM_peak2007
2.1.5 SPECmpiM_base2007
2.1.6 SPECmpiL_peak2007
2.1.7 SPECmpiL_base2007
2.1.8 MPI2007 license #
2.1.9 Hardware Availability
2.1.10 Software Availability
2.1.11 Test date
2.1.12 Test sponsor
2.1.13 Tested by
2.2 Results table
2.2.1 Benchmark
2.2.2 Ranks
2.2.3 Seconds
2.2.4 Ratio
2.2.5 Identifying the Median results
2.2.6 Significance of the run order
2.3 System Summary
2.3.1 Type of System
2.3.2 Compute Node
2.3.3 Interconnects
2.3.4 File Server Node
2.3.5 Head Node
2.3.6 Other Node
2.3.7 Total Compute Nodes
2.3.8 Total Chips
2.3.9 Total Cores
2.3.10 Total Threads
2.3.11 Total Memory
2.3.12 Maximum Ranks Run
2.3.13 Minimum Peak Ranks
2.3.14 Maximum Peak Ranks
2.4 Benchmark Configuration
2.4.1 C Compiler
2.4.2 C++ Compiler
2.4.3 Fortran Compiler
2.4.4 Auto Parallel
2.4.5 Base Pointers
2.4.6 Peak Pointers
2.4.7 MPI Library
2.4.8 Other MPI Information
2.4.9 Pre-processors
2.4.10 Other Software
3. Hardware Description
3.1 Node Description(s)
3.1.1 Hardware
3.1.1.1 Node class
3.1.1.2 Number of nodes
3.1.1.3 Vendor
3.1.1.4 Model
3.1.1.5 CPU Name
3.1.1.6 CPU(s) orderable
3.1.1.7 Chip(s) enabled
3.1.1.8 Core(s) enabled
3.1.1.9 Cores per Chip
3.1.1.10 Threads per Core
3.1.1.11 CPU Characteristics
3.1.1.12 CPU MHz
3.1.1.13 Primary Cache
3.1.1.14 Secondary Cache
3.1.1.15 L3 Cache
3.1.1.16 Other Cache
3.1.1.17 Memory
3.1.1.18 Disk Subsystem
3.1.1.19 Other Hardware
3.1.1.20 Adapter Card Model Name
3.1.1.20.1 Adapter Count
3.1.1.20.2 Adapter Slot Type
3.1.1.20.3 Data Rate
3.1.1.20.4 Ports Used
3.1.1.20.5 Adapter Interconnect
3.1.2 Software
3.1.2.1 Adapter Driver
3.1.2.2 Firmware
3.1.2.3 Operating System
3.1.2.4 Local File System
3.1.2.5 Shared File System
3.1.2.6 System State
3.1.2.7 Other Software
3.1.3 Node Notes
3.2 Interconnect Description(s)
3.2.1 Vendor
3.2.2 Model
3.2.3 Switch Model(s)
3.2.3.1 Number of switches
3.2.3.2 Ports per switch
3.2.3.3 Data Rate
3.2.3.4 Firmware
3.2.4 Topology
3.2.5 Primary Use
3.2.6 Notes
4. Compilation Description
4.1 Base & Peak Unknown Flags
4.2 Base & Peak Forbidden Flags
4.3 Base & Peak Compiler Invocation
4.3.1 Notes
4.4 Base & Peak Portability Flags
4.4.1 C benchmarks
4.4.2 C++ benchmarks
4.4.3 Fortran benchmarks
4.4.4 Benchmarks using both Fortran and C
4.4.5 Notes
4.5 Base & Peak Optimization Flags
4.5.1 C benchmarks
4.5.2 C++ benchmarks
4.5.3 Fortran benchmarks
4.5.4 Benchmarks using both Fortran and C
4.5.5 Notes
4.6 Base & Peak Other Flags
4.6.1 C benchmarks
4.6.2 C++ benchmarks
4.6.3 Fortran benchmarks
4.6.4 Benchmarks using both Fortran and C
4.6.5 Notes
5. General Notes
6. Errors
The MPIM2007 suite comprises 13 floating-point compute-intensive codes: 4 in Fortran, 2 in C, 1 in C++, and 6 that contain both Fortran and C.
The MPIL2007 suite comprises 12 floating-point compute-intensive codes: 4 in Fortran, 3 in C, 1 in C++, and 4 that contain both Fortran and C.
Two benchmarks in the MPIM2007 suite are written in C: 104.milc and 122.tachyon.
Three benchmarks in the MPIL2007 suite are written in C: 122.tachyon, 125.RAxML, and 142.dmilc.
One benchmark in the MPIM2007 and MPIL2007 suites is written in C++: 126.lammps.
Four benchmarks in the MPIM2007 suite are written in Fortran: 107.leslie3d, 113.GemsFDTD, 129.tera_tf, and 137.lu.
Four benchmarks in the MPIL2007 suite are written in Fortran: 129.tera_tf, 137.lu, 143.dleslie, and 145.lGemsFDTD.
Six benchmarks in the MPIM2007 suite are written using both Fortran and C: 115.fds4, 121.pop2, 127.wrf2, 128.GAPgeofem, 130.socorro, and 132.zeusmp2.
Four benchmarks in the MPIL2007 suite are written using both Fortran and C: 121.pop2, 128.GAPgeofem, 132.zeusmp2, and 147.l2wrf2.
More detailed information about metrics is in sections 4.3.1 and 4.3.2 of the MPI2007 Run and Reporting Rules.
This result is from the MPIM2007 suite using the medium (or mref) dataset.
The vendor of the system under test.
The name of the system under test.
The geometric mean of thirteen normalized ratios (one for each benchmark) when compiled with aggressive optimization for each benchmark.
More detailed information about this metric is in section 4.3.1 of the MPI2007 Run and Reporting Rules.
The geometric mean of thirteen normalized ratios when compiled with conservative optimization for each benchmark.
More detailed information about this metric is in section 4.3.1 of the MPI2007 Run and Reporting Rules.
The geometric mean of twelve normalized ratios (one for each benchmark, run on the large (lref) workloads) when compiled with aggressive optimization for each benchmark.
More detailed information about this metric is in section 4.3.2 of the MPI2007 Run and Reporting Rules.
The geometric mean of twelve normalized ratios (computed from run times and reference times on the lref workloads) when compiled with conservative optimization for each benchmark.
More detailed information about this metric is in section 4.3.2 of the MPI2007 Run and Reporting Rules.
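To make the computation concrete, here is a minimal sketch in Python. The benchmark names, reference times, and run times are invented for illustration; the real reference times are fixed by the suite.

    from math import prod  # Python 3.8+

    # Hypothetical (benchmark, reference_seconds, measured_seconds) triples.
    runs = [("910.aaa", 1000.0, 250.0),
            ("920.bbb", 1500.0, 300.0),
            ("930.ccc", 2000.0, 800.0)]

    # A benchmark's ratio is its reference time divided by its measured run
    # time, so a system faster than the reference platform scores above 1.
    ratios = [ref / sec for (_name, ref, sec) in runs]

    # The suite metric is the geometric mean of the per-benchmark ratios
    # (thirteen for MPIM2007, twelve for MPIL2007; three here for brevity).
    geomean = prod(ratios) ** (1.0 / len(ratios))
    print(round(geomean, 2))  # 3.68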
The SPEC MPI2007 license number of the organization or individual that ran the result.
The date when all the hardware necessary to run the result is generally available. For example, if the CPU is available in Aug-2007, but the memory is not available until Oct-2007, then the hardware availability date is Oct-2007 (unless some other component pushes it out farther).
The date when all the software necessary to run the result is generally available. For example, if the operating system is available in Aug-2007, but the compiler or other libraries are not available until Oct-2007, then the software availability date is Oct-2007 (unless some other component pushes it out farther).
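In other words, the availability date is the latest availability date among the required components. A small sketch, using the hypothetical dates from the examples above:

    from datetime import date

    # Hypothetical component availability dates, per the examples above.
    components = {"CPU": date(2007, 8, 1), "memory": date(2007, 10, 1)}

    # The reported availability date is the latest of the component dates.
    availability = max(components.values())
    print(availability.strftime("%b-%Y"))  # Oct-2007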
The date when the test was run. This value is supplied by the tester; the time reported by the system under test is recorded in the raw result file.
The name of the organization or individual that sponsored the test. Generally, this is the name of the license holder.
The name of the organization or individual that ran the test. If there are installations in multiple geographic locations, sometimes that will also be listed in this field.
In addition to the graph, the results of the individual benchmark runs are also presented in table form.
The name of the benchmarks making up this MPI2007 suite.
This column indicates the number of MPI ranks (processes) that were used during the running of the benchmark.
This is the amount of elapsed (wall) time in seconds that the benchmark took to run from job submit to job completion.
This is the ratio of the benchmark's run time on the reference platform to its run time on the system under test, so larger ratios indicate faster runs.
For a reportable MPI2007 run, at least two iterations of each benchmark are run, and the median of the runs (the lower of the middle two, if there is an even number) is selected to be part of the overall metric. In output formats that support it, the medians in the results table are underlined in bold. The ".txt" report marks each median score with an asterisk "*".
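The selection rule can be stated precisely in a few lines of Python; the scores below are invented:

    def spec_median(scores):
        # With an odd count this is the middle score; with an even
        # count it is the lower of the middle two.
        ordered = sorted(scores)
        return ordered[(len(ordered) - 1) // 2]

    print(spec_median([10.1, 10.5, 10.3]))        # middle of three: 10.3
    print(spec_median([10.1, 10.5, 10.3, 10.4]))  # lower of middle two: 10.3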
Each iteration in the MPI2007 benchmark suite will run each benchmark once, in order. For example, given benchmarks "910.aaa", "920.bbb", and "930.ccc", here's what you might see as the benchmarks were run if they were part of each suite:
MPI2007
    Running (#1) 910.aaa ref base oct09a default
    Running (#1) 920.bbb ref base oct09a default
    Running (#1) 930.ccc ref base oct09a default
    Running (#2) 910.aaa ref base oct09a default
    Running (#2) 920.bbb ref base oct09a default
    Running (#2) 930.ccc ref base oct09a default
    Running (#3) 910.aaa ref base oct09a default
    Running (#3) 920.bbb ref base oct09a default
    Running (#3) 930.ccc ref base oct09a default
The results in the results table are listed in the order in which they were run, in column-major order. In other words, if you are interested in the base scores as they were produced, start at the upper left and read down the first column, then the middle column, then the right column.
If the benchmarks were run with both base and peak tuning, all base runs were completed before starting peak.
Collective hardware details across the whole system. Run rules relating to these items can be found in section 4.2 of the MPI2007 Run and Reporting Rules.
Description of the system being benchmarked: SMP, Homogeneous Cluster or Heterogeneous Cluster.
These systems are used as compute nodes during the run of the benchmark.
These devices are used for the interconnects during the benchmark run.
These systems are used as the file server for the benchmark run.
This system is the head node, a node that serves as the lead for the benchmark run.
This system is part of the entire configuration and has a purpose other than those previously mentioned.
Number of compute nodes used to execute the benchmark.
The total number of chips in the compute nodes available to execute the benchmark.
The total number of cores in the compute nodes available to execute the benchmark.
The total number of threads in the compute nodes available to execute the benchmark.
The total amount of memory in all of the compute nodes available to execute the benchmark.
The number of MPI ranks used to execute the benchmark in the base optimization runs.
The smallest number of ranks used to execute the benchmark runs using peak optimizations.
The largest number of ranks used to execute the peak version of the benchmark.
Information on how the benchmark binaries are constructed. Run rules relating to these items can be found in section 4.2 of the MPI2007 Run and Reporting Rules.
The names and versions of C compilers, preprocessors, and performance libraries used to generate the result.
The names and versions of C++ compilers, preprocessors, and performance libraries used to generate the result.
The names and versions of Fortran compilers, preprocessors, and performance libraries used to generate the result.
Indicates whether automatic parallelization by the compilers was used when building the benchmarks.
Indicates whether all the benchmarks in base used 32-bit pointers, 64-bit pointers, or a mixture. For example, if the C and C++ benchmarks used 32-bit pointers, and the Fortran benchmarks used 64-bit pointers, then "32/64-bit" would be reported here.
Indicates whether all the benchmarks in peak used 32-bit pointers, 64-bit pointers, or a mixture.
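As an illustration of how the reported value follows from the per-benchmark pointer sizes, here is a hypothetical helper (not part of the SPEC tools):

    def pointer_label(widths_in_bits):
        # Collapse per-benchmark pointer widths (32 or 64) into one label.
        widths = set(widths_in_bits)
        if widths == {32}:
            return "32-bit"
        if widths == {64}:
            return "64-bit"
        return "32/64-bit"

    # C and C++ benchmarks built 32-bit, Fortran benchmarks built 64-bit:
    print(pointer_label([32, 32, 64, 64]))  # 32/64-bit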
The names and versions of MPI Libraries used to generate the result.
Any performance-relevant MPI information used to generate the result.
Any performance-relevant pre-processors that were used to generate the result.
Any performance-relevant non-compiler software used, including third-party libraries, accelerators, etc.
SPEC MPI2007 is capable of running on large heterogeneous clusters containing different kinds of nodes linked by different kinds of interconnects. The report format contains a separate section for each kind of node and each kind of interconnect. Section 4.2 of the Run Rules document describes what information is to be provided by the tester.
For example, an SMP will consist of one node and no interconnect. Homogeneous cluster systems will typically consist of one kind of compute node and one or two kinds of interconnects. There will also often be a file server node. It is possible that the node and interconnect components are available from their respective vendors but no vendor sells the configured system as a whole; in this case the report is intended to provide enough detail to reconstruct an equivalent system with equivalent performance.
Description of the hardware and software configuration of the node.
Hardware configuration of the node.
The number of nodes of this type in the system.
The purpose of this type of node: compute node, file server, head node, etc.
The manufacturer of this kind of node.
The model name of this kind of node.
A manufacturer-determined formal name of the processor used in this node type.
The number of CPUs that can be ordered in this kind of node.
The number of Chips (CPUs) that were enabled and active in the node during the benchmark run.
The number of cores that were enabled and active in the node during the benchmark run.
The number of cores in each chip that were enabled and active in the node during the benchmark run.
The number of threads in each core that were enabled and active in the node during the benchmark run.
Technical characteristics to help identify the processor type used in the node.
The clock frequency of the CPU used in the node, expressed in megahertz.
Description (size and organization) of the CPU's primary cache. This cache is also referred to as "L1 cache".
Description (size and organization) of the CPU's secondary cache. This cache is also referred to as "L2 cache".
Description (size and organization) of the CPU's tertiary, or "Level 3" cache.
Description (size and organization) of any other levels of cache memory.
Description of the system main memory configuration. End-user options that affect performance, such as the arrangement of memory modules, interleaving, latency, etc., are documented here.
A description of the disk subsystem (size, type, and RAID level if any) of the storage used to hold the benchmark tree during the run.
Any additional equipment added to improve performance.
There will be one of these groups of entries for each kind of network adapter -- also known as a Host Channel Adapter (HCA) or Network Interface Card (NIC) -- used to connect to an interconnect that carries MPI or file server traffic.
This field contains this adapter's vendor and model name.
How many of these adapters attach to the node.
The type of slot used to attach the adapter card to the node.
The per-port, nominal data transfer rate of the adapter.
The number of ports used to run the benchmark on the adapter (especially for those which have multiple ports available).
In general terms, the type of interconnect (Ethernet, InfiniBand, etc.) attached to this adapter.
Software configuration of the node.
The driver type and level for this adapter.
The firmware type and level for this device.
The operating system name and version. If there are patches applied that affect performance, they must be disclosed in the Notes.
The type of the file system local to each compute node.
The type of the file system used to contain the run directories.
The state (sometimes called "run level") of the system while the benchmarks were being run. Generally, this is "single user", "multi-user", "default", etc.
Any performance-relevant non-compiler software used, including third-party libraries, accelerators, etc.
Free-form notes about other hardware or software details of the node, such as changes to the default operating system state, changes to the default hardware state of the node, or other tuning information.
Description of the configuration of the interconnect.
The manufacturer(s) of this interconnect.
The model name(s) of the interconnect as a whole, or components of it -- not including the switch model, which is the next field.
The model and manufacturer of the switching element(s) of this interconnect. There may be more than one kind declared.
The number of switches of this type in the interconnect.
The number of ports per switch available for carrying the type of traffic noted in the "Primary Use" field.
The per-port, nominal data transfer rate of the switch.
The firmware type and level for the switch(es).
Description of the arrangement of switches and links in the interconnect.
The kind of data traffic carried by the interconnect: MPI, file server, etc.
Free-form notes about other hardware or software details of the interconnect.
This section describes how the benchmarks are compiled. The HTML and PDF reports contain links from the settings that are listed, to the descriptions of those settings in the XML flags file report.
Much of this information is derived from the compilation rules written in the config file and interpreted according to the rules specified in the XML flags file, and free-form notes can be added to it. A section appears only if the corresponding flags are used (for example, peak optimization flags); otherwise it is not printed. Section 2 of the MPI2007 Run and Reporting Rules document gives rules on how these items can be used in reportable runs.
This section lists flags, used in the base or peak compiles, that were not recognized by the report generation. Results with unknown flags are marked "invalid" and may not be published.
This usually means that the flagsurl parameter was not set correctly, or that details need to be added to the XML flags file. The "invalid" marking may be removed by reformatting the result using a flags file that describes all of the unknown flags.
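For illustration, a config file points the tools at a flags file with the flagsurl parameter; a hypothetical setting might look like this (the URL is a placeholder, not a real flags file):

    # Hypothetical config-file excerpt referencing an XML flags file.
    flagsurl = http://www.spec.org/mpi2007/flags/EXAMPLE-platform.xml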
This section lists flags, used in the base or peak compiles, that are designated as Forbidden in the XML flags file for the benchmark or the platform. Results with forbidden flags are marked "invalid" and may not be published.
This section describes how the compilers are invoked, whether any special paths had to be used or flags were passed, etc.
Additional free-form notes explaining the compiler usage.
This section describes the portability settings that are used to build the benchmarks. Optimization settings are not listed here.
Portability settings specific to the C benchmarks.
Portability settings specific to the C++ benchmarks.
Portability settings specific to the Fortran benchmarks.
Portability settings specific to the mixed-language benchmarks.
Additional free-form notes explaining the portability settings.
This section describes the optimization settings that are used to build the benchmark binaries for the base and peak runs.
Optimization settings specific to the C benchmarks.
Optimization settings specific to the C++ benchmarks.
Optimization settings specific to the Fortran benchmarks.
Optimization settings specific to the mixed-language benchmarks.
Additional free-form notes explaining the optimization settings.
This section describes the other settings that are used to build or run the benchmark binaries for the base and peak runs. These are classified as being neither portability nor optimization settings.
Settings specific to the C benchmarks.
Settings specific to the C++ benchmarks.
Settings specific to the Fortran benchmarks.
Settings specific to the mixed-language benchmarks.
Additional free-form notes explaining the other settings.
This section is where the tester provides notes about things not covered in the other notes sections.
This section is automatically inserted by the benchmark tools when there are errors present that prevent the result from being a valid reportable result.
Copyright © 2007-2010 Standard Performance Evaluation Corporation
All Rights Reserved