SPEC MPI2007 Result File Fields

Last updated: 11 Nov 2009 hf & tre, original document by cgp

ABSTRACT
This document describes the contents and arrangement of a SPEC MPI2007 result disclosure. Throughout, we refer to the arrangement of fields in the HTML format of a result disclosure, since the arrangement of result fields may differ between the text, HTML, and other report formats (CSV, PDF, PS, etc.). While the reports are formatted in a way that is intended to be self-explanatory, readers may want a formal statement of the meaning of a field, or may have technical questions about the information provided in the fields. Further, the SPEC website contains links from the fields of the published reports to their descriptions in this document.

The contents of the result reports are either generated by the run of the benchmarks, extracted from the configuration file that controls the building and running of the benchmarks, or provided in descriptive fields filled in by the tester. These follow conventions that are specified in the separate documents on the Run Rules, Config Files, and XML Flags Files. Reports published on the SPEC website have been peer-reviewed by the members of the SPEC/HPG committee and are expected to be correct in every detail.


(To check for possible updates to this document, please see http://www.spec.org/mpi2007/Docs/)

Abbreviated Contents

Selecting one of the following will take you to the detailed table of contents for that section or subsection:

1. SPEC MPI2007 Benchmarks

1.1 Benchmarks by suite

1.2 Benchmarks by language

2. Result and Configuration Summary

2.1 Top bar

2.2 Results table

2.3 System Summary

2.4 Benchmark Configuration

3. Hardware Description

3.1 Node Description(s)

3.2 Interconnect Description(s)

4. Compilation Description

4.1 Base & Peak Unknown Flags

4.2 Base & Peak Forbidden Flags

4.3 Base & Peak Compiler Invocation

4.4 Base & Peak Portability Flags

4.5 Base & Peak Optimization Flags

4.6 Base & Peak Other Flags

5. General Notes

6. Errors

Detailed Contents

1. SPEC MPI2007 Benchmarks

1.1 Benchmarks by suite

1.1.1 Benchmarks in the MPIM2007 suite

1.1.2 Benchmarks in the MPIL2007 suite

1.2 Benchmarks by language

1.2.1 C Benchmarks

1.2.2 C++ Benchmarks

1.2.3 Fortran Benchmarks

1.2.4 Benchmarks using both Fortran and C

2. Result and Configuration Summary

2.1 Top bar

2.1.1 MPIM2007 Result

2.1.2 System Vendor

2.1.3 System Name

2.1.4 SPECmpiM_peak2007

2.1.5 SPECmpiM_base2007

2.1.6 SPECmpiL_peak2007

2.1.7 SPECmpiL_base2007

2.1.8 MPI2007 license #

2.1.9 Hardware Availability

2.1.10 Software Availability

2.1.11 Test date

2.1.12 Test sponsor

2.1.13 Tested by

2.2 Results table

2.2.1 Benchmark

2.2.2 Ranks

2.2.3 Seconds

2.2.4 Ratio

2.2.5 Identifying the Median results

2.2.6 Significance of the run order

2.3 System Summary

2.3.1 Type of System

2.3.2 Compute Node

2.3.3 Interconnects

2.3.4 File Server Node

2.3.5 Head Node

2.3.6 Other Node

2.3.7 Total Compute Nodes

2.3.8 Total Chips

2.3.9 Total Cores

2.3.10 Total Threads

2.3.11 Total Memory

2.3.12 Maximum Ranks Run

2.3.13 Minimum Peak Ranks

2.3.14 Maximum Peak Ranks

2.4 Benchmark Configuration

2.4.1 C Compiler

2.4.2 C++ Compiler

2.4.3 Fortran Compiler

2.4.4 Auto Parallel

2.4.5 Base Pointers

2.4.6 Peak Pointers

2.4.7 MPI Library

2.4.8 Other MPI Information

2.4.9 Pre-processors

2.4.10 Other Software

3. Hardware Description

3.1 Node Description(s)

3.1.1 Hardware

3.1.1.1 Number of nodes

3.1.1.2 Uses of the Node

3.1.1.3 Vendor

3.1.1.4 Model

3.1.1.5 CPU Name

3.1.1.6 CPU(s) orderable

3.1.1.7 Chip(s)/CPU(s) enabled

3.1.1.8 Core(s) enabled

3.1.1.9 Cores per Chip

3.1.1.10 Threads per Core

3.1.1.11 CPU Characteristics

3.1.1.12 CPU MHz

3.1.1.13 Primary Cache

3.1.1.14 Secondary Cache

3.1.1.15 L3 Cache

3.1.1.16 Other Cache

3.1.1.17 Memory

3.1.1.18 Disk Subsystem

3.1.1.19 Other Hardware

3.1.1.20 Adapter Card(s)

3.1.1.20.1 Number of Adapters

3.1.1.20.2 Adapter Slot Type

3.1.1.20.3 Data Rate

3.1.1.20.4 Ports Used

3.1.1.20.5 Interconnect Type

3.1.2 Software

3.1.2.1 Adapter Driver

3.1.2.2 Firmware

3.1.2.3 Operating System

3.1.2.4 Local File System

3.1.2.5 Shared File System

3.1.2.6 System State

3.1.2.7 Other Software

3.1.3 Notes

3.2 Interconnect Description(s)

3.2.1 Vendor

3.2.2 Model

3.2.3 Switch Model(s)

3.2.3.1 Number of switches

3.2.3.2 Ports per switch

3.2.3.3 Data Rate

3.2.3.4 Firmware

3.2.4 Topology

3.2.5 Primary Use

3.2.6 Notes

4. Compilation Description

4.1 Base & Peak Unknown Flags

4.2 Base & Peak Forbidden Flags

4.3 Base & Peak Compiler Invocation

4.3.1 Notes

4.4 Base & Peak Portability Flags

4.4.1 C benchmarks

4.4.2 C++ benchmarks

4.4.3 Fortran benchmarks

4.4.4 Benchmarks using both Fortran and C

4.4.5 Notes

4.5 Base & Peak Optimization Flags

4.5.1 C benchmarks

4.5.2 C++ benchmarks

4.5.3 Fortran benchmarks

4.5.4 Benchmarks using both Fortran and C

4.5.5 Notes

4.6 Base & Peak Other Flags

4.6.1 C benchmarks

4.6.2 C++ benchmarks

4.6.3 Fortran benchmarks

4.6.4 Benchmarks using both Fortran and C

4.6.5 Notes

5. General Notes

6. Errors

1. SPEC MPI2007 Benchmarks

1.1 Benchmarks by suite

1.1.1 Benchmarks in the MPIM2007 suite

The MPIM2007 suite comprises 13 floating-point compute-intensive codes: 4 in Fortran, 2 in C, 1 in C++, and 6 that contain both Fortran and C.

  1. 104.milc (C)
  2. 107.leslie3d (Fortran)
  3. 113.GemsFDTD (Fortran)
  4. 115.fds4 (Fortran and C)
  5. 121.pop2 (Fortran and C)
  6. 122.tachyon (C)
  7. 126.lammps (C++)
  8. 127.wrf2 (Fortran and C)
  9. 128.GAPgeofem (Fortran and C)
  10. 129.tera_tf (Fortran)
  11. 130.socorro (Fortran and C)
  12. 132.zeusmp2 (Fortran and C)
  13. 137.lu (Fortran)

1.1.2 Benchmarks in the MPIL2007 suite

The MPIL2007 suite comprises 12 floating-point compute-intensive codes: 4 in Fortran, 3 in C, 1 in C++, and 4 that contain both Fortran and C.

  1. 121.pop2 (Fortran and C)
  2. 122.tachyon (C)
  3. 125.RAxML (C)
  4. 126.lammps (C++)
  5. 128.GAPgeofem (Fortran and C)
  6. 129.tera_tf (Fortran)
  7. 132.zeusmp2 (Fortran and C)
  8. 137.lu (Fortran)
  9. 142.dmilc (C)
  10. 143.dleslie (Fortran)
  11. 145.lGemsFDTD (Fortran)
  12. 147.l2wrf2 (Fortran and C)

1.2 Benchmarks by language

1.2.1 C Benchmarks

Two benchmarks in the MPIM2007 suite are written in C: 104.milc and 122.tachyon.

Three benchmarks in the MPIL2007 suite are written in C: 122.tachyon, 125.RAxML, and 142.dmilc.

1.2.2 C++ Benchmarks

One benchmark in the MPIM2007 and MPIL2007 suites is written in C++: 126.lammps.

1.2.3 Fortran Benchmarks

Four benchmarks in the MPIM2007 suite are written in Fortran: 107.leslie3d, 113.GemsFDTD, 129.tera_tf, and 137.lu.

Four benchmarks in the MPIL2007 suite are written in Fortran: 129.tera_tf, 137.lu, 143.dleslie, and 145.lGemsFDTD.

1.2.4 Benchmarks using both Fortran and C

Six benchmarks in the MPIM2007 suite are written using both Fortran and C: 115.fds4, 121.pop2, 127.wrf2, 128.GAPgeofem, 130.socorro, and 132.zeusmp2.

Four benchmarks in the MPIL2007 suite are written using both Fortran and C: 121.pop2, 128.GAPgeofem, 132.zeusmp2, and 147.l2wrf2.

2. Result and Configuration Summary

2.1 Top bar

More detailed information about metrics is in sections 4.3.1 and 4.3.2 of the MPI2007 Run and Reporting Rules.

2.1.1 MPIM2007 Result

This result is from the MPIM2007 suite, which uses the medium (mref) dataset.

2.1.2 System Vendor

The vendor of the system under test.

2.1.3 System Name

The name of the system under test.

2.1.4 SPECmpiM_peak2007

The geometric mean of thirteen normalized ratios (one for each benchmark) when the benchmarks are compiled with aggressive optimization.

More detailed information about this metric is in section 4.3.1 of the MPI2007 Run and Reporting Rules.

2.1.5 SPECmpiM_base2007

The geometric mean of thirteen normalized ratios (one for each benchmark) when the benchmarks are compiled with conservative optimization.

More detailed information about this metric is in section 4.3.1 of the MPI2007 Run and Reporting Rules.

2.1.6 SPECmpiL_peak2007

The geometric mean of twelve normalized ratios (one for each benchmark, run on the large/lref workloads) when the benchmarks are compiled with aggressive optimization.

More detailed information about this metric is in section 4.3.1 of the MPI2007 Run and Reporting Rules.

2.1.7 SPECmpiL_base2007

The geometric mean of twelve normalized ratios (computed from run times and reference times on the lref workloads) when the benchmarks are compiled with conservative optimization.

More detailed information about this metric is in section 4.3.1 of the MPI2007 Run and Reporting Rules.
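
All four metrics are computed the same way: the median run of each benchmark contributes one normalized ratio (see section 2.2.4), and the metric is the geometric mean of those ratios across the suite. The following Python sketch, with made-up ratio values, illustrates the arithmetic; it is not the SPEC tools' own code:

    # Illustrative only. Each ratio is reference_time / measured_time for
    # one benchmark's median run; the metric is their geometric mean.
    import math

    def geometric_mean(ratios):
        return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

    ratios = [4.02, 3.87, 5.10, 2.95, 4.40, 3.60, 4.75,
              3.10, 4.20, 3.90, 4.55, 3.35, 4.80]   # thirteen MPIM2007 ratios
    print("SPECmpiM_base2007 ~ %.2f" % geometric_mean(ratios))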

2.1.8 MPI2007 license #

The SPEC MPI2007 license number of the organization or individual that ran the result.

2.1.9 Hardware Availability

The date when all the hardware necessary to run the result is generally available. For example, if the CPU is available in Aug-2007, but the memory is not available until Oct-2007, then the hardware availability date is Oct-2007 (unless some other component pushes it out farther).
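
In other words, the hardware availability date is the latest availability date among the required components. A tiny Python sketch of that rule, with illustrative component dates:

    from datetime import date

    # Hypothetical component availability dates
    components = {"CPU": date(2007, 8, 1), "memory": date(2007, 10, 1)}
    print(max(components.values()).strftime("%b-%Y"))   # prints Oct-2007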

2.1.10 Software Availability

The date when all the software necessary to run the result is generally available. For example, if the operating system is available in Aug-2007, but the compiler or other libraries are not available until Oct-2007, then the software availability date is Oct-2007 (unless some other component pushes it out farther).

2.1.11 Test date

The date when the test is run. This value is supplied by the tester; the time reported by the system under test is recorded in the raw result file.

2.1.12 Test sponsor

The name of the organization or individual that sponsored the test. Generally, this is the name of the license holder.

2.1.13 Tested by

The name of the organization or individual that ran the test. If the tester has installations in multiple geographic locations, the location is sometimes also listed in this field.


2.2 Results table

In addition to the graph, the results of the individual benchmark runs are also presented in table form.

2.2.1 Benchmark

The names of the benchmarks making up this MPI2007 suite.

2.2.2 Ranks

This column indicates the number of MPI ranks (processes) that were used during the running of the benchmark.

2.2.3 Seconds

This is the amount of elapsed (wall-clock) time in seconds that the benchmark took to run, from job submission to job completion.

2.2.4 Ratio

This is the ratio of the benchmark's run time on the reference platform to its run time on the system under test, so a higher ratio means better performance. For example, a benchmark with a reference time of 1000 seconds that ran in 250 seconds earns a ratio of 4.00.

2.2.5 Identifying the Median results

For a reportable MPI2007 run, at least two iterations of each benchmark are run, and the median of the runs (lower of middle two, if even) is selected to be part of the overall metric. In output formats that support it, the medians in the result table are underlined in bold. The ".txt" report will mark each median score with an asterisk "*".
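
The selection rule itself is simple. A Python sketch, assuming the per-iteration ratios of a single benchmark (illustrative, not the SPEC tools' code):

    # Median pick: the middle value for an odd count; the lower of the
    # two middle values when the count is even.
    def median_pick(values):
        ordered = sorted(values)
        return ordered[(len(ordered) - 1) // 2]

    print(median_pick([4.10, 3.95, 4.02]))   # three iterations -> 4.02
    print(median_pick([4.10, 3.95]))         # two iterations -> 3.95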

2.2.6 Significance of the run order

Each iteration in the MPI2007 benchmark suite runs each benchmark once, in order. For example, given benchmarks "910.aaa", "920.bbb", and "930.ccc", here is what you might see as the benchmarks are run:

MPI2007

    Running (#1) 910.aaa ref base oct09a default
    Running (#1) 920.bbb ref base oct09a default
    Running (#1) 930.ccc ref base oct09a default
    Running (#2) 910.aaa ref base oct09a default
    Running (#2) 920.bbb ref base oct09a default
    Running (#2) 930.ccc ref base oct09a default
    Running (#3) 910.aaa ref base oct09a default
    Running (#3) 920.bbb ref base oct09a default
    Running (#3) 930.ccc ref base oct09a default

When you read the results table for a run, the results are listed in the order that they were run, in column-major order. In other words, if you are interested in the base scores as they were produced, start at the top of the first column and read down it, then read the middle column, then the right column.

If the benchmarks were run with both base and peak tuning, all base runs were completed before starting peak.
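
The log shown above follows directly from that nesting of iterations and benchmarks. A Python sketch that reproduces it, using the same placeholder benchmark names:

    benchmarks = ["910.aaa", "920.bbb", "930.ccc"]
    for iteration in (1, 2, 3):          # outer loop: iterations
        for bm in benchmarks:            # inner loop: benchmarks, in order
            print("Running (#%d) %s ref base oct09a default" % (iteration, bm))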


2.3 System Summary

Collective hardware details across the whole system. Run rules relating to these items can be found in section 4.2 of the MPI2007 Run and Reporting Rules.

2.3.1 Type of System

Description of the system being benchmarked: SMP, Homogeneous Cluster or Heterogeneous Cluster.

2.3.2 Compute Node

These systems are used as compute nodes during the run of the benchmark.

2.3.3 Interconnects

These devices are used for the interconnects during the benchmark run.

2.3.4 File Server Node

These systems are used as the file server for the benchmark run.

2.3.5 Head Node

This system is the head node, the node that is used as the lead for the benchmark run.

2.3.6 Other Node

This system is part of the entire configuration and has a purpose other than those previously mentioned.

2.3.7 Total Compute Nodes

Number of compute nodes used to execute the benchmark.

2.3.8 Total Chips

The total number of chips in the compute nodes available to execute the benchmark.

2.3.9 Total Cores

The total number of cores in the compute nodes available to execute the benchmark.

2.3.10 Total Threads

The total number of threads in the compute nodes available to execute the benchmark.

2.3.11 Total Memory

The total amount of memory in all of the compute nodes available to execute the benchmark.

2.3.12 Maximum Ranks Run

The number of MPI ranks used to execute the benchmarks in the base optimization runs.

2.3.13 Minimum Peak Ranks

The smallest number of ranks used to execute the benchmark runs using peak optimizations.

2.3.14 Maximum Peak Ranks

The largest number of ranks used to execute the peak version of the benchmark.


2.4 Benchmark Configuration

Information on how the benchmark binaries are constructed. Run rules relating to these items can be found in section 4.2 of the MPI2007 Run and Reporting Rules.

2.4.1 C Compiler

The names and versions of C compilers, preprocessors, and performance libraries used to generate the result.

2.4.2 C++ Compiler

The names and versions of C++ compilers, preprocessors, and performance libraries used to generate the result.

2.4.3 Fortran Compiler

The names and versions of Fortran compilers, preprocessors, and performance libraries used to generate the result.

2.4.4 Auto Parallel

Indicates whether the benchmarks were built with compiler flags that perform automatic parallelization.

2.4.5 Base Pointers

Indicates whether all the benchmarks in base used 32-bit pointers, 64-bit pointers, or a mixture. For example, if the C and C++ benchmarks used 32-bit pointers, and the Fortran benchmarks used 64-bit pointers, then "32/64-bit" would be reported here.

2.4.6 Peak Pointers

Indicates whether all the benchmarks in peak used 32-bit pointers, 64-bit pointers, or a mixture.

2.4.7 MPI Library

The names and versions of MPI Libraries used to generate the result.

2.4.8 Other MPI Information

Any performance-relevant MPI information used to generate the result.

2.4.9 Pre-processors

Any performance-relevant pre-processors that were used to generate the result.

2.4.10 Other Software

Any performance-relevant non-compiler software used, including third-party libraries, accelerators, etc.

3. Hardware Description

SPEC MPI2007 is capable of running on large heterogeneous clusters containing different kinds of nodes linked by different kinds of interconnects. The report format contains a separate section for each kind of node and each kind of interconnect. Section 4.2 of the Run Rules document describes what information is to be provided by the tester.

For example, an SMP will consist of one node and no interconnect. Homogeneous cluster systems will typically consist of one kind of compute node and one or two kinds of interconnects. There will also often be a file server node. It is possible that the node and interconnect components are available from their respective vendors but no vendor sells the configured system as a whole; in this case the report is intended to provide enough detail to reconstruct an equivalent system with equivalent performance.

3.1 Node Description(s)

Description of the hardware and software configuration of the node.

3.1.1 Hardware

Hardware configuration of the node.

3.1.1.1 Number of nodes

The number of nodes of this type in the system.

3.1.1.2 Uses of the Node

The purpose of this type of node: compute node, file server, head node, etc.

3.1.1.3 Vendor

The manufacturer of this kind of node.

3.1.1.4 Model

The model name of this kind of node.

3.1.1.5 CPU Name

A manufacturer-determined formal name of the processor used in this node type.

3.1.1.6 CPU(s) orderable

The number of CPUs that can be ordered in this kind of node.

3.1.1.7 Chip(s)/CPU(s) enabled

The number of Chips (CPUs) that were enabled and active in the node during the benchmark run.

3.1.1.8 Core(s) enabled

The number of cores that were enabled and active in the node during the benchmark run.

3.1.1.9 Cores per Chip

The number of cores in each chip that were enabled and active in the node during the benchmark run.

3.1.1.10 Threads per Core

The number of threads in each core that were enabled and active in the node during the benchmark run.

3.1.1.11 CPU Characteristics

Technical characteristics to help identify the processor type used in the node.

3.1.1.12 CPU MHz

The clock frequency of the CPU used in the node, expressed in megahertz.

3.1.1.13 Primary Cache

Description (size and organization) of the CPU's primary cache. This cache is also referred to as "L1 cache".

3.1.1.14 Secondary Cache

Description (size and organization) of the CPU's secondary cache. This cache is also referred to as "L2 cache".

3.1.1.15 L3 Cache

Description (size and organization) of the CPU's tertiary, or "Level 3" cache.

3.1.1.16 Other Cache

Description (size and organization) of any other levels of cache memory.

3.1.1.17 Memory

Description of the system main memory configuration. End-user options that affect performance, such as the arrangement of memory modules, interleaving, latency, etc., are documented here.

3.1.1.18 Disk Subsystem

A description of the disk subsystem (size, type, and RAID level if any) of the storage used to hold the benchmark tree during the run.

3.1.1.19 Other Hardware

Any additional equipment added to improve performance.

3.1.1.20 Adapter Card(s)

There will be one of these groups of entries for each network adapter (also known as a Host Channel Adapter, HCA, or Network Interface Card, NIC) used to connect to an interconnect that carries MPI or file server traffic.

This field contains this adapter's vendor and model name.

3.1.1.20.1 Number of Adapters

How many of these adapters attach to the node.

3.1.1.20.2 Adapter Slot Type

The type of slot used to attach the adapter card to the node.

3.1.1.20.3 Data Rate

The per-port, nominal data transfer rate of the adapter.

3.1.1.20.4 Ports Used

The number of the adapter's ports used during the benchmark run (relevant especially for adapters that have multiple ports available).

3.1.1.20.5 Interconnect Type

In general terms, the type of interconnect (Ethernet, InfiniBand, etc.) attached to this adapter.

3.1.2 Software

Software configuration of the node.

3.1.2.1 Adapter Driver

The driver type and level for this adapter.

3.1.2.2 Firmware

The firmware type and level for this device.

3.1.2.3 Operating System

The operating system name and version. If there are patches applied that affect performance, they must be disclosed in the Notes.

3.1.2.4 Local File System

The type of the file system local to each compute node.

3.1.2.5 Shared File System

The type of the file system used to contain the run directories.

3.1.2.6 System State

The state (sometimes called "run level") of the system while the benchmarks were being run. Generally, this is "single user", "multi-user", "default", etc.

3.1.2.7 Other Software

Any performance-relevant non-compiler software used, including third-party libraries, accelerators, etc.

3.1.3 Notes

Free-form notes about other hardware or software details of the node, such as changes to the default operating system state, changes to the default hardware state of the node, or other tuning information.


3.2 Interconnect Description(s)

Description of the configuration of the interconnect.

3.2.1 Vendor

The manufacturer(s) of this interconnect.

3.2.2 Model

The model name(s) of the interconnect as a whole, or components of it -- not including the switch model, which is the next field.

3.2.3 Switch Model(s)

The model and manufacturer of the switching element(s) of this interconnect. There may be more than one kind declared.

3.2.3.1 Number of switches

The number of switches of this type in the interconnect.

3.2.3.2 Ports per switch

The number of ports per switch available for carrying the type of traffic noted in the "Primary Use" field.

3.2.3.3 Data Rate

The per-port, nominal data transfer rate of the switch.

3.2.3.4 Firmware

The firmware type and level for the switch(es).

3.2.4 Topology

Description of the arrangement of switches and links in the interconnect.

3.2.5 Primary Use

The kind of data traffic carried by the interconnect: MPI, file server, etc.

3.2.6 Notes

Free-form notes about other hardware or software details of the interconnect.

4. Compilation Description

This section describes how the benchmarks are compiled. The HTML and PDF reports contain links from the listed settings to the descriptions of those settings in the XML flags file report.

Much of this information is derived from compilation rules written into the config file and interpreted according to rules specified in the XML flags file; free-form notes can be added to supplement it. A section appears only if the corresponding flags, such as peak optimization flags, are actually used; otherwise it is omitted. Section 2 of the MPI2007 Run and Reporting Rules document gives rules on how these items may be used in reportable runs.

4.1 Base & Peak Unknown Flags

This section lists flags, used in the base or peak compiles, that were not recognized by the report-generation tools. Results with unknown flags are marked "invalid" and may not be published.

Usually this means that the flagsurl parameter was not set correctly, or that descriptions need to be added to the XML flags file. The "invalid" marking may be removed by reformatting the result using a flags file that describes all of the unknown flags.
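
For illustration, a flags file is typically attached by setting flagsurl in the config file used for the run; the URL below is a placeholder, and the exact parameter spelling should be checked against the Config Files documentation for your tools version:

    # In the config file (illustrative URL):
    flagsurl = http://www.spec.org/mpi2007/flags/EXAMPLE-platform.xml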


4.2 Base & Peak Forbidden Flags

This section lists flags, used in the base or peak compiles, that are designated as Forbidden in the XML flags file for the benchmark or the platform. Results with forbidden flags are marked "invalid" and may not be published.


4.3 Base & Peak Compiler Invocation

This section describes how the compilers are invoked: whether any special paths had to be used, which flags were passed, and so on.

4.3.1 Notes

Additional free-form notes explaining the compiler usage.


4.4 Base & Peak Portability Flags

This section describes the portability settings that are used to build the benchmarks. Optimization settings are not listed here.

4.4.1 C Benchmarks

Portability settings specific to the benchmarks listed.

4.4.2 C++ Benchmarks

Portability settings specific to the benchmarks listed.

4.4.3 Fortran Benchmarks

Portability settings specific to the benchmarks listed.

4.4.4 Benchmarks using both Fortran and C

Portability settings specific to the benchmarks listed.

4.4.5 Notes

Additional free-form notes explaining the portability settings.


4.5 Base & Peak Optimization Flags

This section describes the optimization settings that are used to build the benchmark binaries for the base and peak runs.

4.5.1 C Benchmarks

Optimization settings specific to the C benchmarks.

4.5.2 C++ Benchmarks

Optimization settings specific to the C++ benchmarks.

4.5.3 Fortran Benchmarks

Optimization settings specific to the Fortran benchmarks.

4.5.4 Benchmarks using both Fortran and C

Optimization settings specific to the mixed-language benchmarks.

4.5.5 Notes

Additional free-form notes explaining the optimization settings.


4.6 Base & Peak Other Flags

This section describes the other settings that are used to build or run the benchmark binaries for the base and peak runs. These are classified as being neither portability nor optimization settings.

4.6.1 C Benchmarks

Settings specific to the C benchmarks.

4.6.2 C++ Benchmarks

Settings specific to the C++ benchmarks.

4.6.3 Fortran Benchmarks

Settings specific to the Fortran benchmarks.

4.6.4 Benchmarks using both Fortran and C

Settings specific to the mixed-language benchmarks.

4.6.5 Notes

Additional free-form notes explaining the other settings.

5. General Notes

This section is where the tester provides notes about things not covered in the other notes sections.


6. Errors

This section is automatically inserted by the benchmark tools when there are errors present that prevent the result from being a valid reportable result.


Copyright © 2007-2010 Standard Performance Evaluation Corporation
All Rights Reserved