SPEC® MPIM2007 Result

Copyright 2006-2010 Standard Performance Evaluation Corporation

Intel Corporation

Endeavor (Intel Xeon X5560, 2.80 GHz,
DDR3-1333 MHz, SMT off, Turbo on)

SPECmpiM_peak2007 = Not Run

MPI2007 license: 13
Test date: Jul-2009
Test sponsor: Intel Corporation
Hardware Availability: Jun-2009
Tested by: Pavel Shelepugin
Software Availability: Jun-2009

Results Table

Results appear in the order in which they were run. Bold underlined text indicates a median measurement.
Peak: Not Run (the peak columns are empty).

                        Base
Benchmark       Ranks   Seconds  Ratio   Seconds  Ratio
104.milc          256      45.9   34.1      42.3   37.0
107.leslie3d      256     137     38.2     137     38.1
113.GemsFDTD      256     301     21.0     300     21.0
115.fds4          256      59.0   33.1      59.4   32.8
121.pop2          256     165     25.0     164     25.1
122.tachyon       256      90.6   30.9      90.9   30.8
126.lammps        256     139     20.9     138     21.1
127.wrf2          256     107     72.9     109     71.4
128.GAPgeofem     256      45.7   45.2      46.2   44.7
129.tera_tf       256      89.8   30.8      89.9   30.8
130.socorro       256      89.8   42.5      89.8   42.5
132.zeusmp2       256      73.6   42.2      73.8   42.1
137.lu            256      59.7   61.6      59.8   61.4
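
The per-benchmark ratios above can be summarized as a single figure. The sketch below takes the geometric mean of one ratio per benchmark. Picking the lower of the two recorded ratios is an assumption made here for illustration, not a rule stated in this report, so the printed value should not be read as the official SPECmpiM_base2007 score. The names `ratios`, `geometric_mean`, and `selected` are illustrative.

```python
import math

# (first run ratio, second run ratio) per benchmark, copied from the Results Table.
ratios = {
    "104.milc": (34.1, 37.0),
    "107.leslie3d": (38.2, 38.1),
    "113.GemsFDTD": (21.0, 21.0),
    "115.fds4": (33.1, 32.8),
    "121.pop2": (25.0, 25.1),
    "122.tachyon": (30.9, 30.8),
    "126.lammps": (20.9, 21.1),
    "127.wrf2": (72.9, 71.4),
    "128.GAPgeofem": (45.2, 44.7),
    "129.tera_tf": (30.8, 30.8),
    "130.socorro": (42.5, 42.5),
    "132.zeusmp2": (42.2, 42.1),
    "137.lu": (61.6, 61.4),
}


def geometric_mean(values):
    """Return the geometric mean of a sequence of positive numbers."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Assumption: keep the lower (slower) of the two ratios for each benchmark.
selected = [min(pair) for pair in ratios.values()]
overall = geometric_mean(selected)
print(f"geometric mean of selected ratios: {overall:.1f}")
```

A geometric mean is used rather than an arithmetic one so that a single high ratio, such as the one for 127.wrf2, does not dominate the summary.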
Hardware Summary
Type of System: Homogeneous
Compute Node: Endeavor Node
Interconnect: IB Switch
File Server Node: LFS
Total Compute Nodes: 32
Total Chips: 64
Total Cores: 256
Total Threads: 256
Total Memory: 768 GB
Base Ranks Run: 256
Minimum Peak Ranks: --
Maximum Peak Ranks: --
Software Summary
C Compiler: Intel C++ Compiler 11.1 for Linux
C++ Compiler: Intel C++ Compiler 11.1 for Linux
Fortran Compiler: Intel Fortran Compiler 11.1 for Linux
Base Pointers: 64-bit
Peak Pointers: 64-bit
MPI Library: Intel MPI Library 3.2 for Linux
Other MPI Info: None
Pre-processors: No
Other Software: None

Node Description: Endeavor Node

Hardware
Number of nodes: 32
Uses of the node: compute
Vendor: Intel
Model: SR1600UR
CPU Name: Intel Xeon X5560
CPU(s) orderable: 1-2 chips
Chips enabled: 2
Cores enabled: 8
Cores per chip: 4
Threads per core: 1
CPU Characteristics: Intel Turbo Boost Technology up to 3.2 GHz,
6.4 GT/s QPI, Hyper-Threading disabled
CPU MHz: 2800
Primary Cache: 32 KB I + 32 KB D on chip per core
Secondary Cache: 256 KB I+D on chip per core
L3 Cache: 8 MB I+D on chip per chip, 8 MB shared / 4 cores
Other Cache: None
Memory: 24 GB (RDIMM 6x4-GB DDR3-1333 MHz)
Disk Subsystem: Seagate 400 GB ST3400755SS
Other Hardware: None
Adapter: Mellanox MHQH29-XTC
Number of Adapters: 1
Slot Type: PCIe x8 Gen2
Data Rate: InfiniBand 4x QDR
Ports Used: 1
Interconnect Type: InfiniBand
Software
Adapter: Mellanox MHQH29-XTC
Adapter Driver: OFED 1.3.1
Adapter Firmware: 2.6.000
Operating System: Red Hat EL 5.2, kernel 2.6.18-128
Local File System: Linux/ext2
Shared File System: Lustre FS
System State: Multi-User
Other Software: PBS Pro 8.0
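
The totals in the Hardware Summary follow from the per-node figures in this node description. A minimal sketch of that arithmetic, with variable names chosen here for illustration:

```python
# Per-node figures from the "Endeavor Node" description.
nodes = 32
chips_per_node = 2
cores_per_chip = 4
threads_per_core = 1      # Hyper-Threading disabled
memory_per_node_gb = 24   # 6 x 4-GB RDIMMs

total_chips = nodes * chips_per_node
total_cores = total_chips * cores_per_chip
total_threads = total_cores * threads_per_core
total_memory_gb = nodes * memory_per_node_gb

# Compare with the Hardware Summary: 64 chips, 256 cores, 256 threads, 768 GB.
print(total_chips, total_cores, total_threads, total_memory_gb)
```

With one thread per core, the 256 base ranks map to one rank per core if every core of the 32 compute nodes is used.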

Node Description: LFS

Hardware
Number of nodes: 8
Uses of the node: fileserver
Vendor: Intel
Model: SR1560SF
CPU Name: Intel Xeon E5462
CPU(s) orderable: 1-2 chips
Chips enabled: 2
Cores enabled: 8
Cores per chip: 4
Threads per core: 1
CPU Characteristics: 1600 MHz FSB
CPU MHz: 2800
Primary Cache: 32 KB I + 32 KB D on chip per core
Secondary Cache: 12 MB I+D on chip per chip, 6 MB shared / 2 cores
L3 Cache: None
Other Cache: None
Memory: 16 GB DDR2 16x1-GB 667 MHz
Disk Subsystem: Seagate 250 GB
Other Hardware: connected to DDN storage (see General Notes)
Adapter: Mellanox MHGH28-XTC
Number of Adapters: 1
Slot Type: PCIe x8 Gen2
Data Rate: InfiniBand 4x DDR
Ports Used: 1
Interconnect Type: InfiniBand
Software
Adapter: Mellanox MHGH28-XTC
Adapter Driver: OFED 1.3.1
Adapter Firmware: 2.6.000
Operating System: Red Hat EL 5.2, kernel 2.6.18-53
Local File System: None
Shared File System: Lustre FS
System State: Multi-User
Other Software: None

Interconnect Description: IB Switch

Hardware
Vendor: Mellanox
Model: Mellanox MTS3600Q-1UNC
Switch Model: Mellanox MTS3600Q-1UNC
Number of Switches: 46
Number of Ports: 36
Data Rate: InfiniBand 4x QDR
Firmware: 7.1.000
Topology: Fat tree
Primary Use: MPI traffic, FS traffic

Submit Notes

The config file option 'submit' was used.

General Notes

 MPI startup command:
   The mpiexec command was used to start MPI jobs. This command uses
   an independent ring of mpd daemons, which is started beforehand with the
   mpdboot command. mpdboot was launched only once, and the resulting
   ring of daemons was used for every iteration of each SPEC MPI component.
   The startup and tear-down time of the daemons was therefore not included
   in the elapsed time and was not taken into account when calculating
   the ratio.

 BIOS settings:
   Intel Hyper-Threading Technology (SMT): Disabled (default is Enabled)
   Intel Turbo Boost Technology (Turbo)  : Enabled (default is Enabled)

 RAM configuration:
   Compute nodes have 1x4-GB RDIMM on each memory channel.

 Network:
   Forty-six 36-port switches: 18 core switches and 28 leaf switches.
   Each leaf has one link to each core. The remaining 18 ports on 25 of the
   28 leaves are used for compute nodes. On the other 3 leaves, the ports
   are used for FS nodes and other peripherals.
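
The port counts in the Network note can be checked with a short calculation. This is a sketch of the arithmetic only; the variable names are illustrative and nothing here models actual routing.

```python
# Figures from the Network note and the IB Switch description.
ports_per_switch = 36
core_switches = 18
leaf_switches = 28
compute_leaves = 25

total_switches = core_switches + leaf_switches

# Each leaf has one link to each core switch, so 18 ports per leaf are uplinks.
uplinks_per_leaf = core_switches
downlinks_per_leaf = ports_per_switch - uplinks_per_leaf

# Ports available for compute nodes across the 25 compute leaves.
compute_ports = compute_leaves * downlinks_per_leaf

# Each core switch has one link per leaf, so it uses 28 of its 36 ports.
core_ports_used = leaf_switches

print(total_switches, downlinks_per_leaf, compute_ports, core_ports_used)
```

Each leaf ends up with as many uplinks as node-facing ports (18 and 18), which matches the fat-tree topology listed in the switch description.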

 Job placement:
   Each MPI job was assigned to a topologically compact set of nodes, i.e.
   the minimal needed number of leaf switches was used for each job: 1 switch
   for 16/32/64/128 ranks, 2 switches for 256 ranks, 4 switches for 512 ranks.
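
The switch counts quoted above can be reproduced from two figures given elsewhere in this report: 8 cores per compute node and 18 compute ports per leaf switch. The sketch below assumes one rank per core and fully populated nodes; `min_leaf_switches` is a hypothetical helper, not part of any scheduler.

```python
import math

CORES_PER_NODE = 8    # from the compute node description
NODES_PER_LEAF = 18   # compute ports per leaf switch, from the Network note


def min_leaf_switches(ranks):
    """Smallest number of leaf switches that can host `ranks` MPI ranks.

    Assumes one rank per core and that nodes are filled completely.
    """
    nodes = math.ceil(ranks / CORES_PER_NODE)
    return math.ceil(nodes / NODES_PER_LEAF)


for ranks in (16, 32, 64, 128, 256, 512):
    print(ranks, min_leaf_switches(ranks))
```

For the 256-rank run reported here, this gives 32 nodes spread over 2 leaf switches.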

 Fileserver:
   Intel SR1560SF systems connected via IB to DataDirect Networks S2A9900
   storage: 160 disks at 300 GB each, 48 TB total, 35 TB available.

 PBS Pro was used for job submission. It has no impact on performance.
   It is available from: http://www.altair.com

 Lustre File System 1.6.6 was used. Download from:
   http://www.sun.com/software/products/lustre

Base Compiler Invocation

C benchmarks:

 mpiicc 

C++ benchmarks:

126.lammps:  mpiicpc 

Fortran benchmarks:

 mpiifort 

Benchmarks using both Fortran and C:

 mpiicc   mpiifort 

Base Portability Flags

121.pop2:  -DSPEC_MPI_CASE_FLAG 
126.lammps:  -DMPICH_IGNORE_CXX_SEEK 
127.wrf2:  -DSPEC_MPI_CASE_FLAG   -DSPEC_MPI_LINUX 

Base Optimization Flags

C benchmarks:

 -O3   -xSSE4.2   -no-prec-div 

C++ benchmarks:

126.lammps:  -O3   -xSSE4.2   -no-prec-div 

Fortran benchmarks:

 -O3   -xSSE4.2   -no-prec-div 

Benchmarks using both Fortran and C:

 -O3   -xSSE4.2   -no-prec-div 
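
To show how the invocation, portability, and optimization sections fit together, the sketch below assembles the start of a compile command for one benchmark. The helper `compile_prefix` and the ordering of the flags are assumptions for illustration; the SPEC tools build the real command lines, and only 126.lammps is used here because it is the one benchmark whose language this report states.

```python
# Values copied from the Base Compiler Invocation, Base Portability Flags,
# and Base Optimization Flags sections.
COMPILERS = {"C": "mpiicc", "C++": "mpiicpc", "Fortran": "mpiifort"}
BASE_OPT_FLAGS = ["-O3", "-xSSE4.2", "-no-prec-div"]
PORTABILITY_FLAGS = {
    "121.pop2": ["-DSPEC_MPI_CASE_FLAG"],
    "126.lammps": ["-DMPICH_IGNORE_CXX_SEEK"],
    "127.wrf2": ["-DSPEC_MPI_CASE_FLAG", "-DSPEC_MPI_LINUX"],
}


def compile_prefix(benchmark, language):
    """Join compiler, optimization flags, and any portability flags.

    Hypothetical helper: the flag order is an assumption, not SPEC's.
    """
    parts = [COMPILERS[language], *BASE_OPT_FLAGS]
    parts += PORTABILITY_FLAGS.get(benchmark, [])
    return " ".join(parts)


print(compile_prefix("126.lammps", "C++"))
```

The same optimization flags apply to every language in the base run, so only the compiler wrapper and the per-benchmark portability flags vary.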

The flags file that was used to format this result can be browsed at
http://www.spec.org/mpi2007/flags/EM64T_Intel111_flags.html.

You can also download the XML flags source by saving the following link:
http://www.spec.org/mpi2007/flags/EM64T_Intel111_flags.xml.