1.0 Introduction
1.1 Philosophy
1.2 Fair Use of SPECvirt_sc2010 Results
1.3 Research and Academic Usage
1.4 Caveat
1.5 Definitions
2.0 Running the SPECvirt_sc2010 Benchmark
2.1 Environment
2.1.1 Testbed Configuration
2.1.2 System Under Test (SUT)
2.1.3 Power
2.2 Workload VMs
2.3 Measurement
2.3.1 Quality of Service
2.3.2 Benchmark Parameters
2.3.3 Running SPECvirt_sc2010 Workloads
2.3.4 Power Measurement
2.3.5 Client Polling Requirements
3.0 Reporting Results
3.1 Metrics and Reference Format
3.2 Testbed Configuration
3.2.1 SUT Hardware
3.2.1.1 SUT Stable Storage
3.2.2 SUT Software
3.2.3 Network Configuration
3.2.4 Clients
3.2.5 General Availability Dates
3.2.6 Rules on the Use of Open Source Applications
3.2.7 Test Sponsor
3.2.8 Notes
4.0 Submission Requirements for SPECvirt_sc2010
4.1 SUT Configuration Collection
4.2 Guest Configuration Collection
4.3 Client Configuration Collection
4.4 Configuration Collection Archive Format
5.0 The SPECvirt_sc2010 Benchmark Kit
Appendix A. Run Rules References
SPECvirt_sc2010 is
the first-generation SPEC benchmark for evaluating the performance of
datacenter servers used for virtualized server consolidation. The
benchmark also provides options for measuring and reporting power and
performance/power metrics. This document specifies how SPECvirt_sc2010 is to be
run for measuring and publicly reporting performance and power results. These
rules abide by the norms laid down by the SPEC Virtualization Subcommittee and
approved by the SPEC Open Systems Steering Committee. This ensures that results
generated with this suite are meaningful, comparable to other generated
results, and are repeatable with sufficient documentation covering factors
pertinent to duplicating the results.
Per the SPEC license
agreement, all results publicly disclosed must adhere to these Run and
Reporting Rules.
The general
philosophy behind the rules of SPECvirt_sc2010 is to ensure that an independent
party can reproduce the reported results.
The following
attributes are expected:
Furthermore, SPEC
expects that any public use of results from this benchmark suite shall be for
System Under Test (SUT) and configurations that are appropriate for public
consumption and comparison. Thus, it is also expected that:
SPEC requires that
any public use of results from this benchmark follow the SPEC Fair Use Rule and those
specific to this benchmark (see the Fair Use section below).
In the case where it appears that these guidelines have not been adhered to,
SPEC may investigate and request that the published material be corrected.
Consistency and fairness are guiding principles for SPEC. To
help assure that these principles are met, any organization or individual who
makes public use of SPEC benchmark results must do so in accordance with the
SPEC Fair Use Rule, as posted at http://www.spec.org/fairuse.html. Fair-use
clauses specific to SPECvirt_sc2010 are covered in http://www.spec.org/fairuse.html#SPECvirt_sc2010.
Please consult the SPEC Fair Use
Rule on Research and Academic Usage at http://www.spec.org/fairuse.html#Academic.
SPEC reserves the
right to adapt the benchmark codes, workloads, and rules of SPECvirt_sc2010 as
deemed necessary to preserve the goal of fair benchmarking. SPEC will notify
members and licensees whenever it makes changes to this document and will
rename the metrics if the results are no longer comparable.
Relevant standards
are cited in these run rules as URL references, and are current as of the date
of publication. Changes or updates to these referenced documents or URLs may
necessitate repairs to the links and/or amendment of the run rules. The most
current run rules will be available at the SPEC Web site at http://www.spec.org/virt_sc2010.
SPEC will notify members and licensees whenever it makes changes to the suite.
In a virtualized
environment, the definitions of commonly-used terms can have multiple or
different meanings. To avoid ambiguity, this section attempts to define terms
that are used throughout this document:
For further
definition or explanation of workload-specific terms, refer to the respective
documents of the original benchmarks.
These requirements
apply to all hardware and software components used in producing the benchmark
result, including the System under Test (SUT), network, and clients.
For a run to be
valid, the following attributes must hold true:
This section outlines
some of the environmental and other electrical requirements related to power
measurement while running the benchmark. Note that power measurement is
optional, so this section only applies to results with power in the
Performance/Power categories.
To produce a compliant result for either Performance/Power of the Total
System Under Test (SPECvirt_sc2010_PPW) or Performance/Power of the Server
only (SPECvirt_sc2010_ServerPPW), the following requirements must be met in
addition to the environmental and electrical requirements described in this
section.
The preferred Line
Voltage source used for measurements is the main AC power as provided by local
utility companies. Power generated from other sources often contains unwanted
harmonics that many power analyzers cannot measure correctly, and thus would
produce inaccurate results.
The usage of an
uninterruptible power source (UPS) as the line voltage source is allowed, but
the voltage output must be a pure sine-wave. For placement of the UPS, see Power Analyzer Setup below. This usage must be
specified in the Notes section of the report.
Systems that are designed to be able to run normal operations without an
external source of power cannot be used to produce valid results. Some examples
of disallowed systems are notebook computers, hand-held computers/communication
devices and servers that are designed to frequently operate on integrated
batteries without external power.
Systems with batteries intended to preserve operations during a temporary
lapse of external power, or to maintain data integrity during an orderly
shutdown when power is lost, can be used to produce valid benchmark results.
For SUT components that have an integrated battery, the battery must be fully
charged at the end of the measurement interval, or
proof must be provided that it is charged at least to the level of charge at
the beginning of the interval.
Note that integrated
batteries that are intended to maintain such things as durable cache in a storage
controller can be assumed to remain fully charged. The above paragraph is
intended to address “system” batteries that can provide primary power for the
SUT.
DC line voltage
sources are currently not supported.
For situations in which the appropriate voltages are not provided by local
utility companies (e.g. measuring a server in the United States which is
configured for European markets, or measuring a server in a location where the
local utility line voltage does not meet the required characteristics), an AC
power source may be used, and the power source must be specified in the notes
section of the disclosure report. In such situation the following requirements
must be met, and the relevant measurements or power source specifications
disclosed in the general notes section of the disclosure report:
SPEC requires that
power measurements be taken in an environment representative of the majority of
usage environments. The intent is to discourage extreme environments that may
artificially impact power consumption or performance of the server.
The following
environmental conditions must be met:
The power analyzer
must be located between the AC Line Voltage Source and the SUT. No other active
components are allowed between the AC Line Voltage Source and the SUT. If the
SUT consists of several discrete parts (server and storage), separate power
analyzers may be required.
Power analyzer
configuration settings that are set by the SPEC PTDaemon must not be manually
overridden.
Power Analyzer Specifications
To ensure
comparability and repeatability of power measurements, SPEC requires the
following attributes for the power measurement device used during the
benchmark. Please note that a power analyzer may meet these requirements when
used in some power ranges but not in others, due to the dynamic nature of power
analyzer Accuracy and Crest Factor.
For example:
An analyzer with a vendor-specified uncertainty of +/- 0.5% of reading
+/- 4 digits, used in a test with a maximum wattage value of 200W, would have
an "overall" uncertainty of ((0.5% * 200W) + 0.4W) / 200W = 1.4W / 200W, or
0.7% at 200W.
An analyzer with a wattage range of 20-400W and a vendor-specified
uncertainty of +/- 0.25% of range +/- 4 digits, used in a test with a
maximum wattage value of 200W, would have an "overall" uncertainty of
((0.25% * 400W) + 0.4W) / 200W = 1.4W / 200W, or 0.7% at 200W.
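For clarity, the arithmetic behind these two examples can be expressed as a
short sketch. The class and method names below are illustrative only and are
not part of the benchmark kit.

    // Sketch of the "overall" uncertainty arithmetic used in the two
    // examples above. Names and structure are illustrative.
    public class AnalyzerUncertainty {

        // Uncertainty specified as a percentage of the reading plus a fixed
        // digit term (expressed here in watts), e.g. +/-0.5% of reading +/-0.4W.
        static double ofReading(double readingW, double pctOfReading, double digitsW) {
            return (pctOfReading / 100.0 * readingW + digitsW) / readingW;
        }

        // Uncertainty specified as a percentage of the selected range plus a
        // fixed digit term, e.g. +/-0.25% of a 400W range +/-0.4W.
        static double ofRange(double readingW, double rangeW, double pctOfRange, double digitsW) {
            return (pctOfRange / 100.0 * rangeW + digitsW) / readingW;
        }

        public static void main(String[] args) {
            // Example 1: 0.5% of reading + 0.4W at a 200W reading -> 0.70%
            System.out.printf("Example 1: %.2f%%%n", 100 * ofReading(200, 0.5, 0.4));
            // Example 2: 0.25% of a 400W range + 0.4W at a 200W reading -> 0.70%
            System.out.printf("Example 2: %.2f%%%n", 100 * ofRange(200, 400, 0.25, 0.4));
        }
    }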
Temperature Sensor Specifications
Temperature must be
measured no more than 50mm in front of (upwind of) the main airflow inlet of
the server.
To ensure comparability and repeatability of temperature measurements, SPEC
requires the following attributes for the temperature measurement device used
during the benchmark:
Supported and Compliant Devices
See the Device List
for a list of currently supported (by the benchmark software) and compliant (in
specifications) power analyzers and temperature sensors.
A tile is a single unit of work that comprises six distinct virtual
machines and supports all the component
workloads. Additional tiles are used to scale the benchmark. The
last tile may be configured as a "fractional tile", which means a
"load scale factor" of less than 1.0 is applied to all of the VMs
within that tile. If used, the load scale factor must be between 0.1 and
0.9 in 0.1 increments (e.g. 0.25 would not be allowed). Each VM is required to
be a distinct entity; for example, you cannot run the application server and
the database on the same VM. The following block diagram shows the tile
architecture and the virtual machine/hypervisor/driver relationships:
Figure 1. Tile block diagram
Note that there are more
virtual machines than client drivers; this is because the Infrastructure Server
and Database Server VMs do not interact directly with the client.
Specifically, the Web Server VM must access parts of its fileset and the
backend simulator (BeSim) via inter-VM communication to the Infrastructure
Server. Similarly, the Application Server VM accesses the Database Server
VM via inter-VM communication.
The operating systems may vary between
virtual machines within a tile. All specific workload VMs (guest OS type
and application software) across all tiles must be identical, including
fractional tiles. Examples of parameters that must remain identical
include:
· Guest OS distribution, version, and patch levels
· Application software version and patch levels
· Guest OS and application software tunings
· VM resource parameters from the guest OS perspective (i.e. # CPUs, memory, networking/storage configuration)
The intent is that workload-specific VMs
across tiles are "clones", with only the modifications necessary to
identify them as different entities (e.g. host name and network address).
As
Internet email is defined by its protocol definitions, the mail server requires
adherence to the relevant protocol standards:
RFC 2060 : Internet Message Access
Protocol - Version 4rev1 (IMAP4)
The IMAP4 protocol
implies the following:
RFC 791 : Internet
Protocol (IPv4)
RFC 792 :
Internet Control Message Protocol (ICMP)
RFC 793 :
Transmission Control Protocol (TCP)
RFC 950 :
Internet Standard Subnetting Procedure
RFC 1122 :
Requirements for Internet Hosts - Communication Layers
Internet standards
are evolving standards. Adherence to related RFC's (e.g. RFC 1191 Path MTU Discovery) is
also acceptable provided the implementation retains the characteristic of
interoperability with other implementations.
The J2EE server must
provide a runtime environment that meets the requirements of the Java 2
Platform, Enterprise Edition, (J2EE) Version 1.3 or later specifications during
the benchmark run.
A major new version
(i.e. 1.0, 2.0, etc.) of a J2EE server must have passed the J2EE Compatibility
Test Suite (CTS) by the product's general availability date.
A J2EE Server that
has passed the J2EE Compatibility Test Suite (CTS) satisfies the J2EE
compliance requirements for this benchmark regardless of the underlying
hardware and other software used to run the benchmark on a specific
configuration, provided the runtime configuration options result in behavior
consistent with the J2EE specification. For example, using an option that
violates J2EE argument passing semantics by enabling a pass-by-reference
optimization, would not meet the J2EE compliance requirement.
Comment: The intent of this requirement is to ensure that the J2EE server is a complete implementation satisfying all requirements of the J2EE specification and to prevent any advantage gained by a server that implements only an incomplete or incompatible subset of the J2EE specification.
SPECvirt_sc2010 requires that each Application Server VM execute its own
locally installed emulator application (emulator.EAR). This differs from the
original SPECjAppServer2004 workload definition.
All tables must have
the properly scaled number of rows as defined by the database population
requirements in the "Application and Database Server
Benchmark" section of the SPECvirt_sc2010 Design
Overview.
Additional database
objects or DDL modifications made to the reference schema scripts in the schema/sql
directory in the SPECjAppServer2004 Kit must be disclosed along with the specific
reason for the modifications. The base tables and indexes in the reference
scripts cannot be replaced or deleted. Views are not allowed. The data types of
fields can be modified provided they are semantically equivalent to the
standard types specified in the scripts.
Comment: Replacing CHAR with VARCHAR would be considered semantically equivalent. Changing the size of a field (for example: increasing the size of a char field from 8 to 10) would not be considered semantically equivalent. Replacing CHAR with INTEGER (for example: zip code) would not be considered semantically equivalent.
Modifications that a
customer may make for compatibility with a particular database server are
allowed. Changes may also be necessary to allow the benchmark to run without
the database becoming a bottleneck, subject to approval by SPEC. Examples of
such changes include:
Comment: Schema scripts provided
by the vendors in the schema/<vendor> directories are for
convenience only. They do not constitute the reference or baseline scripts in
the schema/sql directory. Deviations from the scripts in the schema/sql
directory must still be disclosed in the submission file even though the
vendor-provided scripts were used directly.
In any committed
state the primary key values must be unique within each table. For example, in
the case of a horizontally partitioned table, primary key values of rows across
all partitions must be unique.
The databases must be populated prior to the start of each benchmark run,
either by using the supplied load programs or by restoring from a database
copy that was itself populated using the supplied load programs.
Modifications to the
load programs are permitted for porting purposes. All such modifications made
must be disclosed in the Submission File.
As the WWW is defined
by its interoperative protocol definitions, the Web server requires adherence
to the relevant protocol standards. It is expected that the Web server is HTTP
1.1 compliant. The benchmark environment shall be governed by the following
standards:
For further explanation of these protocols, the following might be helpful:
The current text of
all IETF RFC's may be obtained from: http://ietf.org/rfc.html
All marketed standards that a software product states as being adhered to must
have passed the relevant test suites used to ensure compliance with the
standards.
For a run to be
valid, the following attributes must hold true:
The Infrastructure VM
has the same requirements as the Web Server VM in its role as a web back-end
(BeSim) for the web workload.
It also hosts the download files for the webserver using a file system protocol
for remote file sharing (for example NFS or CIFS).
For a run to be
valid, each idle server VM must have at least 512 MB of memory allocated.
The operating system of the idle server VM must be of the same type and version
as at least one other VM in the tile. The idle server VM does not need to
contain the other VM's workload-specific application software stack. The intent
of these requirements is to prohibit vendors from artificially limiting and
tuning the idle server in order to take advantage of its limited functionality.
The SPECvirt_sc2010
individual workload metrics represent the aggregate throughput that a server
can support while meeting quality of service (QoS) and validation
requirements. In the benchmark run, one or more tiles are run
simultaneously. The load generated is based on page requests, database
transactions, and IMAP operations as defined in the SPECvirt_sc2010
Design Overview.
The QoS requirements
are relative to the individual workloads. These include:
The load generated is based on page requests, transitions between pages, and
the static images accessed within each page.
The QoS requirements are defined in terms of two parameters, Time_Good and
Time_Tolerable. QoS requirements are page-based; Time_Good and Time_Tolerable
are defined as 3 seconds and 5 seconds, respectively. For each page, 95% of
the page requests (including all the embedded files within that page) are
expected to be returned within Time_Good and 99% of the requests within
Time_Tolerable. Very large static files (i.e. Support downloads) use
specific byte rates as their QoS requirements.
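The percentage rule above can be illustrated with a minimal sketch of the
per-page check; the names and data layout are hypothetical, and the SPEC
harness performs the authoritative validation (per-embedded-file accounting
and the byte-rate rule for large static files are omitted here).

    // Illustrative per-page QoS check: 95% of page requests within
    // Time_Good (3s) and 99% within Time_Tolerable (5s).
    public class WebPageQos {
        static final double TIME_GOOD_SEC = 3.0;
        static final double TIME_TOLERABLE_SEC = 5.0;

        static boolean pageQosMet(double[] pageResponseTimesSec) {
            int good = 0, tolerable = 0;
            for (double t : pageResponseTimesSec) {
                if (t <= TIME_GOOD_SEC) good++;
                if (t <= TIME_TOLERABLE_SEC) tolerable++;
            }
            int n = pageResponseTimesSec.length;
            return good >= 0.95 * n && tolerable >= 0.99 * n;
        }
    }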
The validation requirement is such that less than 1% of requests for any
given page and less than 0.5% of all the page requests in a given test
iteration fail validation.
It is required in this benchmark that all user sessions be run at the
"high-speed Internet" speed of 100 kilobytes/sec.
In addition, the URL retrievals (or operations) performed must also meet
the following quality criteria:
For each IMAP operation type, 95% of all transactions
must complete within five seconds. Additionally for each IMAP operation type,
there may be no more than 1.5% failures (where a failure is defined as
transactions that return unexpected content, or time-out). The total failure
count across all operation types must be no more than 1% of the count of all
operations.
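As an illustration of these limits, a hedged sketch of the per-operation-type
and aggregate failure checks might look as follows; the data structure is
hypothetical, and the harness performs the real accounting.

    // Illustrative check of the IMAP QoS limits: per operation type, 95%
    // within 5 seconds and at most 1.5% failures; at most 1% failures overall.
    import java.util.Collection;

    public class ImapQos {
        record OpStats(long total, long withinFiveSec, long failures) {}

        static boolean qosMet(Collection<OpStats> perOpType) {
            long allOps = 0, allFailures = 0;
            for (OpStats s : perOpType) {
                if (s.withinFiveSec() < 0.95 * s.total()) return false; // 95% in 5s
                if (s.failures() > 0.015 * s.total()) return false;     // <= 1.5% per type
                allOps += s.total();
                allFailures += s.failures();
            }
            return allFailures <= 0.01 * allOps;                        // <= 1% overall
        }
    }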
The client polls the Idle Server periodically to ensure
that the VM is running and responsive. To meet the Idle Server QoS requirement,
99.5% of all polling requests must be responded to within one second.
Business Transactions are selected by the Driver based on the mix shown in the
following table. The actual mix achieved in the benchmark must be within 5% of
the targeted mix for each type of Business Transaction. For example, the browse
transactions can vary between 47.5% and 52.5% of the total mix. The Driver
checks and reports on whether the mix requirement was met.
Business Transaction Mix Requirements

  Business Transaction Type   Percent Mix
  Purchase                    25%
  Manage                      25%
  Browse                      50%
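As an illustration of the +/-5% relative tolerance described above, a minimal
sketch of the mix check could look like the following; the map layout and
method names are assumptions, not harness code.

    // Sketch of the +/-5% (relative) Business Transaction mix check.
    import java.util.Map;

    public class MixCheck {
        static final Map<String, Double> TARGET =
                Map.of("Purchase", 0.25, "Manage", 0.25, "Browse", 0.50);

        static boolean mixOk(Map<String, Long> counts) {
            long total = counts.values().stream().mapToLong(Long::longValue).sum();
            for (var e : TARGET.entrySet()) {
                double actual = counts.getOrDefault(e.getKey(), 0L) / (double) total;
                double target = e.getValue();
                // e.g. Browse must fall between 47.5% and 52.5% of the total mix
                if (actual < target * 0.95 || actual > target * 1.05) {
                    return false;
                }
            }
            return true;
        }
    }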
The Driver measures and records the Response Time of the different types of
Business Transactions. Only successfully completed Business Transactions in the
Measurement Interval are included. At least 90% of the Business Transactions of
each type must have a Response Time of less than the constraint specified in
the table below. The average Response Time of each Business Transaction's type
must not be greater than 0.1 seconds more than the 90% Response Time. This
requirement ensures that all users will see reasonable response times. For
example, if the 90% Response Time of purchase transactions is 1 second, then
the average cannot be greater than 1.1 seconds. The Driver checks and reports
on whether the response time requirements were met.
Response Time Requirements

  Business Transaction Type   90% RT (in seconds)
  Purchase                    2
  Manage                      2
  Browse                      2
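A hedged sketch of the per-type check follows: the 90th-percentile Response
Time must be under the table limit (2 seconds here) and the average must not
exceed that percentile by more than 0.1 seconds. The percentile definition and
names are illustrative; the Driver's own implementation is authoritative. The
analogous check for Manufacturing work orders (described later) uses a
5-second limit.

    // Illustrative per-type response-time check.
    import java.util.Arrays;

    public class ResponseTimeCheck {
        static boolean meetsRequirement(double[] responseTimesSec, double limit90Sec) {
            double[] sorted = responseTimesSec.clone();
            Arrays.sort(sorted);
            // 90th-percentile response time (simple nearest-rank definition)
            double rt90 = sorted[(int) Math.ceil(0.9 * sorted.length) - 1];
            double avg = Arrays.stream(sorted).average().orElse(0.0);
            // Average must not be more than 0.1s above the 90% response time
            return rt90 < limit90Sec && avg <= rt90 + 0.1;
        }
    }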
For each Business Transaction, the Driver selects cycle times from a
negative exponential distribution, computed from the following equation:
Tc = -ln(x) * 10
where:
Tc = Cycle Time
ln = natural log (base e)
x = random number with at least 31 bits of precision,
from a uniform distribution such that (0 < x <= 1)
The distribution is truncated at 5 times the mean. For each Business
Transaction, the Driver measures the Response Time Tr and computes the
Delay Time Td as Td = Tc - Tr. If Td > 0, the Driver
will sleep for this time before beginning the next Business Transaction. If the
chosen cycle time Tc is smaller than Tr, then the actual cycle
time (Ta) is larger than the chosen one.
The average actual cycle time is allowed to deviate from the targeted one
by 5%. The Driver checks and reports on whether the cycle time requirements
were met.
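The cycle-time selection and delay computation described above can be sketched
as follows; this is illustrative code under the stated distribution (mean of
10 seconds, truncated at 5 times the mean), not the Driver's actual
implementation.

    // Illustrative sketch of cycle-time selection and delay computation.
    import java.util.concurrent.ThreadLocalRandom;

    public class CycleTime {
        static final double MEAN_SEC = 10.0;         // mean of the distribution
        static final double MAX_SEC  = 5 * MEAN_SEC; // truncated at 5x the mean

        // Tc = -ln(x) * 10, with x uniform in (0, 1]
        static double chooseCycleTime() {
            double x = 1.0 - ThreadLocalRandom.current().nextDouble(); // (0, 1]
            return Math.min(-Math.log(x) * MEAN_SEC, MAX_SEC);
        }

        // Sleep for Td = Tc - Tr when positive, before the next Business Transaction
        static void delayAfter(double responseTimeSec) throws InterruptedException {
            double td = chooseCycleTime() - responseTimeSec;
            if (td > 0) {
                Thread.sleep((long) (td * 1000));
            }
        }
    }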
The table below shows the range of values allowed for various quantities in
the application. The Driver will check and report on whether these requirements
were met.
Miscellaneous Dealer Requirements

  Quantity                                       Targeted Value   Min. Allowed   Max. Allowed
  Average Vehicles per Order                     26.6             25.27          27.93
  Vehicle Purchasing Rate (/sec)                 6.65 * Ir        6.32 * Ir      6.98 * Ir
  Percent Purchases that are Large Orders        10               9.5            10.5
  Large Order Vehicle Purchasing Rate (/sec)     3.5 * Ir         3.33 * Ir      3.68 * Ir
  Average # of Vehicles per Large Order          140              133            147
  Regular Order Vehicle Purchasing Rate (/sec)   3.15 * Ir        2.99 * Ir      3.31 * Ir
  Average # of Vehicles per Regular Order        14               13.3           14.7
The metric for the Dealer Domain is Dealer Transactions/sec,
composed of the total count of all Business Transactions successfully completed
during the measurement interval divided by the length of the measurement
interval in seconds.
The M_Driver measures and records the time taken for a work order to
complete. Only successfully completed work orders in the Measurement Interval
are included. At least 90% of the work orders must have a Response Time of less
than 5 seconds. The average Response Time must not be greater than 0.1 seconds
more than the 90% Response Time.
The table below shows the range of values allowed for various quantities in
the Manufacturing Application. The M_Driver will check and report on whether
the run meets these requirements.
Miscellaneous Manufacturing Requirements

  Quantity                          Targeted Value   Min. Allowed   Max. Allowed
  LargeOrderline Widget Rate/sec    3.5 * Ir         3.15 * Ir      3.85 * Ir
  Planned Line Widget Rate/sec      3.15 * Ir        2.835 * Ir     3.465 * Ir
Workload-specific
configuration files are supplied with the harness. All configurable parameters are
listed in these files. For a run to be valid, all the parameters in the
configuration files must be left at default values, except for the ones that
are marked and listed clearly as "Configurable Workload Properties".
To configure the
initial benchmark environment from scratch, the benchmarker:
To run the benchmark,
the benchmarker must:
NOTE:
This section is only applicable to results that have power measurement, which
is optional.
The measurement of
power should meet all the environmental aspects listed in Environmental Conditions. The
SPECvirt_sc2010 benchmark tools provide the ability to automatically gather
measurement data from supported power analyzers and temperature sensors and integrate
that data into the benchmark result. SPEC requires that the analyzers and
sensors used in a submission be supported by the measurement framework. The
provided tools (or a newer version provided by SPEC) must be used to run and
produce measured SPECvirt_sc2010 results.
The primary metrics,
SPECvirt_sc2010_PPW (performance with SUT power) and
SPECvirt_sc2010_ServerPPW (performance with Server only power) are
performance per watt metrics obtained by dividing the peak performance by the
peak power of the SUT or Server, respectively, during the run measurement
phase. For example, if the SPECvirt_sc2010 result consisted of a maximum
of 6 tiles, the power would be calculated as the average power while serving
transactions within all 6 workload tiles.
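In outline, the performance/power metrics relate measured values as sketched
below; the method names are illustrative, and the SPEC PTDaemon and reporting
tools perform the actual power averaging.

    // Sketch: peak performance divided by average measured power over the
    // measurement phase (SUT power for _PPW, server-only power for _ServerPPW).
    public class PerfPerWatt {
        static double averageWatts(double[] powerSamplesWatts) {
            double sum = 0;
            for (double w : powerSamplesWatts) sum += w;
            return sum / powerSamplesWatts.length;
        }

        static double ppw(double specvirtScore, double[] powerSamplesWatts) {
            return specvirtScore / averageWatts(powerSamplesWatts);
        }
    }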
During the measurement phase, the SPECvirt_sc2010 prime controller polls
each prime client process associated with each workload in each tile once every
10 seconds. The prime controller collects and records the workload polling data
which includes performance and QoS measurement data from the clients. It
is expected that in a compliant run all polling requests will be responded to
within 10 seconds (BEAT_INTERVAL). Failure to respond to polling requests may
indicate problems with the clients' ability to issue and respond to workload
requests in a timely manner or accurately record performance.
The prime controller process verifies that each polling request is
responded to by the prime client processes; it will
invalidate the test if more than one 10-second polling interval is missed
during the test's measurement phase. In that case the test will abort, and the
run will be marked as non-compliant.
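A simplified sketch of this invalidation rule is shown below; it is not the
prime controller's code, only an illustration of the "at most one missed
10-second interval" condition.

    // Illustrative check: the run is non-compliant if more than one
    // 10-second polling interval (BEAT_INTERVAL) is missed during the
    // measurement phase.
    public class PollingCheck {
        static final int BEAT_INTERVAL_SEC = 10;

        static boolean runCompliant(boolean[] intervalResponded) {
            int missed = 0;
            for (boolean ok : intervalResponded) {
                if (!ok) missed++;
            }
            return missed <= 1;
        }
    }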
The reported performance
metric, SPECvirt_sc2010, appears in both Performance/Power and Performance
categories, and will be derived from a set of compliant results from the
workloads in the suite:
The SPECvirt_sc2010
metric is a "supermetric" that is the arithmetic mean of the
normalized submetrics for each workload. The metric will be output in the
format "SPECvirt_sc2010 <score> @ <# vms> VMs".
The optional
reported performance/watt metrics, SPECvirt_sc2010_PPW and SPECvirt_sc2010_ServerPPW,
represent the peak performance divided by the average power of the SUT and
server, respectively, during the peak run phase. These metrics will only
appear in results in the Performance/Power categories, and the result must not
be compared with results that do not have power measured. These metrics
will be output in the format "SPECvirt_sc2010_PPW <score> @ <#
vms> VMs" and "SPECvirt_sc2010_ServerPPW <score> @ <#
vms> VMs".
Please consult the SPEC Fair Use Rule on the treatment of
estimates at http://www.spec.org/fairuse.html#SPECvirt_sc2010.
The report of results for the
SPECvirt_sc2010 benchmark is generated in ASCII and HTML format by the provided
SPEC tools. These tools may not be changed without prior SPEC approval. The
tools perform error checking and will flag some error conditions as resulting
in an "invalid run". However, these automatic checks are only
there for debugging convenience, and do not relieve the benchmarker of the
responsibility to check the results and follow the run and reporting rules.
SPEC reviews and
accepts for publication on SPEC's website only a complete and compliant set of
results run and reported according to these rules. Full disclosure
reports of all test and configuration details as described in these run and
report rules must be made available. Licensees are encouraged to submit
results to SPEC for publication.
All system
configuration information required to duplicate published performance results
must be reported. Tunings not in the default configuration for software and
hardware settings must be reported. All tiles must be tuned identically.
The following SUT
hardware components must be reported:
The SUT must utilize stable storage. Additionally, the SUT must use stable
and durable storage for all virtual machines (including all corresponding data
drives), such that a single drive failure does not incur data loss on the VMs.
For example: RAID-1, 5, 10, 50, 0+1 are acceptable RAID levels, but RAID-0
(striping without mirroring or parity) is not considered durable.
The hypervisor must
be able to recover the virtual machines, and the virtual machines must also be
able to recover their data sets, without loss from multiple power failures
(including cascading power failures), hypervisor and guest operating system
failures, and hardware failures of components (e.g. CPU) other than the storage
medium. At any point where the data can be cached, after any virtual server has
accepted the message and acknowledged a transaction, there must be a mechanism
to ensure any cached data survives the server failure.
If a UPS is required
by the SUT to meet the stable storage requirement, the benchmarker is not
required to perform the test with a UPS in place. The benchmarker must
state in the disclosure that a UPS is required. Supplying a model number for
an appropriate UPS is encouraged but not required.
If a battery-backed
component is used to meet the stable storage requirement, that battery
must have sufficient power to maintain the data for at least 48 hours to allow
any cached data to be committed to media and the system to be gracefully shut
down. The system or component must also be able to detect a low battery
condition and prevent the use of the caching feature of the component or
provide for a graceful system shutdown.
Hypervisors are required to safely store all completed transactions to their
virtualized workloads (including in the event of a failure of the hypervisor's own storage):
The following SUT
software components must be reported:
A brief description
of the network configuration used to achieve the benchmark results is required.
The minimum information to be supplied is:
The following load
generator hardware components must be reported:
The dates of general
customer availability must be listed by month and year for the major components:
hardware and software (hypervisor, operating systems, and applications). All
the system, hardware, and software features are required to be generally
available on or before the date of publication, or within 3 months of the date of
publication (except where precluded by these rules; see section 3.2.7). When multiple
components have different availability dates, the latest availability date
must be listed.
Products are
considered generally available if they are orderable by ordinary customers and
ship within a reasonable time frame. This time frame is a function of the
product size and classification, and common practice. The availability of
support and documentation for the products must coincide with the release of
the products.
Hardware products
that are still supported by their original or primary vendor may be used if
their original general availability date was within the last five years. The
five-year limit is waived for hardware used in clients.
For ease and cost of
benchmarking, storage and networking hardware external to the server such as
disks, storage enclosures, storage controllers and network switches, which were
generally available within the last five years but are no longer available from
the original vendor, may be used. If such end-of-life (and possibly
unsupported) hardware is used, then the test sponsor represents that the
performance measured is no better than 105% of the performance on hardware
available as of the date of publication. The product(s) and their end-of-life
date(s) must be noted in the disclosure. If it is later determined that the
performance using available hardware is lower than 95% of that reported, the
result shall be marked non-compliant (NC).
Software products
that are still supported by their original or primary vendor may be used if
their original general availability date was within the last three years.
In the disclosure,
the benchmarker must identify any component that is no longer orderable by
ordinary customers.
If pre-release
hardware or software is tested, then the test sponsor represents that the
performance measured is generally representative of the performance to be
expected on the same configuration of the release system. If it is later
determined that the performance using available hardware or software is
lower than 95% of that reported, the result shall be marked non-compliant (NC).
SPECvirt_sc2010 does
permit Open Source Applications outside of a commercial distribution or support
contract with some limitations. The following are the rules that govern the
admissibility of the Open Source Application in the context of a benchmark run
or implementation. Open Source Applications do not include shareware and
freeware, where the source is not part of the distribution.
  Submission date   Beginning of time window
  Aug 20, 2019      Feb 20, 2019
  Jul 20, 2019      Jan 20, 2019
  Jun 20, 2019      Dec 20, 2018
Note: The Webserver workload requires the use of Smarty 2.6.26 which is
included in the release kit and is not subject to the above rules.
The reporting page
must list the date the test was performed, month and year, the organization
which performed the test and is reporting the results, and the SPEC license
number of that organization.
This section is used
to document:
Once you have a
compliant run and wish to submit it to SPEC for review, you will need to
provide the following:
Once you have the
submission ready, please email SPECvirt_sc2010 submissions to subvirt_sc2010@spec.org.
In order to publicly
disclose SPECvirt_sc2010 results, the submitter must adhere to these reporting
rules in addition to having followed the run rules described in this document.
The goal of the reporting rules is to ensure the system under test is
sufficiently documented such that someone could reproduce the test and its
results.
Compliant runs need
to be submitted to SPEC for review and must be accepted prior to public
disclosure. If public statements using SPECvirt_sc2010 are made
they must follow the SPEC Fair Use Rule (http://www.spec.org/fairuse.html).
Many other SPEC
benchmarks allow duplicate submissions for a single system sold under various
names. Each SPECvirt_sc2010 result from a power-enabled run submitted to SPEC
or made public must be for an actual run of the benchmark on the SUT named in
the result. Electrically equivalent submissions for power-enabled runs are not
allowed unless the systems are also mechanically equivalent (e.g. rebadged).
The submitter is
required to run a script that will collect available configuration details of the
SUT and all the virtual machines used for the benchmark, including:
The primary reason
for this step is to ensure that there are not subtle differences that the
vendor may miss.
The submitter is required to run a script which provides the details
of each VM and its operating system and application tunings that are not captured in
the SUT configuration collection script, including:
During a review of the result, the submitter may be required to provide,
upon request, additional details of the VM, operating system and
application tunings and log files that may not be captured in the above script.
These may include, but are not limited to:
The primary reason for this step is to ensure that the vendor has disclosed
all non-default tunings.
The submitter is required to run a script which collects the details of
each type or uniquely configured physical and virtual client used, such that
the testbed's client configuration could be reproduced. Clients that are
clones of a specific and documented type may be identified and data collection
is encouraged but not required. The client collection script should
collect files and output of commands to document the client configuration
and tuning details including:
During a review of the result, the submitter may be required to provide,
upon request, additional details of the client configuration that
may not be captured in the above script to help document details relevant to questions
that may arise during the review.
The submitter must submit the Configuration Collection Archive containing
the data (files and command output) described in sections 4.0, 4.1, and 4.2 above,
using the high-level directory structure described below as the foundation:
SPEC provides client
driver software, which includes tools for running the benchmark and reporting
its results. The client drivers are written in Java; precompiled class
files are included with the kit, so no build step is necessary. Recompilation
of the client driver software is not allowed, unless prior approval from SPEC
is given.
This software
implements various checks for conformance with these run and reporting rules;
therefore, the SPEC software must be used as provided. Source code
modifications are not allowed, unless prior approval from SPEC is given. Any
such modification must be reviewed and deemed "performance-neutral"
by the OSSC.
The kit also includes
source code for the file set generators, script code for the web server, and
other necessary components.
SPECvirt_sc2010 uses
modified versions of SPECweb2005, SPECjAppServer2004, and SPECmail2008 for its
virtualized workloads. For reference, the run rules for those benchmarks are
listed below:
NOTE: Not all of these run rules
are applicable to SPECvirt, but when a compliance issue is raised, SPEC
reserves the right to refer back to these individual benchmarks' run rules as
needed for clarification.
Copyright
© 2011 Standard Performance Evaluation Corporation. All rights reserved.
Java® is a
registered trademark of Oracle Corporation.