SPECvirt_sc2013 is the second generation
SPEC benchmark for evaluating the performance of datacenter servers
used for virtualized server consolidation. The benchmark also
provides options for measuring and reporting power and
performance/power metrics. This document specifies how SPECvirt_sc2013
is to be run for measuring and publicly reporting performance and
power results. These rules abide by the norms laid down by the SPEC
Virtualization Subcommittee and approved by the SPEC Open Systems
Steering Committee. This ensures that results generated with this
suite are meaningful, comparable to other generated results, and
repeatable, with sufficient documentation covering the factors
pertinent to duplicating the results.
Per the SPEC license agreement, all results
publicly disclosed must adhere to these Run and Reporting Rules.
The general philosophy behind the rules of
SPECvirt_sc2013 is to ensure that an independent party can reproduce
the reported results.
The following attributes are expected:
Furthermore, SPEC expects that any public
use of results from this benchmark suite shall be for the SUT (System
Under Test) and configurations that are appropriate for public
consumption and comparison. Thus, it is also expected that:
SPEC requires that any public use of
results from this benchmark follow the SPEC
Fair Use Rule and the clauses specific to this benchmark (see the Fair Use section 1.2 below). Where it appears that
these guidelines have not been adhered to, SPEC may investigate and
request that the published material be corrected.
Consistency and fairness are guiding principles for
SPEC. To help ensure that these principles are met, any organization or
individual who makes public use of SPEC benchmark results must do so in
accordance with the SPEC Fair Use Rule, as posted at http://www.spec.org/fairuse.html.
Fair Use clauses specific to SPECvirt_sc2013 are covered in http://www.spec.org/fairuse.html#SPECvirt_sc2013.
Please consult the SPEC Fair Use
Rule on Research and Academic Usage at http://www.spec.org/fairuse.html#Academic.
SPEC reserves the right to adapt the
benchmark codes, workloads, and rules of SPECvirt_sc2013 as deemed
necessary to preserve the goal of fair benchmarking. SPEC notifies
members and licensees whenever it makes changes to this document and
renames the metrics if the results are no longer comparable.
Relevant standards are cited in these run
rules as URL references, and are current as of the date of
publication. Changes or updates to these referenced documents or URLs
may necessitate repairs to the links and/or amendment of the run
rules. The most current run rules are available at the
SPECvirt_sc2013 web site. SPEC notifies members and licensees
whenever it makes changes to the suite.
In a virtualized environment, the
definitions of commonly-used terms can have multiple or different
meanings. To avoid ambiguity, this section attempts to define terms
that are used throughout this document:
For further definition or explanation of
workload-specific terms, refer to the respective documents of the
original benchmarks.
These requirements apply to all hardware
and software components used in producing the benchmark result,
including the SUT, network, and clients.
For a run to be valid, the following
attributes must hold true:
This section outlines some of the
environmental and other electrical requirements related to power
measurement while running the benchmark. Note that power
measurement is optional, so this section only applies to results with
power in the Performance/Power categories.
To
produce a compliant result for either Performance/Power of the Total
SUT (SPECvirt_sc2013_PPW) or Performance/Power of the Server only,
(SPECvirt_sc2013_ServerPPW) the following requirements must be met in
addition to the environmental and electrical requirements described in
this section.
Line Voltage Source
The preferred Line Voltage source used for
measurements is the main AC power as provided by local utility
companies. Power generated from other sources often contains unwanted
harmonics that many power analyzers cannot measure correctly, and
would therefore produce inaccurate results.
The usage of an uninterruptible power
source (UPS) as the line voltage source is allowed, but the voltage
output must be a pure sine-wave. For placement of the UPS, see Power
Analyzer Setup below. This usage must be specified in the Notes
section of the report.
Systems that are designed to be able to run normal operations without
an external source of power cannot be used to produce valid results.
Some examples of disallowed systems are notebook computers, hand-held
computers/communication devices and servers that are designed to
frequently operate on integrated batteries without external power.
Systems
with batteries intended to preserve operations during a temporary
lapse of external power, or to maintain data integrity during an
orderly shutdown when power is lost, can be used to produce valid
benchmark results. For SUT components that have an integrated battery,
the battery must be fully charged at the end of the measurement
interval, or proof must be provided that it is charged at least to the
level of charge at the beginning of the interval.
Note that integrated batteries that are
intended to maintain such things as durable cache in a storage
controller can be assumed to remain fully charged. The above paragraph
is intended to address “system” batteries that can provide primary
power for the SUT.
DC line voltage sources are currently not
supported.
For situations in which the appropriate
voltages are not provided by local utility companies (e.g. measuring a
server in the United States which is configured for European markets,
or measuring a server in a location where the local utility line
voltage does not meet the required characteristics), an AC power
source may be used, and the power source must be specified in the
notes section of the disclosure report. In such a situation, the
following requirements must be met, and the relevant measurements or
power source specifications disclosed in the general notes section of
the disclosure report:
SPEC requires that power measurements be
taken in an environment representative of the majority of usage
environments. The intent is to discourage extreme environments that
may artificially impact power consumption or performance of the
server.
The following environmental conditions
must be met:
Power Analyzer Setup
The power analyzer must be located
between the AC Line Voltage Source and the SUT. No other active
components are allowed between the AC Line Voltage Source and the
SUT. If the SUT consists of several discrete parts (server and
storage), separate power analyzers may be required.
Power analyzer configuration settings
that are set by the SPEC PTDaemon must not be manually overridden.
Power Analyzer Specifications
To ensure comparability and repeatability
of power measurements, SPEC requires the following attributes for
the power measurement device used during the benchmark. Please note
that a power analyzer may meet these requirements when used in some
power ranges but not in others, due to the dynamic nature of power
analyzer Accuracy and Crest Factor.
For example:
An analyzer with a vendor-specified uncertainty of +/- 0.5% of reading
+/- 4 digits, used in a test with a maximum wattage value of 200W,
would have an "overall" uncertainty of ((0.5% * 200W) + 0.4W) = 1.4W,
or 0.7% at 200W.
An analyzer with a wattage range of 20-400W and a vendor-specified
uncertainty of +/- 0.25% of range +/- 4 digits, used in a test
with a maximum wattage value of 200W, would have an "overall"
uncertainty of ((0.25% * 400W) + 0.4W) = 1.4W, or 0.7% at 200W.
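As an illustration only, the following sketch reproduces the arithmetic of the two examples above. The class, method, and the 0.1 W weight per digit are assumptions for this sketch; they are not part of the benchmark tools or the SPEC PTDaemon.

// Hypothetical helper reproducing the "overall" uncertainty arithmetic above.
public final class AnalyzerUncertainty {

    // pct: vendor-specified percentage (of reading or of range, per the analyzer spec)
    // basisWatts: the reading or the range that the percentage applies to
    // digits / wattsPerDigit: the fixed "digits" term (assumed 0.1 W per digit here)
    static double overallUncertainty(double pct, double basisWatts,
                                     int digits, double wattsPerDigit,
                                     double measuredWatts) {
        double absoluteErrorWatts = (pct / 100.0) * basisWatts + digits * wattsPerDigit;
        return absoluteErrorWatts / measuredWatts;  // fraction of the measured value
    }

    public static void main(String[] args) {
        // Example 1: +/- 0.5% of reading +/- 4 digits, 200 W reading -> 0.7%
        System.out.printf("%.1f%%%n", 100 * overallUncertainty(0.5, 200, 4, 0.1, 200));
        // Example 2: +/- 0.25% of a 400 W range +/- 4 digits, 200 W reading -> 0.7%
        System.out.printf("%.1f%%%n", 100 * overallUncertainty(0.25, 400, 4, 0.1, 200));
    }
}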
Temperature Sensor
Specifications
Temperature must be measured no more than
50mm in front of (upwind of) the main airflow inlet of the server.
To ensure comparability and repeatability of temperature measurements,
SPEC requires the following attributes for the temperature measurement
device used during the benchmark:
Supported and Compliant Devices
See the SPECpower
Device List for a list of currently supported (by the benchmark
software) and compliant (in specifications) power analyzers and
temperature sensors.
To scale the benchmark workload,
additional tiles are added. The last tile may be configured as a
"fractional tile", which means a "load scale factor" of less than 1.0
is applied to all of the VMs within that tile. If used, the load scale
factor must be between 0.1 and 0.9 in 0.1 increments (e.g. 0.25 would
not be allowed). Each VM is required to be a distinct entity; for
example, you cannot run the application server and the database on the
same VM. The following block diagram shows the tile architecture and
the virtual machine/hypervisor/driver relationships:
Note
that there are more virtual machines than client drivers; this is
because the Infrastructure Server and Database Server VMs do not
interact directly with the client. Specifically, the Web Server
VM must access parts of its fileset and the backend simulator (BeSim)
via inter-VM communication to the Infrastructure Server.
Similarly, the Application Server VM accesses the Database Server VM
via inter-VM communication.
The operating systems may vary
between virtual machines within a tile. All specific workload
VMs (guest OS type and application software) across all tiles must be
identical, including fractional tiles. Examples of parameters
that must remain identical include:
The intent is that
workload-specific VMs across tiles are "clones", with only the
modifications necessary to identify them as different entities (e.g.
host name and network address).
As Internet email is
defined by its protocol definitions, the mail server requires
adherence to the relevant protocol standards:
RFC
2060 : Internet Message Access Protocol - Version 4rev1 (IMAP4)
The IMAP4 protocol implies the following:
RFC
791 : Internet Protocol (IPv4)
RFC
792 : Internet Control Message Protocol (ICMP)
RFC
793 : Transmission Control Protocol (TCP)
RFC
950 : Internet Standard Subnetting Procedure
RFC 1122 :
Requirements for Internet Hosts - Communication Layers
Internet standards are evolving standards.
Adherence to related RFCs (e.g. RFC
1191 Path MTU Discovery) is also acceptable, provided the
implementation retains the characteristic of interoperability with
other implementations.
The J2EE server must provide a runtime
environment that meets the requirements of the Java 2 Platform,
Enterprise Edition, (J2EE) Version 1.3 or later specifications during
the benchmark run.
A major new version (e.g. 1.0, 2.0, etc.)
of a J2EE server must have passed the J2EE Compatibility Test Suite
(CTS) by the product's general availability date.
A J2EE Server that has passed the J2EE
Compatibility Test Suite (CTS) satisfies the J2EE compliance
requirements for this benchmark regardless of the underlying hardware
and other software used to run the benchmark on a specific
configuration, provided the runtime configuration options result in
behavior consistent with the J2EE specification. For example, using an
option that violates J2EE argument-passing semantics by enabling a
pass-by-reference optimization would not meet the J2EE compliance
requirement.
Comment: The intent of this requirement is to ensure that the J2EE server is a complete implementation satisfying all requirements of the J2EE specification and to prevent any advantage gained by a server that implements only an incomplete or incompatible subset of the J2EE specification.
SPECvirt_sc2013
requires that each Application Server VM execute its own locally
installed emulator application (emulator.EAR). This differs from the
original SPECjAppServer2004 workload definition.
All tables must have the properly scaled
number of rows as defined by the database population requirements in
the "Application and Database Server Benchmark" section of the SPECvirt_sc2013
Design Overview.
Additional database objects or DDL
modifications made to the reference schema scripts in the schema/sql
directory in the SPECjAppServer2004 Kit must be disclosed along with
the specific reason for the modifications. The base tables and indexes
in the reference scripts cannot be replaced or deleted. Views are not
allowed. The data types of fields can be modified provided they are
semantically equivalent to the standard types specified in the
scripts.
Comment: Replacing CHAR with VARCHAR would be considered semantically equivalent. Changing the size of a field (for example: increasing the size of a char field from 8 to 10) would not be considered semantically equivalent. Replacing CHAR with INTEGER (for example: zip code) would not be considered semantically equivalent.
Modifications that a customer may make for
compatibility with a particular database server are allowed. Changes
may also be necessary to allow the benchmark to run without the
database becoming a bottleneck, subject to approval by SPEC. Examples
of such changes include:
Comment:
Schema scripts provided by the vendors in the schema/<vendor>
directories are for convenience only. They do not constitute the
reference or baseline scripts in the schema/sql directory. Deviations
from the scripts in the schema/sql directory must still be disclosed
in the submission file even though the vendor-provided scripts were
used directly.
In any committed state the primary key
values must be unique within each table. For example, in the case of a
horizontally partitioned table, primary key values of rows across all
partitions must be unique.
The
databases must be populated using the supplied load programs, or
restored from a copy of a correctly populated database that was
created using the supplied load programs, prior to the start of each
benchmark run.
Modifications to the load programs are
permitted for porting purposes. All such modifications made must be
disclosed in the Submission File.
As the WWW is defined by its interoperative
protocol definitions, the Web server requires adherence to the
relevant protocol standards. It is expected that the Web server is
HTTP 1.1 compliant. The benchmark environment shall be governed by the
following standards:
A
compliant result must use the cipher suite listed above, and must
employ a 1024-bit key for RSA public key encryption, a 128-bit key for
RC4 bulk data encryption, and a 128-bit output for the Message
Authentication Code.
For
further explanation of these protocols, the following might be
helpful:
The current text of all IETF RFCs may be
obtained from: http://ietf.org/rfc.html
All marketed standards that a software product states as being adhered
to must have passed the relevant test suites used to ensure compliance
with the standards.
For a run to be valid, the following
attributes must hold true:
The Infrastructure VM has the same
requirements as the Web Server VM in its role as a web back-end
(BeSim) for the web workload. However, the current
implementation allows HTTP 1.0 requests to be sent from the webserver
to the httpd on the infraserver. The PHP code allows for non-persistent
connections to BeSim backend implementations that would otherwise
not handle a persistent connection (for example, FastCGI on pre-fork
Apache httpd). This is controlled by the flag BESIM_PERSISTENT in
SPECweb/Test.conf, which by default is set to 0 (use HTTP 1.0
non-persistent connections to BeSim).
The infraserver VM also hosts the download
files for the webserver using a file system protocol for remote file
sharing (for example NFS or CIFS).
For a run to be valid, each batch server VM must have at least 512 MB of memory allocated. The operating system of the batch server VM must be of the same type and version as at least one other VM in the tile. The batch server VM does not need to contain the other VM's workload-specific application software stack. The intent of these requirements is to prohibit vendors from artificially limiting and tuning the batch server VM in order to take advantage of its limited functionality.
The batch server workload is provided as source code. In order to be compliant, the workload must be built such that it conforms to the "base" optimization requirements in Section 2.0 of the SPEC CPU2006 run and reporting rules. In particular:

The SPECvirt_sc2013 individual workload
metrics represent the aggregate throughput that a server can support
while meeting quality of service (QoS) and validation
requirements. In the benchmark run, one or more tiles are run
simultaneously. The load generated is based on page requests, database
transactions, and IMAP operations as defined in the SPECvirt_sc2013
Design Overview.
The QoS requirements are relative to the
individual workloads. These include:
The load
generated is based on page requests, transition between pages and the
static images accessed within each page.
The QoS
requirements are defined in terms of two parameters, Time_Good and
Time_Tolerable. QoS requirements are page-based; Time_Good and
Time_Tolerable are defined as 3 seconds and 5 seconds,
respectively. For each page, 95% of the page requests (including all
the embedded files within that page) are expected to be returned
within Time_Good and 99% of the requests within Time_Tolerable.
Very large static files (e.g. Support downloads) use specific byte
rates as their QoS requirements.
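A minimal, non-normative sketch of the page-based check just described follows; the class and method names are assumptions, not harness code.

import java.util.Arrays;

// Hypothetical check of the page-based QoS rule: 95% of page requests within
// Time_Good (3 s) and 99% within Time_Tolerable (5 s).
final class WebPageQos {
    static final double TIME_GOOD_SEC = 3.0;
    static final double TIME_TOLERABLE_SEC = 5.0;

    static boolean meetsQos(double[] pageResponseTimesSec) {
        int total = pageResponseTimesSec.length;
        long withinGood = Arrays.stream(pageResponseTimesSec)
                                .filter(t -> t <= TIME_GOOD_SEC).count();
        long withinTolerable = Arrays.stream(pageResponseTimesSec)
                                     .filter(t -> t <= TIME_TOLERABLE_SEC).count();
        return withinGood >= 0.95 * total && withinTolerable >= 0.99 * total;
    }
}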
The validation
requirement is such that less than 1% of requests for any given page
and less than 0.5% of all the page requests in a given test iteration
fail validation.
It is required
in this benchmark that all user sessions be run at the "high-speed
Internet" speed of 100 kilobytes/sec.
In addition,
the URL retrievals (or operations) performed must also meet the
following quality criteria:
For
each IMAP operation type, 95% of all transactions must complete within
five seconds. Additionally, for each IMAP operation type there may be
no more than 1.5% failures (where a failure is defined as a transaction
that returns unexpected content or times out). The total failure count
across all operation types must be no more than 1% of the count of all
operations.
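A minimal sketch of these IMAP QoS checks, assuming per-operation-type counters, is shown below; the record and class names are assumptions, not part of the SPECvirt harness.

import java.util.Map;

// Hypothetical per-operation-type statistics and the IMAP QoS checks above.
final class ImapQos {
    record OpStats(long total, long withinFiveSeconds, long failures) {}

    static boolean meetsQos(Map<String, OpStats> statsByOpType) {
        long allOps = 0, allFailures = 0;
        for (OpStats s : statsByOpType.values()) {
            if (s.withinFiveSeconds() < 0.95 * s.total()) return false; // 95% within 5 s
            if (s.failures() > 0.015 * s.total()) return false;         // <= 1.5% per type
            allOps += s.total();
            allFailures += s.failures();
        }
        return allFailures <= 0.01 * allOps;                            // <= 1% overall
    }
}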
The client polls the Batch Server periodically to ensure that the VM is running and responsive. To meet the Batch Server QoS requirement, 99.5% of all polling requests must be responded to within one second.
The batch server workload consists of one or more runspec invocations that execute one or more copies of the training workload of the 401.bzip2 module in the SPEC CPU2006 benchmark. The total number of copies run across all runspec invocations is defined by BATCH_COPY_COUNT. All invocations of the SPEC CPU2006 runspec command must be performed from within a single batch script; see the SPECvirt User's Guide for details on building the batch script. The batch script must complete all runspec commands within DURATION seconds. If the batch workload does not complete in DURATION seconds, it fails its QoS. The batch workload is executed INTERVALS times during the course of the full SPECvirt_sc2013 benchmark run. The batch workload execution starts are separated by TIMEINTERVAL seconds. All workload runs must complete during the benchmark measurement phase in order to pass QoS.
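The timing relationship between DURATION, INTERVALS, and TIMEINTERVAL can be illustrated with the following sketch; the class and method names are assumptions, and this is not the harness's scheduling code.

// Hypothetical check that every batch script execution met its deadline:
// executions start TIMEINTERVAL seconds apart, and each must complete all of
// its runspec commands within DURATION seconds of its scheduled start.
// Completion offsets are measured from the start of the measurement phase.
final class BatchScheduleCheck {
    static boolean allIntervalsPassedQos(double[] completionOffsetsSec,
                                         int intervals,
                                         double durationSec,
                                         double timeIntervalSec) {
        if (completionOffsetsSec.length != intervals) return false;  // a run never finished
        for (int i = 0; i < intervals; i++) {
            double scheduledStartSec = i * timeIntervalSec;
            if (completionOffsetsSec[i] > scheduledStartSec + durationSec) {
                return false;  // this execution exceeded DURATION, failing QoS
            }
        }
        return true;
    }
}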
Business
Transactions are selected by the Driver based on the mix shown in the
following table. The actual mix achieved in the benchmark must be
within 5% of the targeted mix for each type of Business Transaction.
For example, the browse transactions can vary between 47.5% and 52.5%
of the total mix. The Driver checks and reports on whether the mix
requirement was met.
Business Transaction Mix Requirements

Business Transaction Type | Percent Mix
--------------------------+------------
Purchase                  | 25%
Manage                    | 25%
Browse                    | 50%
The Driver
measures and records the Response Time of the different types of
Business Transactions. Only successfully completed Business
Transactions in the Measurement Interval are included. At least 90% of
the Business Transactions of each type must have a Response Time of
less than the constraint specified in the table below. The average
Response Time of each Business Transaction type must not be greater
than 0.1 seconds more than the 90% Response Time. This requirement
ensures that all users see reasonable response times. For example, if
the 90% Response Time of purchase transactions is 1 second, then the
average cannot be greater than 1.1 seconds. The Driver checks and
reports on whether the response time requirements were met.
Response Time Requirements

Business Transaction Type | 90% RT (in seconds)
--------------------------+--------------------
Purchase                  | 2
Manage                    | 2
Browse                    | 2
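A minimal sketch of the 90th-percentile and average checks described above follows; the percentile indexing is simplified, and the names are assumptions, not Driver code.

import java.util.Arrays;

// Hypothetical response-time check for one Business Transaction type: at
// least 90% of transactions must complete within the constraint, and the
// average must not exceed the 90% Response Time by more than 0.1 seconds.
final class ResponseTimeCheck {
    static boolean meetsRequirement(double[] responseTimesSec, double constraintSec) {
        double[] sorted = responseTimesSec.clone();
        Arrays.sort(sorted);
        int idx90 = Math.max((int) Math.ceil(0.90 * sorted.length) - 1, 0);
        double rt90 = sorted[idx90];  // simplified 90th-percentile estimate
        double avg = Arrays.stream(responseTimesSec).average().orElse(0.0);
        return rt90 < constraintSec && avg <= rt90 + 0.1;
    }
}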
For each
Business Transaction, the Driver selects cycle times from a negative
exponential distribution, computed from the following equation:
Tc = -ln(x) * 10
where:
Tc = Cycle Time
ln = natural log (base e)
x = random number with at least 31 bits of precision,
from a uniform distribution such that (0 < x <= 1)
The
distribution is truncated at 5 times the mean. For each
Business Transaction, the Driver measures the Response Time Tr
and computes the Delay Time Td as Td = Tc - Tr. If Td
> 0, the Driver sleeps for this time before beginning the next
Business Transaction. If the chosen cycle time Tc is smaller
than Tr, then the actual cycle time (Ta) is larger than
the chosen one.
The average
actual cycle time is allowed to deviate from the targeted one by no more than 5%.
The Driver checks and reports on whether the cycle time requirements
were met.
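The cycle-time selection and delay computation above can be expressed as the following sketch, which mirrors the equations in the text but is not the Driver's actual source; the class and method names are assumptions.

import java.util.Random;

// Hypothetical implementation of the cycle-time rules above:
// Tc = -ln(x) * 10 with x uniform in (0, 1], truncated at 5 times the mean,
// and delay Td = Tc - Tr (the Driver only sleeps when Td > 0).
final class CycleTime {
    static final double MEAN_SEC = 10.0;

    static double chooseCycleTime(Random rng) {
        double x = 1.0 - rng.nextDouble();   // uniform in (0, 1]
        double tc = -Math.log(x) * MEAN_SEC; // negative exponential distribution
        return Math.min(tc, 5 * MEAN_SEC);   // truncate at 5x the mean
    }

    static double delayTime(double tcSec, double trSec) {
        return Math.max(0.0, tcSec - trSec); // sleep only if Tc exceeds the Response Time Tr
    }
}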
The table below
shows the range of values allowed for various quantities in the
application. The Driver checks and reports on whether these
requirements were met.
Miscellaneous Dealer Requirements

Quantity                                     | Targeted Value | Min. Allowed | Max. Allowed
---------------------------------------------+----------------+--------------+-------------
Average Vehicles per Order                   | 26.6           | 25.27        | 27.93
Vehicle Purchasing Rate (/sec)               | 6.65 * Ir      | 6.32 * Ir    | 6.98 * Ir
Percent Purchases that are Large Orders      | 10             | 9.5          | 10.5
Large Order Vehicle Purchasing Rate (/sec)   | 3.5 * Ir       | 3.33 * Ir    | 3.68 * Ir
Average # of Vehicles per Large Order        | 140            | 133          | 147
Regular Order Vehicle Purchasing Rate (/sec) | 3.15 * Ir      | 2.99 * Ir    | 3.31 * Ir
Average # of Vehicles per Regular Order      | 14             | 13.3         | 14.7
The metric for
the Dealer Domain is Dealer Transactions/sec, defined as the
total count of all Business Transactions successfully completed during
the measurement interval divided by the length of the measurement
interval in seconds.
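In other words, the metric is a simple throughput computation; the names in this sketch are assumptions.

// Hypothetical computation of the Dealer Domain metric described above.
final class DealerMetric {
    static double dealerTransactionsPerSec(long successfulBusinessTransactions,
                                           double measurementIntervalSec) {
        return successfulBusinessTransactions / measurementIntervalSec;
    }
}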
The M_Driver
measures and records the time taken for a work order to complete. Only
successfully completed work orders in the Measurement Interval are
included. At least 90% of the work orders must have a Response Time of
less than 5 seconds. The average Response Time must not be greater
than 0.1 seconds more than the 90% Response Time.
The table below
shows the range of values allowed for various quantities in the
Manufacturing Application. The M_Driver checks and reports on whether
the run meets these requirements.
Miscellaneous Manufacturing Requirements

Quantity                       | Targeted Value | Min. Allowed | Max. Allowed
-------------------------------+----------------+--------------+-------------
LargeOrderline Widget Rate/sec | 3.5 * Ir       | 3.15 * Ir    | 3.85 * Ir
Planned Line Widget Rate/sec   | 3.15 * Ir      | 2.835 * Ir   | 3.465 * Ir
Workload-specific configuration files are
supplied with the harness. All configurable parameters are listed in
these files. For a run to be valid, all the parameters in the
configuration files must be left at default values, except for the
ones that are marked and listed clearly as "Configurable Workload
Properties".
To configure the initial benchmark
environment from scratch, the benchmarker:
To run the benchmark, the benchmarker must:
NOTE: This section is
only applicable to results that have power measurement, which is
optional.
The measurement of power should meet all
the environmental aspects listed in Environmental
Conditions. The SPECvirt_sc2013 benchmark tools provide the
ability to automatically gather measurement data from supported power
analyzers and temperature sensors and integrate that data into the
benchmark result. SPEC requires that the analyzers and sensors used in
a submission be supported by the measurement framework. The provided
tools (or a newer version provided by SPEC) must be used to run and
produce measured SPECvirt_sc2013 results.
The primary metrics, SPECvirt_sc2013_PPW
(performance with SUT power) and SPECvirt_sc2013_ServerPPW
(performance with Server-only power), are performance per watt metrics
obtained by dividing the peak performance by the average power of the SUT
or Server, respectively, during the run measurement phase. For
example, if the SPECvirt_sc2013 result consisted of a maximum of 6
tiles, the power would be calculated as the average power while
serving transactions within all 6 workload tiles.
During
the measurement phase, the SPECvirt_sc2013 prime controller polls each
prime client process associated with each workload in each tile once
every 10 seconds. The prime controller collects and records the
workload polling data which includes performance and QoS measurement
data from the clients. It is expected that in a compliant run
all polling requests are responded to within 10 seconds
(BEAT_INTERVAL). Failure to respond to polling requests may indicate
problems with the clients' ability to issue and respond to workload
requests in a timely manner or accurately record performance.
The
prime controller process verifies that each polling request is
responded to by the prime client processes, and it invalidates the
test if more than one 10-second polling interval is missed during the
test's measurement phase. In that case the test aborts, and the
run is marked as non-compliant.
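A minimal sketch of this polling rule follows; BEAT_INTERVAL is taken from the text, while the class and methods are assumptions, not the prime controller's code.

// Hypothetical compliance check: the prime controller polls each prime client
// every BEAT_INTERVAL seconds; more than one missed 10-second polling interval
// during the measurement phase invalidates the run.
final class PollingCheck {
    static final int BEAT_INTERVAL_SEC = 10;

    static int countMissedIntervals(double[] pollResponseTimesSec) {
        int missed = 0;
        for (double t : pollResponseTimesSec) {
            if (t > BEAT_INTERVAL_SEC) missed++;  // no response within the interval
        }
        return missed;
    }

    static boolean runIsCompliant(double[] pollResponseTimesSec) {
        return countMissedIntervals(pollResponseTimesSec) <= 1;
    }
}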
The reported performance metric,
SPECvirt_sc2013, appears in both Performance/Power and Performance
categories and is derived from a set of compliant results from the
workloads in the suite:
The SPECvirt_sc2013 metric is a
"supermetric" that is the arithmetic mean of the normalized
submetrics for each workload. The metric is output in the format
"SPECvirt_sc2013 <score> @ <# vms> VMs".
The optional reported performance/watt
metrics, SPECvirt_sc2013_PPW and SPECvirt_sc2013_ServerPPW, represent
the peak performance divided by the average power of the SUT and
server respectively during the peak run phase. These metrics
only appear in results in the Performance/Power categories, and the
result must not be compared with results that do not have power
measured. These metrics are output in the format
"SPECvirt_sc2013_PPW <score> @ <# vms> VMs" and
"SPECvirt_sc2013_ServerPPW <score> @ <# vms> VMs"
Please consult the SPEC Fair Use Rule on the
treatment of estimates at http://www.spec.org/fairuse.html#SPECvirt_sc2013.
The report of results for the SPECvirt_sc2013 benchmark
is generated in ASCII and HTML format by the provided SPEC tools.
These tools may not be changed without prior SPEC approval. The tools
perform error checking and flag some error conditions as resulting in
an "invalid run". However, these automatic checks are only there
for debugging convenience, and do not relieve the benchmarker of the
responsibility to check the results and follow the run and reporting
rules.
SPEC reviews and accepts for publication on
SPEC's website only a complete and compliant set of results run and
reported according to these rules. Full disclosure reports of
all test and configuration details as described in these run and
report rules must be made available. Licensees are encouraged to
submit results to SPEC for publication.
All system configuration information
required to duplicate published performance results must be reported.
Tunings not in the default configuration for software and hardware
settings must be reported. All tiles must be tuned identically.
The following SUT hardware components must
be reported:
The
SUT must utilize stable storage. Additionally, the SUT must use stable
and durable storage for all virtual machines (including all
corresponding data drives), such that a single drive failure does not
incur data loss on the VMs. For example: RAID 1, 5, 10, 50, 0+1 are
acceptable RAID levels, but RAID 0 (striping without mirroring or
parity) is not considered durable.
The hypervisor must be able to recover the
virtual machines, and the virtual machines must also be able to
recover their data sets, without loss from multiple power failures
(including cascading power failures), hypervisor and guest operating
system failures, and hardware failures of components (e.g. CPU) other
than the storage medium. At any point where the data can be cached,
after any virtual server has accepted the message and acknowledged a
transaction, there must be a mechanism to ensure any cached data
survives a server failure.
If a UPS is required by the SUT to meet
the stable storage requirement, the benchmarker is not required to
perform the test with a UPS in place. The benchmarker must
state in the disclosure that a UPS is required. Supplying a model
number for an appropriate UPS is encouraged but not required.
If a battery-backed component is used to
meet the stable storage requirement, that battery must have
sufficient power to maintain the data for at least 48 hours to allow
any cached data to be committed to media and the system to be
gracefully shut down. The system or component must also be able to
detect a low battery condition and prevent the use of the caching
feature of the component or provide for a graceful system shutdown.
Hypervisors
are required to safely store all completed transactions for their
virtualized workloads (including in the event of a failure of the
hypervisor's own storage):
The following SUT software components must
be reported:
A brief description of the network
configuration used to achieve the benchmark results is required. The
minimum information to be supplied is:
The following load generator hardware
components must be reported:
The dates of general customer availability (month and year)
must be listed for the major components: hardware and software
(hypervisor, operating systems, and applications). All
the system, hardware, and software features are required to be
generally available on or before the date of publication, or within 3
months of the date of publication (except where precluded by these
rules). When multiple components have different
availability dates, the latest availability date must be listed.
Products are considered generally available
if they are orderable by ordinary customers and ship within a
reasonable time frame. This time frame is a function of the product
size and classification, and common practice. The availability of
support and documentation for the products must coincide with the
release of the products.
Hardware products that are still supported
by their original or primary vendor may be used if their original
general availability date was within the last five years. The
five-year limit is waived for hardware used in clients.
For ease and cost of benchmarking, storage
and networking hardware external to the server such as disks, storage
enclosures, storage controllers and network switches, which were
generally available within the last five years but are no longer
available from the original vendor, may be used. If such end-of-life
(and possibly unsupported) hardware is used, then the test sponsor
represents that the performance measured is no better than 105% of the
performance on hardware available as of the date of publication. The
product(s) and their end-of-life date(s) must be noted in the
disclosure. If it is later determined that the performance using
available hardware is lower than 95% of that reported, the result
shall be marked non-compliant (NC).
Software products that are still supported
by their original or primary vendor may be used if their original
general availability date was within the last three years.
In the disclosure, the benchmarker must
identify any component that is no longer orderable by ordinary
customers.
If pre-release hardware or software is
tested, then the test sponsor represents that the performance measured
is generally representative of the performance to be expected on the
same configuration of the release system. If it is later determined
that the performance using available hardware or software to be lower
than 95% of that reported, the result shall be marked non-compliant
(NC).
SPECvirt_sc2013 does permit Open Source
Applications outside of a commercial distribution or support contract
with some limitations. The following are the rules that govern the
admissibility of the Open Source Application in the context of a
benchmark run or implementation. Open Source Applications do not
include shareware and freeware, where the source is not part of the
distribution.
Submission date | Beginning of time window
----------------+-------------------------
Aug 20, 2019    | Feb 20, 2019
Jul 20, 2019    | Jan 20, 2019
Jun 20, 2019    | Dec 20, 2018
The reporting page must list the date the
test was performed, month and year, the organization which performed
the test and is reporting the results, and the SPEC license number of
that organization.
This section is used to document:
Once you have a compliant run and wish to
submit it to SPEC for review, you need to provide the following:
Once you gather and bundle the required
files listed above, please email SPECvirt_sc2013 submissions to subvirt_sc2013@spec.org.
In order to publicly disclose
SPECvirt_sc2013 results, the submitter must adhere to these reporting
rules in addition to having followed the run rules described in this
document. The goal of the reporting rules is to ensure the SUT is
sufficiently documented such that someone could reproduce the test and
its results.
Compliant runs need to be submitted to SPEC
for review and must be accepted prior to public disclosure.
If public statements using
SPECvirt_sc2013 are made they must follow the SPEC Fair Use Rule (http://www.spec.org/fairuse.html).
Many other SPEC benchmarks allow duplicate
submissions for a single system sold under various names. Each
SPECvirt_sc2013 result from a power-enabled run submitted to SPEC or
made public must be for an actual run of the benchmark on the SUT
named in the result. Electrically equivalent submissions for
power-enabled runs are not allowed unless the system is also
mechanically equivalent (e.g. rebadged).
The submitter is required to run a script
that collects available configuration details of the SUT and all the
virtual machines used for the benchmark, including:
The primary reason for this step is to
ensure that there are not subtle differences that the vendor may
miss.
The
submitter is required to run a script which provides the details
of each VM, its operating system, and application tunings that are not
captured in the SUT configuration collection script, including:
During
a review of the result, the submitter may be required to provide, upon
request, additional details of the VM, operating system and
application tunings and log files that may not be captured in the
above script. These may include, but are not limited to:
The
primary reason for this step is to ensure that the vendor has
disclosed all non-default tunings.
The
submitter is required to run a script which collects the details of
each type or uniquely configured physical and virtual client used,
such that the testbed's client configuration could be
reproduced. Clients that are clones of a specific and documented
type may simply be identified as such; data collection for them is
encouraged but not required. The client collection script should collect
files and output of commands to document the client configuration and
tuning details including:
During
a review of the result, the submitter may be required to provide, upon
request, additional details of the client configuration
that may not be captured in the above script to help document details
relevant to questions that may arise during the review.
The
submitter must submit the Configuration Collection Archive containing
the data (files and command output) described in sections 4.0,
4.1, and 4.2 above, using the high-level directory structure
described below as the foundation:
SPEC provides client driver software, which
includes tools for running the benchmark and reporting its
results. The client drivers are written in Java; precompiled
class files are included with the kit, so no build step is necessary.
Recompilation of the client driver software is not allowed, unless
prior approval from SPEC is given.
This software implements various checks for
conformance with these run and reporting rules; therefore, the SPEC
software must be used as provided. Source code modifications are not
allowed, unless prior approval from SPEC is given. Any such
substitution must be reviewed and deemed "performance-neutral" by the
OSSC.
The kit also includes source code for the
file set generators, script code for the web server, and other
necessary components.
SPECvirt_sc2013 uses modified versions of
SPECweb2005, SPECjAppServer2004, and SPECmail2008 for its virtualized
workloads. For reference, the run rules for those benchmarks are
listed below:
NOTE:
Not all of these run rules are applicable to SPECvirt, but when a
compliance issue is raised, SPEC reserves the right to refer back to
these individual benchmarks' run rules as needed for clarification.
Java® is a registered trademark
of Oracle Corporation.