SPECsip_Infrastructure2011 Run and Reporting Rules
Revision Date: July 20th, 2011
1 Introduction
This document specifies the guidelines for how SPECsip_Infrastructure2011 is to be run for measuring and publicly reporting performance results. These rules abide by the norms laid down by the SPEC SIP Subcommittee and approved by the SPEC Open Systems Steering Committee. They ensure that results generated with this suite are meaningful, comparable to other generated results, and are repeatable (with documentation covering factors pertinent to duplicating the results). Per the SPEC license agreement, all results publicly disclosed must adhere to these Run and Reporting Rules.
1.1 Philosophy
SPEC believes the user community will benefit from an objective series of tests, which can serve as common reference and be considered as part of an evaluation process. SPEC is aware of the importance of optimizations in producing the best system performance. SPEC is also aware that it is sometimes hard to draw an exact line between legitimate optimizations that happen to benefit SPEC benchmarks and optimizations that specifically target the SPEC benchmarks. However, with the list below, SPEC wants to raise the awareness of implementers and end users about unwanted benchmark-specific optimizations that would be incompatible with SPEC's goal of fair benchmarking. SPEC expects that any public use of results from this benchmark suite shall be for Systems Under Test (SUTs) and configurations that are appropriate for public consumption and comparison. Thus, it is also required that:
- Hardware and software used to run this benchmark must provide a suitable environment for handling standard SIP requests and responses.
- Optimizations utilized must improve performance for a larger class of workloads than those defined by this benchmark suite. There must be no benchmark specific optimizations.
- The SUT and configuration are generally available, documented, supported, and encouraged by the providers.
To ensure that results are relevant to end-users, SPEC expects that the hardware and software implementations used for running the SPEC benchmarks adhere to following conventions:
- Proper use of the SPEC benchmark tools as provided.
- Availability of an appropriate full disclosure report.
- Support for all of the appropriate protocols.
1.2 Caveat
SPEC reserves the right to investigate any case where it appears that these guidelines and the associated benchmark run and reporting rules have not been followed for a published SPEC benchmark result. SPEC may request that the result be withdrawn from the public forum in which it appears and that the benchmarker correct any deficiency in product or process before submitting or publishing future results. SPEC reserves the right to adapt the benchmark codes, workloads, and rules of SPECsip_Infrastructure2011 as deemed necessary to preserve the goal of fair benchmarking. SPEC will notify members and licensees if changes are made to the benchmark and will rename the metrics (e.g., from SPECsip_Infrastructure2011 to SPECsip_Infrastructure2011a). Relevant standards are cited in these run rules as URL references, and are current as of the date of publication. Changes or updates to these referenced documents or URLs may necessitate repairs to the links and/or amendment of the run rules. The most current run rules will be available at the SPEC web site at http://www.spec.org. SPEC will notify members and licensees whenever it makes changes to the suite. To help assure that these principles are met, any organization or individual who makes public use of SPEC benchmark results must do so in accordance with the SPEC Fair Use Rule, as posted at http://www.spec.org/fairuse.html. In the case where it appears that these guidelines have not been adhered to, SPEC may investigate and request that the published material be corrected.
2 Run Rules for the SPECsip_Infrastructure2011 Benchmark
The production of compliant SPECsip_Infrastructure2011 test results requires that the tests be run in accordance with these run rules. These rules relate to the requirements for the System Under Test (SUT) and the testbed (i.e. SUT, clients, and network), including applicable protocols or other standards, operation, configuration, test staging, optimizations and measurement.
2.1 Protocols
As the Session Initiation Protocol (SIP) is defined by its interoperative protocol definitions, SPECsip_Infrastructure2011 requires adherence to the relevant protocol standards. It is expected that the SIP server is SIP 2.0 compliant. The benchmark environment shall be governed by the following standards:
- RFC 3261 SIP: Session Initiation Protocol
- RFC 791 Internet Protocol (IPv4) (Standard)
- updated by RFC 1349 Type of Service in the Internet Protocol Suite (Proposed Standard)
- RFC 792 Internet Control Message Protocol (Standard)
- updated by RFC 950 Internet Standard Subnetting Procedure (Standard)
- RFC 768 User Datagram Protocol
- updated by RFC 3168 The Addition of Explicit Congestion Notification (ECN) to IP (Proposed Standard)
- RFC 950 Internet Standard Subnetting Procedure (Standard)
- RFC 1122 Requirements for Internet Hosts - Communication Layers (Standard)
- updated by RFC 2474 Definition of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 Headers. (Proposed Standard)
- RFC 2617 HTTP Authentication: Basic and Digest Access Authentication
- RFC 2327 SDP: Session Description Protocol
- RFC 3264 An Offer/Answer Model with SDP
- RFC 6076 Basic Telephone SIP End-to-End Performance Metrics
For further explanation of these protocols, the following might be helpful:
- RFC 1180 A TCP/IP Tutorial (Informational)
- RFC 2151 A Primer on Internet and TCP/IP Tools and Utilities (Informational)
- RFC 1321 The MD5 Message-Digest Algorithm (Informational)
- RFC 3665 Session Initiation Protocol (SIP) Basic Call Flow Examples
- RFC 4320 Actions Addressing Identified Issues with the Session Initiation Protocol's (SIP) Non-INVITE Transaction
- RFC 4458 Session Initiation Protocol (SIP) URIs for Applications such as Voicemail and Interactive Voice Response (IVR)
The current text of all IETF RFCs may be obtained from http://ietf.org/rfc.html. All marketed standards that a software product states as being adhered to must have passed the relevant test suites used to ensure compliance with those standards.
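For illustration of the digest mechanism referenced above (RFC 2617), the following minimal Java sketch shows the response computation for the common qop="auth" case, as a SIP endpoint would perform it when answering a 401/407 challenge. This sketch is not part of the SPEC tools; the class name and the sample credential, nonce, and URI values are assumptions chosen for illustration only.

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;

    // Illustrative only: computes an RFC 2617 digest response for the qop="auth" case,
    // as a SIP client would when answering a 401/407 challenge from the SUT.
    public class DigestExample {

        // Hex-encoded MD5 of a string, as required by RFC 2617.
        private static String md5Hex(String input) throws NoSuchAlgorithmException {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.ISO_8859_1));
            StringBuilder sb = new StringBuilder(32);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        }

        // response = MD5(HA1 ":" nonce ":" nc ":" cnonce ":" qop ":" HA2), where
        // HA1 = MD5(user ":" realm ":" password) and HA2 = MD5(method ":" uri).
        static String digestResponse(String user, String realm, String password,
                                     String method, String uri,
                                     String nonce, String nc, String cnonce,
                                     String qop) throws NoSuchAlgorithmException {
            String ha1 = md5Hex(user + ":" + realm + ":" + password);
            String ha2 = md5Hex(method + ":" + uri);
            return md5Hex(ha1 + ":" + nonce + ":" + nc + ":" + cnonce + ":" + qop + ":" + ha2);
        }

        public static void main(String[] args) throws NoSuchAlgorithmException {
            // Hypothetical values in the general form used by the benchmark workload.
            String response = digestResponse("user12345678", "sip.spec.org", "password12345678",
                                             "REGISTER", "sip:sip.spec.org",
                                             "abc123nonce", "00000001", "0a4f113b", "auth");
            System.out.println("Digest response: " + response);
        }
    }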
2.2 General Availability
The entire testbed (SUT, clients, and network) must be comprised of components that are generally available, or shall be generally available within three months of the first publication of these results. Products are considered generally available if they are orderable by ordinary customers and ship within a reasonable time frame. This time frame is a function of the product size and classification and common practice. Some limited quantity of the product must have shipped on or before the close of the stated availability window. Shipped products do not have to match the tested configuration in terms of CPU count, memory size, and disk count or size, but the tested configuration must be available to ordinary customers. The availability of support and documentation of the products must be coincident with the release of the products. Hardware products that are still supported by their original or primary vendor may be used if their original general availability date was within the last five years. The five-year limit is waived for hardware used in client systems. Software products that are still supported by their original or primary vendor may be used if their original general availability date was within the last three years. Community supported (open source) software products have more complex requirements; please see Section 3.5.7. Information must be provided in the disclosure to identify any component that is no longer orderable by ordinary customers.
2.3 Stable Storage
The SUT must utilize stable storage for application-specific objects. Application area systems are expected to safely store any application object they have accepted until the application disposition of that object. To do this, application area systems must be able to recover the application objects without loss from multiple power failures (including cascading power failures), operating system failures, and hardware failures of components (e.g. CPU) other than the storage medium itself (e.g. disk, non-volatile RAM). At any point where the data can be cached, after the server has accepted the message and acknowledged its receipt, there must be a mechanism to ensure that any cached message survives the server failure.
- Examples of stable storage include:
- Media commit of data; i.e. the message has been successfully written to the disk media; for example, the disk platter.
- An immediate reply disk drive with battery-backed on-drive intermediate storage or uninterruptible power system (UPS).
- Server commit of data with battery-backed intermediate storage and recovery software.
- Cache commit with uninterruptible power system (UPS).
- Examples which are not considered stable storage:
- An immediate reply disk drive without battery-backed on-drive intermediate storage or uninterruptible power system (UPS).
- Cache commit without uninterruptible power system (UPS).
- Server commit of data without battery-backed intermediate storage and recovery software.
If an uninterruptible power system (UPS) is required by the SUT to meet the stable storage requirement, the benchmarker is not required to perform the test with an UPS in place. The benchmarker must state in the disclosure that an uninterruptible power system (UPS) is required. Supplying a model number of an appropriate UPS is encouraged but not required. If a battery-backed component is used to meet the stable storage requirement, that battery must have sufficient power to maintain the data for at least 48 hours to allow any cached data to be committed to media and the system to be gracefully shut down. The system or component must also be able to detect a low battery condition and prevent the use of the component or provide for a graceful system shutdown.
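As an illustration only of the "media commit" example above, the following Java sketch shows one way a server could force an accepted message to the storage medium before acknowledging receipt. It is not part of the SPEC tools, the class and path names are assumptions, and an actual SUT may satisfy the stable storage requirement through any of the mechanisms listed above.

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    // Illustrative only: append an accepted message to a journal and force it to the
    // storage medium before the server acknowledges receipt to the client.
    public class StableStoreExample {

        private final FileChannel journal;

        public StableStoreExample(Path journalPath) throws IOException {
            this.journal = FileChannel.open(journalPath,
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND);
        }

        // Returns only after both the data and the file metadata have been flushed, so the
        // message can survive a power or OS failure (assuming the drive itself does not
        // silently cache writes without battery backing, per the examples above).
        public void commit(String message) throws IOException {
            journal.write(ByteBuffer.wrap((message + "\n").getBytes(StandardCharsets.UTF_8)));
            journal.force(true); // true = also flush file metadata
        }

        public static void main(String[] args) throws IOException {
            StableStoreExample store = new StableStoreExample(Paths.get("messages.journal"));
            store.commit("INVITE accepted: Call-ID a84b4c76e66710@client.example.invalid");
        }
    }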
2.4 Single Logical Server
The SUT must present to application area clients the appearance and behavior of a single logical server for each protocol. Specifically, the SUT must present a single system view, in that the results of any application area transaction from a client that changes the state on the SUT must be visible to any/all other clients on any subsequent application area transaction. For example, transaction state created by an INVITE can be modified by a subsequent re-INVITE or CANCEL operation, even if that operation originates from a different client than the one that sent the original INVITE. For this reason, the benchmark requires the SUT to expose a single IP address. All components in a SUT will need to be described in a submission disclosure (Section 4 below).
2.5 Application Logging
For a run to be valid, the following attributes related to logging must hold true:
- Each log entry must contain information about each transaction, as specified in the Design Document, namely:
- Transaction type (e.g., INVITE, REGISTER, BYE, CANCEL)
- Final response code (e.g., 200 OK, 487 Canceled)
- From: User Name
- To: User Name
- Call-ID
- Time
- Additional requirements:
- Update frequency: At least once every 60 seconds.
- Format requirements: None, but a program must be provided to convert the log to text if the log is not in ASCII format (an illustrative record layout and converter are sketched after this list).
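The Design Document, not this sketch, defines the required log content; the following Java fragment is for illustration only, showing one possible record layout carrying the fields listed above and a trivial conversion to an ASCII text line. The class name, field layout, and sample values are assumptions.

    import java.time.Instant;

    // Illustrative only: one way to represent a per-transaction log entry carrying the
    // fields required above, plus a trivial conversion to an ASCII text line.
    public class TransactionLogEntry {
        final String transactionType;  // e.g., INVITE, REGISTER, BYE, CANCEL
        final int finalResponseCode;   // e.g., 200
        final String fromUser;
        final String toUser;
        final String callId;
        final Instant time;

        TransactionLogEntry(String transactionType, int finalResponseCode,
                            String fromUser, String toUser, String callId, Instant time) {
            this.transactionType = transactionType;
            this.finalResponseCode = finalResponseCode;
            this.fromUser = fromUser;
            this.toUser = toUser;
            this.callId = callId;
            this.time = time;
        }

        // ASCII representation; a server using a binary log would ship a converter
        // that produces lines of this kind.
        String toAsciiLine() {
            return String.join(" ",
                    time.toString(), transactionType, Integer.toString(finalResponseCode),
                    fromUser, toUser, callId);
        }

        public static void main(String[] args) {
            TransactionLogEntry entry = new TransactionLogEntry(
                    "REGISTER", 200, "user12345678", "user12345678",
                    "a84b4c76e66710@client.example.invalid", Instant.now());
            System.out.println(entry.toAsciiLine());
        }
    }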
2.6 Initializing the SUT for running the Benchmark
To make an official SPECsip_Infrastructure2011 test run, the benchmarker must perform the following steps:
- Populate the SUT with a sufficient number of users, where each user has a username/password combination of the form user12345678 / password12345678 (an illustrative generation sketch follows this list)
- Proxy should be configured to serve the domain sip.spec.org
- Proxy should be configured as a Record-Route: proxy
- Proxy should be configured as a transaction-stateful proxy
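The benchmark kit includes its own Java code for user database generation (see Section 5). As a hedged illustration only, and not the kit's actual format or tooling, the population step described above could look like the sketch below; the output file name and comma-separated layout are assumptions.

    import java.io.IOException;
    import java.io.PrintWriter;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    // Illustrative only: emit username/password pairs of the form required in Section 2.6
    // (user<8 digits> / password<8 digits>) for loading into the SUT's user store.
    public class UserPopulationSketch {
        public static void main(String[] args) throws IOException {
            int subscribers = 20000; // minimum number of supported subscribers for a valid run
            Path out = Paths.get("users.csv"); // hypothetical output file, not the kit's format
            try (PrintWriter w = new PrintWriter(
                    Files.newBufferedWriter(out, StandardCharsets.US_ASCII))) {
                for (int i = 1; i <= subscribers; i++) {
                    String id = String.format("%08d", i);
                    w.printf("user%s,password%s,sip.spec.org%n", id, id);
                }
            }
        }
    }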
2.7 Running the Benchmark
For statistical reasons, SPECsip_Infrastructure2011 requires a minimum of 20,000 supported subscribers to ensure a valid run. Any run with fewer than 20,000 supported subscribers is invalid. The benchmark consists of several periods (the stated minimums are illustrated in a sketch following this list):
- Initial registration period (10 minutes, configurable): This period is for dynamically configuring the SUT with the IP addresses of the clients. SIPp instances send REGISTER messages to the SUT to register their IP address and UDP port information. This information is needed before any client can make a call. The period is set to match what happens during steady state (each client registers every 10 minutes), but may be increased or decreased for convenience.
- Warmup period (configurable, 45 minute minimum): This period is to allow the system to reach steady-state. This period can be increased beyond 45 minutes, but cannot be reduced below it.
- Run period (configurable, 60 minute minimum): This is the measurement interval. All statistics are gathered during this period. The period may be extended beyond 60 minutes if desired, but cannot be reduced below 60 minutes.
- Cool down period (5 minutes, not configurable): This period is to ensure all clients see the same load level at the SUT. Since clients are not tightly synchronized, inevitably some client will be the last to measure throughput and response times. The cool down period ensures that all clients maintain their load level beyond the measurement period so that no client sees a lower load level during its measurement interval.
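As a hedged illustration of the minimums stated in this section (20,000 supported subscribers, a 45 minute warmup, and a 60 minute run period), the following Java sketch shows a simple pre-run check. It is not part of the SPEC tools; the class and method names are assumptions.

    // Illustrative only: validates a proposed run configuration against the minimums
    // stated in Section 2.7 before the benchmark is started.
    public class RunPlanCheck {

        static void validate(int supportedSubscribers, int warmupMinutes, int runMinutes) {
            if (supportedSubscribers < 20000) {
                throw new IllegalArgumentException("Run is invalid: fewer than 20,000 supported subscribers");
            }
            if (warmupMinutes < 45) {
                throw new IllegalArgumentException("Warmup period must be at least 45 minutes");
            }
            if (runMinutes < 60) {
                throw new IllegalArgumentException("Run (measurement) period must be at least 60 minutes");
            }
            // The cool down period is fixed at 5 minutes and the initial registration
            // period defaults to 10 minutes, so neither is checked here.
        }

        public static void main(String[] args) {
            validate(20000, 45, 60); // a minimal compliant plan
            System.out.println("Proposed run plan meets the stated minimums.");
        }
    }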
2.8 Optimization
Benchmark specific optimization is not allowed. Any optimization of either the configuration or software used on the SUT must improve performance for a larger class of workloads than that defined by this benchmark and must be supported and recommended by the provider. Optimizations that take advantage of the benchmark's specific features are forbidden. Examples of inappropriate optimization could include, but are not limited to:
- Registrations may not be permanent; expiration times must be honored.
2.9 Measurement
The provided SPECsip_Infrastructure2011 tools (e.g., binaries, JAR files, perl scripts) must be unmodified and used to run and produce measured SPECsip_Infrastructure2011 results. The SPECsip_Infrastructure2011 metric is a function of the workload, the associated benchmark specific working set, and the defined benchmark specific criteria. SPECsip_Infrastructure2011 results are not comparable to any other application area performance metric.
2.10 Metric
SPECsip_Infrastructure2011 expresses performance in terms of simultaneous number of supported subscribers. The definition of this metric is provided in detail in the Design Document.
2.11 Workload
The SPECsip_Infrastructure2011 workload is described in detail in the Design Document.
2.12 Quality of Service Criteria
The SPECsip_Infrastructure2011 benchmark has specific Quality of Service (QoS) criteria for response times, delivery times and error rates. These criteria are specified in the Design Document and checked for by the benchmark tools.
2.13 Load Generators
The SPECsip_Infrastructure2011 benchmark requires the use of one or more client systems to act as load generators. One client system is designated the harness client, and this system is the one from which the command that initiates the benchmark run is issued. Clients must be instruction-set compatible. Please refer to the User Guide and Design Document for more detail on these roles. A server component of the SUT may not be used as a load generator when testing to produce valid SPECsip_Infrastructure2011 results. A server component may be used as the prime client, but this is not recommended. In order to run the benchmark tools, the client systems must meet any stated requirements, such as software versions or other configuration requirements.
2.14 SPECsip_Infrastructure2011 Parameters
The SPECsip_Infrastructure2011 User's Guide provides detailed documentation on what parameters are available to the user for modification.
3 Reporting Rules
In order to publicly disclose SPECsip_Infrastructure2011 results, the benchmarker must adhere to these reporting rules in addition to having followed the run rules above. The goal of the reporting rules is to ensure the SUT and testbed are sufficiently documented such that someone could reproduce the test and its results.
3.0.1 Publication
SPEC requires that each licensee test location (city, state/province and country) measure and submit a single compliant result for review, and have that result accepted, before publicly disclosing or representing as compliant any SPECsip_Infrastructure2011 result. Only after acceptance of a compliant result from that test location by the subcommittee may the licensee publicly disclose any future SPECsip_Infrastructure2011 result produced at that location in compliance with these run and reporting rules, without acceptance by the SPEC SIP subcommittee. The intent of this requirement is that the licensee test location demonstrates the ability to produce a compliant result before publicly disclosing additional results without review by the subcommittee. SPEC encourages the submission of results for review by the relevant subcommittee and subsequent publication on SPEC's web site. Licensees who have met the requirements stated above may publish compliant results independently; however, any SPEC member may request a full disclosure report for that result and the test sponsor must comply within 10 business days. Issues raised concerning a result's compliance to the run and reporting rules will be taken up by the relevant subcommittee regardless of whether or not the result was formally submitted to SPEC.
3.1 Metrics And Results Reports
The benchmark single figure of merit, SPECsip_Infrastructure2011 Number of Supported Subscribers, is the number of users supported by the system while satisfying the appropriate QoS requirements as described in the Design Document. A complete benchmark result comprises the benchmark specific descriptions shown on the results reporting page. A detailed breakdown of each test is included on the reporting page. The report of results for the SPECsip_Infrastructure2011 benchmark is generated in HTML by the provided SPEC tools. These tools may not be changed, except for portability reasons with prior consent from the SPEC SIP Subcommittee. The tools perform error checking and will flag some error conditions as resulting in an "invalid run". However, these automatic checks are only there for debugging convenience, and do not relieve the benchmarker of the responsibility to check the results and follow the run and reporting rules. The section of the output.raw file that contains actual test measurement must not be altered. Corrections to the SUT descriptions may be made as needed to produce a properly documented disclosure.
3.2 Fair Use of SPECsip_Infrastructure2011 Results
Consistency and fairness are guiding principles for SPEC. To help assure that these principles are met, any organization or individual who makes public use of SPEC benchmark results must do so in accordance with the SPEC Fair Use Rule, as posted at http://www.spec.org/fairuse.html.
3.3 Research and Academic Usage of SPECsip_Infrastructure2011
Please consult the SPEC FairUse Rule on Research and Academic Usage at http://www.spec.org/fairuse.html#Academic
3.4 Categorization of Results
SPECsip_Infrastructure2011 results will be categorized into single node and multiple node results, where the terms single node and multiple node are as defined in this section. Multiple node results are further divided into two types, homogeneous and heterogeneous. Moreover, for multiple node submissions involving homogeneous nodes, the subcommittee will also require a submission on a corresponding single-node platform (see details in the following paragraphs).
A Single Node Platform for SPECsip_Infrastructure2011 consists of one or more processors executing a single instance of an OS and one or more instances of the same SIP server software. Externally attached storage for software may be used; all other performance critical operations must be performed within the single server node. A single common set of NICs must be used to relay all SIP traffic.
A Homogeneous Multi Node Platform for SPECsip_Infrastructure2011 consists of two or more electrically equivalent Single Node servers in a single chassis or connected through a shared bus. Each node contains the same number and type of processing units and devices, and each node executes a single instance of an OS and one or more instances of the same SIP server software. Storage may be duplicated or shared. All incoming requests from the test harness must be load balanced either by a single node that receives all incoming requests and balances the load across the other nodes (A) or by a separate load balancing appliance that serves that function (B). Each node must contain a single common set of NICs that must be used across all 3 workloads to relay all SIP traffic. If a separate load balancing appliance is used, it must be included in the SUT's definition.
A Heterogeneous/Solution Platform for SPECsip_Infrastructure2011 consists of any combination of server nodes and appliances that have been networked together to provide all the performance critical functions measured by the benchmark. All incoming requests from the test harness must be load balanced either by a single node that receives all incoming requests and balances the load across the other nodes or by a separate load balancing appliance that serves that function. Electrical equivalence between server nodes is not required. Storage may be duplicated or shared. Additional appliances that provide performance critical operations, such as intelligent switches, may be included. All nodes and appliances used must be included in the SUT's definition. Examples: C & D.
3.5 Testbed Configuration
The system configuration information that is required to duplicate published performance results must be reported. This list is not intended to be all-inclusive, nor is each performance neutral feature in the list required to be described. The rule is: if it affects performance or the feature is required to duplicate the results, describe it. Any deviations from the standard, default configuration for the SUT must be documented so an independent party would be able to reproduce the result without further assistance. For most of the following configuration details, there is an entry in the configuration file, and a corresponding entry in the tool-generated HTML result page. If information needs to be included that does not fit into these entries, the Notes sections must be used.
3.5.1 SUT Hardware
The SUT hardware configuration must not be changed between workload runs. However, not all hardware used in one workload is required to be used in another. In the case where multiple controllers are used for one workload, the same controllers must be electronically connected, and some subset of those controllers must be used, for the other workloads. The following SUT hardware components must be reported:
- Vendor's name
- System model number, type and clock rate of processor, number of processors, and main memory size.
- Size and organization of primary, secondary, and other cache, per processor. If a level of cache is shared among processors in a system, that must be stated in the notes section of the disclosure.
- Memory configuration if this is an end-user option that may affect performance, e.g. interleaving and access time.
- Other hardware, e.g. write caches, or other accelerators
- Number, type, model, and capacity of disk controllers and drives
- Type of file system
The documentation of the hardware for a result in the Heterogeneous/Solution Platform category must also include a diagram of the configuration.
3.5.2 SUT Software
The following SUT software components must be reported:
- SIP Server software and version.
- Operating System and version.
- Any other software packages used during the benchmarking process.
- Other clarifying information as required to reproduce benchmark results; e.g. number of daemons, server buffer cache size, disk striping, non-default kernel parameters, and logging mode, must be stated in the notes section of the disclosure.
- Additionally, the submitter must be prepared to make available a description of each of the tuning features that were utilized (e.g. kernel parameters, SIP software settings, etc.) including the purpose of that tuning feature. Where possible, it must be noted how the values used differ from the default settings for that tuning feature.
- Any software compiled by the submitter must supply the compiler version and non-default compiler tuning options used to build it.
3.5.3 Network Configuration
A brief description of the network configuration used to achieve the benchmark results is required. The minimum information to be supplied is:
- Number, type, and model of network controllers
- Number and type of networks used
- Base speed of network
- Number, type, model, and relationship of external network components to support SUT (e.g., any external routers, hubs, switches, etc.)
- A network configuration notes section may be used to list the following additional information:
- Number, type, model, and relationship of external network components to support SUT (e.g., any external routers, hubs, switches, etc.).
- Relationship of load generators, load generator type, and networks (including routers, hubs, switches, etc.) -- in short: which load generators are connected to which LAN segments. For example: "clients 1 and 2 on one ATM-622, clients 3 and 4 on second ATM-622, and clients 5, 6, and 7 each on their own 100TX segment."
3.5.4 Client Workload Generators
The following load generator hardware components must be reported:
- Number of client load generator systems
- System model number, processor type and clock rate, number of processors
- Main memory size
- Network Controller
- Operating System and Version
- JVM version used to run client
- Client weighting if the load is not distributed evenly across all clients
- Other performance critical Hardware
- Other performance critical Software
3.5.5 Configuration Diagram (if applicable)
A Configuration Diagram of the SUT must be provided in a common graphics format (e.g. .png, .jpeg, .gif). This will be included in the html formatted results page. An example would be a line drawing that provides a pictorial representation of the SUT including the network connections between clients, server nodes, switches and the storage hierarchy and any other complexities of the SUT that can best be described graphically.
3.5.6 General Availability Dates
The dates of general customer availability must be listed for the major components: hardware, server software, and operating system, by month and year. All the system, hardware, and software features are required to be available within three months of the first publication of these results. With multiple components having different availability dates, the latest availability date must be listed. If pre-release hardware or software is tested, then the test sponsor represents that the performance measured is generally representative of the performance to be expected on the same configuration of the released system. If the sponsor later finds the performance of the released system to be more than 5% lower than that reported for the pre-release system, then the sponsor shall resubmit a corrected test result. Hardware products that are still supported by their original or primary vendor may be used if their original general availability date was within the last five years. The five-year limit is waived for hardware used in client systems. Software products that are still supported by their original or primary vendor may be used if their original general availability date was within the last three years. In the disclosure, the benchmarker must identify any component that can no longer be ordered by ordinary customers.
3.5.7 Rules on Community Supported Applications
In addition to the requirements stated in the OSG Policy Document (http://www.spec.org/osg/policy.html), the following guidelines apply to submissions that rely on Community Supported Applications. SPECsip_Infrastructure2011 does permit Community Supported Applications outside of a commercial distribution or support contract, provided they meet the following guidelines. The following 8 items are the rules that govern the admissibility of any Community Supported Application executed on the SUT in the context of a benchmark run or implementation:
- Open Source operating systems or hypervisors would still require a commercial distribution and support. The following rules do not apply to Operating Systems used in the publication.
- Only a "stable" release can be used in the benchmark environment; “non-stable" releases (alpha, beta, or release candidates) cannot be used. A stable release must be unmodified source code or binaries as downloaded from the Community Supported site. A "stable" release is one that is clearly denoted as a stable release or a release that is available and recommended for general use. It must be a release that is not on the development fork, not designated as an alpha, beta, test, preliminary, pre-released, prototype, release-candidate, or any other terms that indicate that it may not be suitable for general use. The 3 month General Availability window (outlined above) does not apply to Community Supported Applications, since volunteer resources make predictable future release dates unlikely.
- The initial "stable" release of the application must be a minimum of 12 months old. Reason: This helps ensure that the software has real application to the intended user base and is not a benchmark special that's put out with a benchmark result and only available for the first three months to meet SPEC's forward availability window.
- At least two additional stable releases (major, minor, or bug fix) must have been completed, announced and shipped beyond the initial stable release. Reason: This helps establish a track record for the project and shows that it is actively maintained.
- The application must use a standard open source license such as one of those listed at http://www.opensource.org/licenses/.
- The "stable" release used in the actual test run must be the current stable release at the time the test result is run or the prior "stable" release if the superseding/current "stable" release will be less than 3 months old at the time the result is made public.
- The "stable" release used in the actual test run must be no older than 18 months. If there has not been a "stable" release within 18 months, then the open source project may no longer be active and as such may no longer meet these requirements. An exception may be made for mature projects (see below).
- In rare cases, open source projects may reach maturity where the software requires little or no maintenance and there may no longer be active development. If it can be demonstrated that the software is still in general use and recommended either by commercial organizations or active open source projects or user forums and the source code for the software is less than 20,000 lines, then a request can be made to the subcommittee to grant this software mature status. This status may be reviewed semi-annually. An example of a mature project would be the FastCGI library.
3.5.8 Test Sponsor
The reporting page must list:
- Organization which is reporting the results
- SPEC license number of that organization
- Date the test was performed, by month and year
3.5.9 Disclosure Notes
The Notes section is used to document:
- System tuning parameters other than default.
- Process tuning parameters other than default.
- MTU size of the network used.
- Background load, if any.
- Any portability changes made to the individual benchmark source code including module name, line number of the change.
- Information such as compilation options must be listed if the user is required to build the software from sources.
- Critical customer-identifiable firmware or option versions such as network and disk controllers.
- Additional important information required to reproduce the results from other reporting sections that require a larger text area.
- Any supplemental drawings or detailed written descriptions or pointers to same, that may be needed to clarify some portion of the SUT.
- Definitions of tuning parameters may be included or a pointer supplied to a separate document.
- Part numbers or sufficient information that would allow the end user to order the SUT configuration if desired.
- Identification of any components used that are supported but that are no longer orderable by ordinary customers.
In general, any changes or tuning to the system should be documented in order to support reproducibility.
3.6 Log File Review
The following additional information may be required to be provided for SPEC's results review:
- The SIP server log in any format, as specified by the SPECsip_Infrastructure2011 Design Document
- A program to display the log in ASCII text form, such as a perl script, in source form.
The submitter is required to keep the entire log file from the SUT for the duration of the review period.
4 Submission Requirements for SPECsip_Infrastructure2011
Once you have a valid run and wish to submit it to SPEC for compliance review, you will need to provide the following:
- The combined output raw file containing ALL the information outlined in section 3.
- Any additional supplemental information, such as configuration diagrams, tuning descriptions or additional configuration information that helps explain the SUT but did not fit within the report format.
- Log files from the run upon request.
Once you have the submission ready, please email the SPECsip_Infrastructure2011 submission to subsipinf2011@spec.org. Retain the following for possible request during the review:
- The SUT's application specific log files from the run in ASCII format
SPEC requires the submission of results for review by the SPEC SIP subcommittee and subsequent publication on SPEC's web site. Estimates are not allowed.
5 The SPECsip_Infrastructure2011 Benchmark Kit
SPEC provides client driver software, which includes tools for running the benchmark and reporting its results. This client driver is written in Java; precompiled class files are included with the kit, so no build step is necessary. This software implements various checks for conformance with these run and reporting rules. Therefore the SPEC software must be used. The kit also includes Java code for the user database generation, C code for SIPp, and perl code for post-processing the results output.