SPEC OSG Frequently Asked Questions
Contents:
- OSG General
- Where can I get information about "SPEC" (not just "OSG")?
- Which system/CPU/component should I purchase?
- OSG Results
- Why can't I find a result for "XYZ"?
- What happened to the result for "XYZ"?
- How can I publish my benchmark results on the SPEC web site?
- What SPEC rating is required to support "XYZ" application?
- Why don't your results match what I see with my application?...
- How can I convert a SPECfp95 into MFLOPS?
- What is a non-compliant result?
- What does a designation of "CD" mean?
- What rules/guidelines should individuals or organizations observe when making public claims using SPEC/OSG benchmark results?
- What's going on with the #CPU and #Processors fields in various SPEC benchmarks?
- OSG CPU2000
- What is SPEC CPU2000?
- What components do CINT2000 and CFP2000 measure?
- Is there a way to translate SPEC CPU95 results to SPEC CPU2000 results or vice versa?
- Where can I find more information about CPU2000?
- OSG jAppServer2002
- What is SPECjAppServer2002?
- How is SPECjAppServer2002 different than SPECjAppServer2001?
- What is the performance metric for SPECjAppServer2002?
- Other SPEC benchmarks do not have a price/performance metric. Can you explain why SPECjAppServer2002 has a price/performance metric?
- Can I compare SPECjAppServer2002 results with other results?
- Where can I go for more information?
- OSG jAppServer2001
- What is SPECjAppServer2001?
- Why are you releasing two incomparable benchmarks in the span of three months?
- What is the performance metric for SPECjAppServer2001?
- Other SPEC benchmarks do not have a price/performance metric. Can you explain why SPECjAppServer2001 has a price/performance metric?
- Can I compare SPECjAppServer2001 results with other results?
- Where can I go for more information?
- OSG JBB2000
- What is SPECjbb2000?
- What specific aspects of performance does SPECjbb2000 measure?
- What metrics does SPECjbb2000 use to report performance?
- Do you provide source code for the benchmark?
- Where can I find more information about JBB2000?
- OSG JVM98
- What is SPECjvm98?
- What specific aspects of performance does SPECjvm98 measure?
- What programs make up the JVM98 test suite?
- Which JVM98 tests represent real applications?
- Where can I find more information about JVM98?
- OSG Mail2001
- What is SPECmail2001?
- What is the performance metric for SPECmail2001?
- What are the limitations of SPECmail2001?
- Can I use SPECmail2001 to determine the size of the mail server I need?
- Where can I find more information about Mail2001?
- OSG SFS97_R1
- What is SPEC SFS97_R1?
- Does this benchmark replace the SPEC SFS 2.0 suite?
- Can SPEC SFS 3.0 results be compared with earlier results?
- Where can I find more information about SFS97_R1?
- OSG Web99
- What is SPECweb99?
- What is SPECweb99 Release 1.01?
- What does SPECweb99 measure?
- What if I have a problem building/running the SPECweb99 benchmark?
- Where can I find more information about Web99?
- OSG Web99_SSL
- What is SPECweb99_SSL?
- What does SPECweb99_SSL measure?
- Can SPECweb99_SSL results be meaningfully compared to SPECweb99 results?
- Where can I find more information about Web99_SSL?
- OSG Embedded
- Does SPEC/OSG have any benchmarks for embedded systems?
-
Basic information about SPEC can be found in the general SPEC area.
In particular, there is a SPEC FAQ that answers questions about SPEC as an organization, as opposed to questions about OSG's benchmarks.
-
Unfortunately, SPEC does not make recommendations on computer purchases.
SPEC will provide what data it can (all available from the results index) but cannot offer opinions on it. Without knowing a great deal about your exact workload, it would be difficult to provide any useful specific advice, and providing general advice could get us all into legal trouble.
-
The results that are available on this web site are ones that have been
submitted to SPEC.
SPEC does not run any benchmarks itself; we develop the benchmarks,
we review and publish results submitted to us, but we do not have
control over which companies participate nor which systems or
configurations are submitted.
Many companies quote SPEC results. The licensing of the SPEC benchmark products allows companies to quote and publish results on their own, so long as they adhere to all of the run and reporting rules for that benchmark. There is no requirement that these results be submitted to SPEC.
However, only those results available on this web site have been reviewed by SPEC. Any result not available here has not been reviewed.
SPEC cannot speak for the validity of any result published elsewhere; the correctness and completeness of any such result is the responsibility of the licensee who made the claim.
If there are results that you would like to see here, contact the
company responsible and suggest that they submit results to SPEC.
SPEC cannot require a result to be reviewed by us, but many vendors
will submit if there is sufficient end-user demand.
-
It is possible that a result that was available a short while ago
may no longer appear in SPEC's listings.
It is SPEC's policy to be an archive for all results published by SPEC;
however, on the rare occasions when an issue with rules compliance is
discovered after publication, SPEC may determine that the affected result(s)
need to be withdrawn.
Note that lack of compliance can be due to several factors (e.g., oversight, unexpected benchmark behavior).
If it is discovered that a result may not be in full compliance with the
necessary Run and Reporting Rules for that benchmark, the issue is brought
before the appropriate technical committee. That technical committee reviews
the issues related to compliance and makes a recommendation to the proper
steering committee, which may then choose to have the result(s) withdrawn.
If a result is to be withdrawn, it is removed from all of the affected listings
and the full disclosures are replaced with files explaining that this result
has been withdrawn because of compliance issues.
It is expected that the test sponsors will submit new results in the near future, once they have resolved these issues.
Thankfully, this is not a common occurrence.
-
Benchmark licensees may submit results for publication on the SPEC web site. Please review the general guidelines and the review calendar; specific information on how to submit results can be found in each benchmark's run and reporting rules and user guide documentation.
-
The methodology of determining the computer systems needed to support a
particular workload is capacity planning. This is a different, though related, field from comparative benchmarking. Capacity planning requires the
adaptation of a model to fit the behavior of the desired workload for
accurate projections. Comparative benchmarking requires the explicit
standardization of a benchmark workload (or model) for accurate
comparisons. These different directions can lead to contradictory needs.
Good capacity planning models require a very good understanding of the
aspects of your workload. It is important to understand that every workload
is unique to some extent; there may be commonalities between similar tasks
and applications, but there are often significant differences in behavior
from minor differences in usage.
Some basic questions to understand before making capacity projections:
- Are you commonly CPU bound? I/O bound? Memory constrained?
- What are the I/O requirements?
- What kind of system services are used, and how often?
- Does the behavior remain consistent as the workload size is increased?
Once you understand these kinds of issues with your own applications and workloads, you can seek out available data that can be used as input to your model. SPEC benchmark results might be useful components for some parts of a capacity planning model.
For example, assume you have a UNIX system rated at 3.3 SPECint95. How can you test your application to determine at what point you will need more compute capacity, and at what point you will need to move to a system with a higher SPECint95 rating?
Normally, SPECint95 alone would not be sufficient. You might, for instance, run out of memory, I/O capacity, or network capacity before your application ran out of CPU capacity.
You can use SPECint95 to determine rough bounds on the problem. For example, if your 3.3 SPECint95 system supports an average of 4 users with response time you consider adequate, and you plan to grow to an average of 8 users, then you might conclude that you will need something on the order of 6.6 SPECint95.
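The rough-bound arithmetic above is simply a linear scaling of the benchmark rating by the expected load growth. Here is a minimal sketch of that calculation; the function name is made up for illustration, and the 3.3 rating and user counts are just the example values from the paragraph above:

    def rough_specint_bound(current_rating, current_users, planned_users):
        # Linearly scale the benchmark rating by expected user growth.
        # This is only a rough bound: memory, I/O, or network capacity
        # may run out before CPU capacity does.
        return current_rating * (planned_users / current_users)

    # Example from the text: a 3.3 SPECint95 system serving an average of
    # 4 users, planned to grow to an average of 8 users.
    print(rough_specint_bound(3.3, 4, 8))   # prints 6.6

Keep in mind that this kind of scaling assumes the workload's behavior stays consistent as it grows, which is one of the basic questions listed above.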
Often the biggest challenge in capacity planning is uncertainty in future
workloads. You may carefully plan for growth of your existing workload,
correlate your applications with individual benchmark measures, do analytic
or simulation modeling to account for non-uniform workload demand, and
still miss the mark. The reason? Some new application comes along, or you
discover a new use for an existing application that becomes indispensable to your business but consumes more computer resources than you planned for.
Most vendors offer capacity planning services and information, often
focused primarily on their own systems. You can ask your vendors' sales representatives about this. There are also commercial products and
consulting firms that can assist you in this work. A web search
turns up lots of books on capacity planning. User groups are also good
sources of capacity planning information.
-
Hmmmm... the best benchmark is always your own application.
The SPEC CPU benchmarks do provide a comparative measure of performance
of the processor, memory, and compiler on a known workload,
but they may not represent your particular situation.
Before drawing conclusions, there are additional questions I would investigate:
- What exactly are your hardware configurations, and are they the same as those reported on the SPEC result page (size of memory, size of caches, etc.)?
- Is the number of processors in your system the same as reported on the SPEC pages (remember that some vendors report results using parallelizing compilers for FP tasks on MP machines, etc.)?
- What exactly are the versions of the OS and compiler, and are they the same (particularly the version number) as those reported on the SPEC page?
- Do you use a preprocessor where the SPEC result uses a preprocessor?
- What optimization flags were used for your application vs. those used for the SPEC benchmarks? Did you use feedback-directed optimization? Did you use a parallelizing compiler?
- What are the characteristics of your workload? Programming language? Memory usage?
It may turn out to be the case that your workload is nothing like SPEC's.
SPEC considers the programs in CPU95 to be "every day" compute-intensive applications in the technical arena (by nature, the FP benchmarks may be more field-specific than the integer benchmarks; e.g., gcc, perl, and compress are probably more commonly used than tomcatv and su2cor).
If you have application source code that you feel represents the type of work
you do, would you be willing to donate it to SPEC
(and possibly work with SPEC to turn it into a benchmark)?
SPEC is doing research for
the next version of the CPU benchmarks,
and any input/assistance would be appreciated.
-
Short answer: SPEC does not have a conversion factor for SPECfp95 to "MFLOPS".
Longer answer: "MFLOPS" can be a misleading term; for example:
- what workload is being referred to?
- under what conditions (peak, theoretical, sustained, average, etc.)?
- what counts as a floating point operation (loads, stores, intermediate steps, etc.)?
SPEC does not even calculate or provide FLOP information for the CFP95 codes. Remember, these are full applications, not FP kernel loops; thus there are wide variations in the FLOP count during a run of one of these applications. Overall averages would badly underestimate, sampling would lead to unstable predictions, and the peak rates may or may not be sustainable in other situations.
Because of the vagueness of the term,
SPEC does not encourage users to try to convert measurements to these units.
Instead, SPEC provides points of comparison of a system
(CPU, memory, compiler, etc.) on known, standardized workloads.
-
The SPEC OSG benchmarks are governed by Run and Reporting Rules and a SPEC license agreement. Results generated in a manner that is not consistent with the policies in those documents are considered non-compliant results. Non-compliant results are not comparable to results generated in accordance with the Run and Reporting Rules and the license agreement.
In some cases an issue of compliance may not be discovered until after a result has already been made public on SPEC's web site. If it is discovered that a non-compliant result has been published on the SPEC web site, the following occurs:
- the numeric values/graphs on the web pages for the result are listed as "NC" for non-compliant.
- the web pages are annotated with a statement declaring that the result was found to be non-compliant and the reason for that declaration.
- Additional information (for example, a pointer to a result that addresses the non-compliance) may also be part of that annotation.
If it is discovered that a non-compliant result has been published someplace other than the SPEC web site, SPEC will take action appropriate for enforcing the applicable SPEC license agreement. Suspected issues with non-compliant results may be brought to SPEC's attention by contacting SPEC at info@spec.org.
-
"CD" stands for "code defect".
Results marked with a "CD" designation are not
comparable to any other results for that benchmark due to
identified issues within the benchmark code itself. If
results using this code have been published on the SPEC web
site, the following occurs:
- The numeric values/graphs on the web pages for the result are
listed as "CD" for "code defect".
- The web pages are annotated with a statement declaring that the
result was generated with the defective code.
- Additional information (for example, a pointer to a result
generated with updated code) may also be part of that annotation.
The "CD" designation does not imply any non-compliance with benchmark's
run and reporting rules.
-
Please see the SPEC/OSG Fair Use Rule
for the guidelines that must be observed when making public claims using
SPEC benchmark results.
-
In the interest of providing full disclosure information, SPEC's Open Systems
Group has adapted the #CPU field to contain additional information for systems
using multi-core and multi-threaded microprocessors. In this new format the
#CPUs field identifies:
# cores, # chips, # cores/chip [(CPU configuration details)]
where:
- The term "core" is used to identify the core set of architectural,
computational processing elements that provide the functionality of a CPU.
- The term "chip" identifies the actual microprocessor, the physical
package containing one or more "cores".
- If applicable, a comment may be appended to identify any additional CPU
configuration details such as enabling or disabling on-chip threading or the number
of on-chip threads present.
The intent of using these new terms is to recognize that the terms "CPU"
and "processor" have become overloaded terms as microprocessor architectures
have evolved. A simple integer value may no longer provide sufficient or consistent
accounting of the available processing units in a system.
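As a purely illustrative sketch (this helper and its example values are hypothetical and not part of any SPEC tool), the new field could be assembled like this:

    def format_cpu_field(cores, chips, details=None):
        # Build a "#CPUs" string in the "N cores, N chips, N cores/chip" format,
        # with an optional parenthesized CPU-configuration comment.
        field = f"{cores} cores, {chips} chips, {cores // chips} cores/chip"
        if details:
            field += f" ({details})"
        return field

    # Hypothetical system: 8 dual-core chips with on-chip threading disabled.
    print(format_cpu_field(16, 8, "on-chip threading disabled"))
    # prints: 16 cores, 8 chips, 2 cores/chip (on-chip threading disabled)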
A few recent results are already using the new format and previously
published results for multi-core and multi-threaded processors are in the
process of being converted to the new format by their submitters. SPEC's
search and sorting tools will also be updated to support this new format.
This style of reporting will apply to all currently released SPEC OSG
benchmarks. We expect that in future benchmarks, such as the next CPU suite,
we will have additional disclosure categories that will better describe current
and future microprocessor architectures and their performance critical features.
-
SPEC CPU2000 is a software benchmark product produced by the Standard
Performance Evaluation Corp. (SPEC), a non-profit group that includes
computer vendors, systems integrators, universities, research organizations,
publishers and consultants from around the world. It is designed to provide
performance measurements that can be used to compare compute-intensive workloads
on different computer systems.
SPEC CPU2000 contains two benchmark suites: CINT2000 for measuring and comparing
compute-intensive integer performance, and CFP2000 for measuring and comparing
compute-intensive floating point performance.
-
Being compute-intensive benchmarks, they measure performance
of the computer's processor, memory architecture and compiler.
It is important to remember the contribution of the latter two
components -- performance is more than just the processor.
-
There is no formula for converting CPU95 results to CPU2000 results and vice versa; they
are different products. There probably will be some correlation between CPU95 and CPU2000
results (i.e., machines with higher CPU95 results often will have higher CPU2000 results),
but there is no universal formula for all systems.
SPEC strongly encourages SPEC licensees to publish CPU2000 numbers on older platforms to provide
a historical perspective on performance.
-
Information about CPU2000, including a FAQ, benchmark
descriptions, and user documentation, is available at
http://www.spec.org/cpu2000/.
-
SPECjAppServer2002 is an industry standard benchmark
designed to measure the performance of J2EE application
servers.
-
SPECjAppServer2001 adheres to the EJB 1.1 specification while
SPECjAppServer2002 adheres to the EJB 2.0 specification. There
are four main differences in the implementation of the
SPECjAppServer2002 benchmark:
- SPECjAppServer2002 has been converted to use the EJB 2.0
style CMP (Container Managed Persistence) entity beans.
- SPECjAppServer2002 takes advantage of the local interface
features in EJB 2.0.
- SPECjAppServer2002 utilizes CMR (Container Managed Relationships)
between the entity beans.
- SPECjAppServer2002 uses EJB-QL in the deployment descriptors.
-
SPECjAppServer2002 expresses performance in
terms of two metrics:
- TOPS (Total Operations Per Second), which is the number of order transactions plus the number of manufacturing work orders divided by the measurement period in seconds (see the sketch after this list).
- price/TOPS, which is the price of the System Under Test (including hardware, software, and support) divided by the TOPS.
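To make the arithmetic behind the two metrics concrete, here is a minimal sketch; the run data and system price below are invented for illustration and do not come from any published SPECjAppServer2002 result:

    def tops(order_transactions, manufacturing_work_orders, measurement_period_seconds):
        # Total Operations Per Second: order transactions plus manufacturing
        # work orders, divided by the measurement period in seconds.
        return (order_transactions + manufacturing_work_orders) / measurement_period_seconds

    def price_per_tops(system_price, tops_value):
        # Price of the System Under Test divided by TOPS.
        return system_price / tops_value

    # Invented example: 90,000 orders and 54,000 work orders over a
    # 1,800-second measurement period on a $200,000 configuration.
    t = tops(90_000, 54_000, 1_800)        # 80.0 TOPS
    print(t, price_per_tops(200_000, t))   # 80.0 TOPS at 2500.0 $/TOPS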
-
SPECjAppServer2002 traces its lineage to ECperf, which was developed under the JCP process. SPEC committees debated the inclusion of this metric for the SPECjAppServer2001 and SPECjAppServer2002 benchmarks. It was decided that SPEC would do this on an experimental basis, that the experiment would expire at the conclusion of the review cycle ending 5/03/2003, and that SPECjAppServer2002 would be retired. The cost/benefit and success of this experiment will determine whether this benchmark will be continued.
-
SPECjAppServer2002 results may not be compared with SPECjAppServer2001
results, as there are different optimization opportunities and constraints
in the two EJB specifications (SPECjAppServer2001 adheres to the EJB 1.1
specification while SPECjAppServer2002 adheres to the EJB 2.0 specification).
Additionally, SPECjAppServer2002 results cannot be compared with other
benchmark results, such as ECperf 1.0, ECperf 1.1, TPC-C,
TPC-W, or other SPEC benchmarks. See the
SPECjAppServer2002 FAQ
for further explanation.
-
SPECjAppServer2002 documentation consists mainly of four documents:
User Guide,
Design Document,
Run and Reporting Rules,
and the FAQ. The documents
can be found in the benchmark kit or at
http://www.spec.org/jAppServer2001/.
-
SPECjAppServer2001 is an industry standard benchmark designed to
measure the performance of J2EE application servers. This benchmark
was derived from ECperf
which was developed under the
Java Community Process (JCP).
-
The two benchmarks will be very similar except for the EJB specification
used. While the EJB 2.0 specification is complete, there are vendors who are not
able to publish benchmark results using the EJB 2.0 specification yet. There are
also vendors who will not be able to publish benchmark results using the EJB 1.1
specification. To allow vendors in either situation to publish results it was decided
to release two benchmarks, one supporting each specification. The two benchmarks are incomparable because there are different optimization opportunities and constraints in the two EJB specifications.
-
SPECjAppServer2001 expresses performance in
terms of two metrics:
- BOPS (Business Operations Per Second), which is the number of order transactions plus the number of manufacturing work orders divided by the measurement period in seconds.
- price/BOPS, which is the price of the System Under Test (including hardware, software, and support) divided by the BOPS.
-
SPECjAppServer2001 traces its lineage to ECperf, which was developed under the JCP process. SPEC committees debated the inclusion of this metric for the SPECjAppServer2001 benchmark. It was decided that SPEC would do this on an experimental basis, that the experiment would expire at the conclusion of the review cycle ending 5/03/2003, and that SPECjAppServer2001 would be retired. The cost/benefit and success of this experiment will determine whether this benchmark will be continued.
-
SPECjAppServer2001 results cannot be compared with other
benchmark results, such as ECperf 1.0, ECperf 1.1, TPC-C,
TPC-W, or other SPEC benchmarks. See the
SPECjAppServer2001 FAQ
for further explanation.
-
SPECjAppServer2001 documentation consists mainly of four documents:
User
Guide,
Design Document, Run and Reporting Rules, and
the FAQ. The documents can be found in the
benchmark kit or at http://www.spec.org/jAppServer2001/.
-
SPECjbb2000 is a Java program emulating a 3-tier system with emphasis on the
middle tier. Random input selection represents the 1st tier user interface.
SPECjbb2000 fully implements the middle tier business logic. The 3rd tier
database is replaced by binary trees.
SPECjbb2000 is inspired by the TPC-C benchmark and loosely follows the TPC-C
specification for its schema, input generation, and transaction profile.
SPECjbb2000 replaces database tables with Java classes and replaces data records
with Java objects. The objects are held by either binary trees (also Java objects)
or other data objects.
SPECjbb2000 runs in a single JVM in which threads represent terminals in a warehouse.
Each thread independently generates random input (tier 1 emulation) before calling
transaction-specific business logic. The business logic operates on the data held in
the binary trees (tier 3 emulation). The benchmark does no disk I/O or network I/O.
-
SPECjbb2000 measures the performance of the Java Virtual Machine (JVM) implementation, the just-in-time (JIT) compiler, garbage collection, threads, and some aspects of the operating system. It also measures the performance of CPUs, caches, the memory hierarchy, and the scalability of shared-memory multiprocessor (SMP) platforms on the specified commercial workload.
-
SPECjbb2000 ops/second is a composite throughput measurement representing
the averaged throughput over a range of points. It is described in detail
in the document "SPECjbb2000 Run
and Reporting Rules."
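As a rough illustration of what "averaged throughput over a range of points" means, here is a sketch; the per-point throughputs and the averaging range below are invented, and the actual range used for the metric is defined in the Run and Reporting Rules:

    def composite_ops_per_second(throughput_by_point, averaging_points):
        # Average the measured ops/second over the points included in the metric.
        selected = [throughput_by_point[p] for p in averaging_points]
        return sum(selected) / len(selected)

    # Invented per-point throughputs (ops/second), keyed by warehouse count.
    throughput = {1: 4_000, 2: 7_500, 3: 10_200, 4: 9_800, 5: 9_100, 6: 8_400}
    # Invented averaging range; see the Run and Reporting Rules for the real one.
    print(composite_ops_per_second(throughput, averaging_points=[3, 4, 5, 6]))  # 9375.0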
-
Yes, but you are required to run with the jar files provided
with the benchmark. Recompilation is forbidden in the run rules
and will invalidate your results.
-
Documentation, the design document, a FAQ page, and submitted results may all be found at http://www.spec.org/jbb2000/.
-
SPECjvm98 is a benchmark suite that measures performance for Java
virtual machine (JVM) client platforms. It contains eight different tests,
five of which are real applications or are derived from real applications.
Seven tests are used for computing performance metrics. One test validates
some of the features of Java, such as testing for loop bounds.
-
SPECjvm98 measures the time it takes to load the program, verify the
class files, compile on the fly if a just-in-time (JIT) compiler is used,
and execute the test. From the software perspective, these tests measure
the efficiency of JVM, JIT compiler and operating system implementations on
a given hardware platform. From the hardware perspective, the benchmark
measures CPU (integer and floating-point), cache, memory, and other
platform-specific hardware performance.
-
The following eight programs make up the JVM98 test suite:
- _200_check - checks JVM and Java features
- _201_compress - a popular utility used to compress/uncompress files
- _202_jess - a Java expert system shell
- _209_db - a small data management program
- _213_javac - the Java compiler, compiling 225,000 lines of code
- _222_mpegaudio - an MPEG-3 audio stream decoder
- _227_mtrt - a dual-threaded program that ray traces an image file
- _228_jack - a parser generator with lexical analysis
-
- _202_jess - a Java version of NASA's popular CLIPS rule-based expert
system; it is distributed freely by Sandia National Labs at
http://herzberg.ca.sandia.gov/jess/
- _201_compress - a Java version of the LZW file compression utilities in
wide distribution as freeware.
- _222_mpegaudio - an MPEG-3 audio stream decoder from Fraunhofer Institut
fuer Integrierte Schaltungen, a leading international research lab involved
in multimedia standards. More information is available at
http://www.iis.fhg.de/audio
- _228_jack - a parser generator from Sun Microsystems, now named the Java
Compiler Compiler; it is distributed freely at:
http://www.suntest.com/JavaCC/
- _213_javac - a Java compiler from Sun Microsystems that is distributed
freely with the Java Development Kit at:
http://java.sun.com/products
-
Documentation, a FAQ page, and submitted results may be found at
http://www.spec.org/jvm98/.
-
SPECmail2001 is an industry standard benchmark designed to measure a system's ability
to act as a mail server compliant with the Internet standards Simple Mail Transfer
Protocol (SMTP) and Post Office Protocol Version 3 (POP3). The benchmark models consumer
users of an Internet Service Provider (ISP) by simulating a real world workload. The goal
of SPECmail2001 is to enable objective comparisons of mail server products.
-
SPECmail2001 expresses performance in terms of SPECmail2001 messages
per minute (MPM). For example:
- Messages per minute = ((messages sent per day * number of users) * peak hour percentage) / 60 minutes (see the sketch after this list).
- MPM can be translated into a user count as follows: one MPM equals 200 SPECmail users. For example: 1,000 users = 5 MPM, 10,000 users = 50 MPM, 100,000 users = 500 MPM, 1,000,000 users = 5,000 MPM, etc.
- SPECmail2001 requires the reported throughput number (MPM) to meet the following Quality of Service (QoS)
criteria:
- for each mail operation, 95% of all response times recorded must be under 5 seconds
- 95% of all messages transferred to/from the mail server must transfer at a minimum
rate of half the modem speed plus 5 seconds
- 95% of all messages sent to remote users must be received by the remote server during
the measurement period
- 95% of all messages to local users must be delivered in 60 seconds
- the mail server can produce an error rate of no more than 1% during the measurement period
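Here is a minimal sketch of the MPM formula and the user-count conversion given above. The per-user message count and peak-hour percentage used in the example are placeholders, not the benchmark's defined workload parameters; the one-MPM-equals-200-users conversion is the one stated above:

    def messages_per_minute(users, messages_per_user_per_day, peak_hour_fraction):
        # MPM formula from the list above:
        # (messages sent per day * number of users * peak hour percentage) / 60 minutes.
        return (messages_per_user_per_day * users) * peak_hour_fraction / 60

    def users_from_mpm(mpm):
        # Documented conversion: one MPM equals 200 SPECmail users.
        return mpm * 200

    # Placeholder workload values, chosen only so the example is self-consistent
    # with the documented conversion (10,000 users = 50 MPM).
    print(messages_per_minute(10_000, messages_per_user_per_day=3, peak_hour_fraction=0.10))  # 50.0
    print(users_from_mpm(50))   # 10000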
-
The first release of SPECmail2001 does not support Internet Message Access
Protocol (IMAP) or Webmail (email accessible via a browser). The plan is to
include IMAP in a future SPEC mail server benchmark. Currently, there are no
plans to address Webmail.
-
SPECmail2001 cannot be used to size a mail server configuration, because it
is based on a specific workload. There are numerous assumptions made about the
workload, which may or may not apply to other user models. SPECmail2001 is a
tool that provides a level playing field for comparing mail server products that
are targeted for an ISP POP consumer environment. Expert users of the tool can use
the benchmark for internal stress testing, with the understanding that the test
results are for internal use only.
-
SPECmail2001 documentation consists of four documents: User Guide,
Run Rules, Architecture White Paper and the FAQ. The documents can be
found at http://www.spec.org/mail2001/.
-
SPEC SFS 3.0 (SFS97_R1) is the latest version of the Standard Performance Evaluation Corp.'s benchmark that measures NFS file server throughput and response time. It provides a standardized method for comparing performance across different vendor platforms. This is an incremental release based upon the design of SFS 2.0 (SFS97); it addresses several critical problems uncovered in that release, as well as several tools issues, and includes revisions to the run and reporting rules.
-
Yes. SFS 2.0 was withdrawn by SPEC in June 2001 because many of its results
could not be compared accurately. In particular, the set of distinct data
files - called the "working set" - accessed by the SFS 2.0 benchmark
was often smaller than its designers intended. Also, the distribution of accesses
among files was sensitive to the number of load-generating processes specified by
the tester. Technical analysis of these defects is available online at
http://www.spec.org/sfs97/sfs97_defects.html.
In addition to correcting problems, SPEC SFS 3.0 includes better validation of servers
and clients, reduced client memory requirements, portability to Linux and FreeBSD clients,
new submission tools, and revised documentation. Further details about these changes can be
found in the SPEC SFS 3.0 User's Guide included with the software.
-
The results for SFS 3.0 are not comparable to results from SFS 2.0 or SFS 1.1. SFS 3.0 contains changes in the working-set selection algorithm that fix errors present in the previous versions. The selection algorithm in SFS 3.0 accurately enforces the originally defined working set for SFS 2.0. Also, enhancements to the workload mechanism improve the benchmark's ability to maintain a more even load on the SUT during the benchmark. These enhancements affect the workload and the results. Results from SFS 3.0 should only be compared with other results from SFS 3.0.
-
Documentation and submitted results may be found at
http://www.spec.org/sfs97r1/.
-
SPECweb99 is a software benchmark product
developed by the Standard Performance Evaluation Corporation (SPEC), a
non-profit group of computer vendors, systems integrators, universities,
research organizations, publishers and consultants. It is designed to
measure a system's ability to act as a web server for static and dynamic
pages.
SPECweb99 is the successor to SPECweb96, and continues the tradition
of giving Web users the most objective, most representative benchmark for
measuring web server performance. SPECweb99 disclosures are governed by
an extensive set of run rules
to ensure fairness of results.
The benchmark runs a multi-threaded HTTP load generator on a number
of driving "client" systems that will do static and dynamic GETs
of a variety of pages from, and also do POSTs to, the SUT (System Under
Test).
SPECweb99 provides the source code for an HTTP 1.0/1.1 load generator
that will make random selections from a predetermined distribution. The
benchmark defines a particular set of files to be used as the static files
that will be obtained by GETs from the server, thus defining a particular
benchmark workload.
The benchmark does not provide any of the web server software. That
is left up to the tester. Any web server software that supports HTTP 1.0
and/or HTTP 1.1 can be used. However, it should be noted that variations
in implementations may lead to differences in observed performance.
To make a run of the benchmark, the tester must first set up one or
more networks connecting a number of the driving "clients" to
the server under test. The benchmark code is distributed to each of the
drivers and the necessary fileset is created for the server. Then a test
control file is configured for the specific test conditions and the benchmark
is invoked with that control file.
SPECweb99 is a generalized test, but it does make a good effort at stressing
the most basic functions of a web server in a manner that has been standardized
so that cross-comparisons are meaningful across similar test configurations.
-
This is a minor release that fixes several issues found in the client test harness after the initial release. The code changes are confined to the module HTTP/HT.c and the manager. These changes include:
- Change the code so that the 30% of requests issued that don't require the use of Keep-Alive or Persistent connections use the HTTP 1.0 protocol. This will result in the server closing the connection and accruing the TCP TIME_WAIT (as sketched after this list).
- Ensure that for all HTTP 1.0 requests that are keep_alive, the Connection: Keep-Alive header is included so that they better parallel HTTP 1.1 persistent connections.
- Ensure that HTTP 1.0 responses include the Connection: Keep-Alive header if the request included the Keep-Alive header, and close the connection if the header was not present.
- Ensure that the correct version information is in both HT.c and the manager so that the manager can detect that the updated Release 1.01 sources are in use.
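The request-side intent of these changes can be sketched as follows. This is not the benchmark's actual HT.c code; the function names and the 70/30 split parameter are illustrative, based only on the description above:

    import random

    def build_http10_request(uri, keep_alive_fraction=0.70):
        # Roughly 70% of requests ask for a persistent connection; on HTTP 1.0
        # they must carry an explicit "Connection: Keep-Alive" header so that
        # they parallel HTTP 1.1 persistent connections. The remaining 30% are
        # plain HTTP 1.0 requests, which the server closes (accruing TCP TIME_WAIT).
        headers = {}
        if random.random() < keep_alive_fraction:
            headers["Connection"] = "Keep-Alive"
        return f"GET {uri} HTTP/1.0", headers

    def server_should_close(request_headers):
        # Server-side rule from the list above: keep an HTTP 1.0 connection open
        # only if the request carried the Keep-Alive header.
        return request_headers.get("Connection", "").lower() != "keep-alive"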
Minor documentation edits have been made to clarify the benchmark's operation in light of the changes shown above, as well as to make a few corrective edits that were missed in the initial document review. Users should review these modified sections:
- Run and Reporting Rules: 2.1.1 Protocols
- User_Guide: 3.1.1 Changeable Benchmark parameters - HTTP_PROTOCOL
- SPECweb99 Design Document: 5.0 Keep-Alive/Persistent Connection Requests
SPEC requires that all SPECweb99 results published after Nov 1, 1999 use the new Release 1.01. All licensees of SPECweb99 will be provided with an update kit that includes the source for HTTP/HT.c and all updated documentation.
-
SPECweb99 measures the maximum number of simultaneous connections, requesting the
predefined benchmark workload, that a web
server is able to support while still meeting specific throughput and error
rate requirements. The connections are made and sustained at a specified
maximum bit rate with a maximum segment size intended to more realistically
model conditions that will be seen on the Internet during the lifetime
of this benchmark.
-
Some issues may have been raised about the benchmark since it was released. We maintain a SPECweb99 issues repository. If your issue is not among the known issues, please bring it to SPEC's attention.
-
The SPECweb99 Design Overview contains design information
on the benchmark and workload. The Run and Reporting Rules and
the User Guide with instructions for installing and running the
benchmark are also available. See:
http://www.spec.org/web99
for the available information on SPECweb99.
-
SPECweb99_SSL is a software benchmark product developed by the Standard Performance
Evaluation Corporation (SPEC), a non-profit group of computer vendors, systems integrators,
universities, research organizations, publishers and consultants. It is designed to measure
a system's ability to act as a secure web server for static and dynamic pages, testing secure web server performance using HTTP over the Secure Sockets Layer protocol (HTTPS). The benchmark is built upon the SPECweb99 test
harness and uses the same workload and file set (see:
http://www.spec.org/web99/).
SPECweb99_SSL continues the tradition of giving Web users the most objective, most representative
benchmark for measuring secure web server performance. SPECweb99_SSL disclosures are governed by an
extensive set of run rules to ensure fairness of results.
The benchmark runs a multi-threaded HTTPS load generator on a number of driving "client" systems
that will do static and dynamic GETs of a variety of pages from, and also do POSTs to, the SUT (System
Under Test) over SSL.
SPECweb99_SSL provides the source code for an HTTP 1.0/1.1 over SSL (Secure Socket Layer Protocol) load
generator that will make random selections from a predetermined distribution. The benchmark defines a
particular set of files to be used as the static files that will be obtained by GETs from the server,
thus defining a particular benchmark workload.
The benchmark does not provide any of the web server software. That is left up to the tester. Any web
server software that supports HTTP 1.0 and/or HTTP 1.1 over SSL (HTTPS) can be used. However, it should
be noted that variations in implementations may lead to differences in observed performance.
To make a run of the benchmark, the tester must first set up one or more networks connecting a number of
the driving "clients" to the server under test. The benchmark code is distributed to each of the
drivers and the necessary fileset is created for the server. Then a test control file is configured for the
specific test conditions and the benchmark is invoked with that control file.
SPECweb99_SSL is a generalized test, but it does make a good effort at stressing the most basic functions of
a secure web server in a manner that has been standardized so that cross-comparisons are meaningful across
similar test configurations.
-
SPECweb99_SSL measures the maximum number of simultaneous connections, requesting the
predefined benchmark workload, that a secure web server is able to support while still
meeting specific throughput and error rate requirements. The connections are made and
sustained at a specified maximum bit rate with a maximum segment size intended to more
realistically model conditions that will be seen on the Internet during the lifetime of
this benchmark.
-
No. Although the benchmarks have similarities, results from one cannot be compared
to the other, since SPECweb99_SSL adds SSL encryption to the SPECweb99 workload.
However for the same system, the differences between the SPECweb99 and SPECweb99_SSL
results could be used to help evaluate the performance impact of SSL encryption on
the server. Public comparisons of SPECweb99 and SPECweb99_SSL results would be
considered a violation of the run and reporting rules.
-
The SPECweb99_SSL Design Overview contains design information
on the benchmark and workload. The Run and Reporting Rules and
the User Guide with instructions for installing and running the
benchmark are also available. See:
http://www.spec.org/web99ssl
for the available information on SPECweb99_SSL.
-
Unfortunately, no -- and a cross-compiling environment, while necessary, would not be sufficient. The SPEC CPU benchmarks all require at least a POSIX interface to an I/O system over a hierarchical file system (each benchmark reads in its starting parameters, and all write out significant amounts of data for validation).
Admittedly, it would be great if there were something more than Dhrystone to measure embedded systems, but the problems of widely varying architectures, minimal process support resources, and even ROM-based memory speeds make any reasonable benchmarking of embedded systems a very difficult problem.
There is an industry consortium working on benchmarks for embedded processors. For further information, contact the EDN Embedded Microprocessor Benchmark Consortium.