SPEC OSG Frequently Asked Questions
Contents:
- OSG General
- Where can I get information about "SPEC" (not just "OSG")?
- Which system/CPU/component should I purchase?
- OSG Results
- Why can't I find a result for "XYZ"?
- What happened to the result for "XYZ"?
- How can I publish my benchmark results on the SPEC web site?
- What SPEC rating is required to support "XYZ" application?
- Why don't your results match what I see with my application?...
- How can I convert a SPECfp95 into MFLOPS?
- What is a non-compliant result?
- What does a designation of "CD" mean?
- What rules/guidelines should individuals or organizations observe when making public claims using SPEC/OSG benchmark results?
- What's going on with the #CPU and #Processors fields in various SPEC benchmarks?
- OSG CPU2000
- What is SPEC CPU2000?
- What components do CINT2000 and CFP2000 measure?
- Is there a way to translate SPEC CPU95 results to SPEC CPU2000 results or vice versa?
- Where can I find more information about CPU2000?
- OSG jAppServer2002
- What is SPECjAppServer2002?
- How is SPECjAppServer2002 different than SPECjAppServer2001?
- What is the performance metric for SPECjAppServer2002?
- Other SPEC benchmarks do not have a price/performance metric. Can you explain why SPECjAppServer2002 has a price/performance metric?
- Can I compare SPECjAppServer2002 results with other results?
- Where can I go for more information?
- OSG jAppServer2001
- What is SPECjAppServer2001?
- Why are you releasing two incomparable benchmarks in the span of three months?
- What is the performance metric for SPECjAppServer2001?
- Other SPEC benchmarks do not have a price/performance metric. Can you explain why SPECjAppServer2001 has a price/performance metric?
- Can I compare SPECjAppServer2001 results with other results?
- Where can I go for more information?
- OSG JBB2000
- What is SPECjbb2000?
- What specific aspects of performance does SPECjbb2000 measure?
- What metrics does SPECjbb2000 use to report performance?
- Do you provide source code for the benchmark?
- Where can I find more information about JBB2000?
- OSG JVM98
- What is SPECjvm98?
- What specific aspects of performance does SPECjvm98 measure?
- What programs make up the JVM98 test suite?
- Which JVM98 tests represent real applications?
- Where can I find more information about JVM98?
- OSG Mail2001
- What is SPECmail2001?
- What is the performance metric for SPECmail2001?
- What are the limitations of SPECmail2001?
- Can I use SPECmail2001 to determine the size of the mail server I need?
- Where can I find more information about Mail2001?
- OSG SFS97_R1
- What is SPEC SFS97_R1?
- Does this benchmark replace the SPEC SFS 2.0 suite?
- Can SPEC SFS 3.0 results be compared with earlier results?
- Where can I find more information about SFS97_R1?
- OSG Web99
- What is SPECweb99?
- What is SPECweb99 Release 1.01?
- What does SPECweb99 measure?
- What if I have a problem building/running the SPECweb99 benchmark?
- Where can I find more information about Web99?
- OSG Web99_SSL
- What is SPECweb99_SSL?
- What does SPECweb99_SSL measure?
- Can SPECweb99_SSL results be meaningfully compared to SPECweb99 results?
- Where can I find more information about Web99_SSL?
- OSG Embedded
- Does SPEC/OSG have any benchmarks for embedded systems?
-
Basic information about SPEC can be found in the general SPEC area.
In particular, there is a SPEC FAQ that answers questions about SPEC as an organization, as opposed to questions about OSG's benchmarks.
-
Unfortunately, SPEC does not make recommendations on computer purchases.
SPEC will provide what data it can (all available from the results index) but cannot offer opinions on it. Without knowing a great deal about your exact workload, it would be difficult to provide any useful specific advice, and providing general advice could get us all into legal trouble.
-
The results that are available on this web site are ones that have been
submitted to SPEC.
SPEC does not run any benchmarks itself; we develop the benchmarks,
we review and publish results submitted to us, but we do not have
control over which companies participate nor which systems or
configurations are submitted.
Many companies quote SPEC results. The licensing of the SPEC benchmark products allows companies to quote and publish results on their own, so long as they adhere to all of the run and reporting rules for that benchmark. There is no requirement that these results be submitted to SPEC.
However, only those results available on this web site have been reviewed by SPEC. Any result not available here has not been reviewed.
SPEC cannot speak for the validity of any result published elsewhere; the correctness and completeness of any such result is the responsibility of the licensee who made the claim.
If there are results that you would like to see here, contact the
company responsible and suggest that they submit results to SPEC.
SPEC cannot require a result to be reviewed by us, but many vendors
will submit if there is sufficient end-user demand.
-
It is possible that a result that was available a short while ago
may no longer appear in SPEC's listings.
It is SPEC's policy to be an archive for all results published by SPEC;
however, on the rare occasions when an issue with rules compliance is
discovered after publication, SPEC may determine that the affected result(s)
need to be withdrawn.
Note that lack of compliance can be due to several factors (e.g., oversight, unexpected benchmark behavior).
If it is discovered that a result may not be in full compliance with the
necessary Run and Reporting Rules for that benchmark, the issue is brought
before the appropriate technical committee. That technical committee reviews
the issues related to compliance and makes a recommendation to the proper
steering committee, which may then choose to have the result(s) withdrawn.
If a result is to be withdrawn, it is removed from all of the affected listings
and the full disclosures are replaced with files explaining that this result
has been withdrawn because of compliance issues.
It is expected that the test sponsors will submit new results in the near future, once they have resolved these issues.
Thankfully, this is not a common occurrence.
-
Benchmark licensees may submit results for publication on the SPEC web site. Please review the general guidelines and the review calendar; specific information on how to submit results can be found in each benchmark's run and reporting rules and user guide documentation.
-
The methodology of determining the computer systems needed to support a
particular workload is capacity planning. This is a different, though related, field from comparative benchmarking. Capacity planning requires the
adaptation of a model to fit the behavior of the desired workload for
accurate projections. Comparative benchmarking requires the explicit
standardization of a benchmark workload (or model) for accurate
comparisons. These different directions can lead to contradictory needs.
Good capacity planning models require a very good understanding of the
aspects of your workload. It is important to understand that every workload
is unique to some extent; there may be commonalities between similar tasks
and applications, but there are often significant differences in behavior
from minor differences in usage.
Some basic questions to understand before making capacity projections:
- Are you commonly CPU bound? I/O bound? Memory constrained?
- What are the I/O requirements?
- What kind of system services are used, and how often?
- Does the behavior remain consistent as the workload size is increased?
Once you understand these kinds of issues with your own applications and workloads, you can seek out available data that can be used as input to your model. SPEC benchmark results might be useful components for some parts of a capacity planning model.
For example, assume you have a UNIX system rated at 3.3 SPECint95. How can you test your application to determine at what point you will need more compute capacity, and at what point you will need to move to a system with a higher SPECint95 rating?
Normally, SPECint95 alone would not be sufficient. You might, for instance, run out of memory, I/O capacity, or network capacity before your application ran out of CPU capacity.
You can use SPECint95 to determine rough bounds on the problem. For example, if your 3.3 SPECint95 system supports an average of 4 users with response time you consider adequate, and you plan to grow to an average of 8 users, then you might conclude that you will need something on the order of 6.6 SPECint95.
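The rough-bound arithmetic above is simply a linear scaling of the benchmark rating by the expected load growth. Here is a minimal sketch of that calculation; the function name is made up for illustration, and the 3.3 rating and user counts are just the example values from the paragraph above:

    def rough_specint_bound(current_rating, current_users, planned_users):
        # Linearly scale the benchmark rating by expected user growth.
        # This is only a rough bound: memory, I/O, or network capacity
        # may run out before CPU capacity does.
        return current_rating * (planned_users / current_users)

    # Example from the text: a 3.3 SPECint95 system serving an average of
    # 4 users, planned to grow to an average of 8 users.
    print(rough_specint_bound(3.3, 4, 8))   # prints 6.6

Keep in mind that this kind of scaling assumes the workload's behavior stays consistent as it grows, which is one of the basic questions listed above.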
Often the biggest challenge in capacity planning is uncertainty in future
workloads. You may carefully plan for growth of your existing workload,
correlate your applications with individual benchmark measures, do analytic
or simulation modeling to account for non-uniform workload demand, and
still miss the mark. The reason? Some new application comes along, or you
discover a new use for an existing application that becomes indispensable to your business but consumes more computer resources than you planned for.
Most vendors offer capacity planning services and information, often
focused primarily on their own systems. You can ask your vendors' sales representatives about this. There are also commercial products and
consulting firms that can assist you in this work. A web search
turns up lots of books on capacity planning. User groups are also good
sources of capacity planning information.
-
Hmmmm... the best benchmark is always your own application.
The SPEC CPU benchmarks do provide a comparative measure of performance
of the processor, memory, and compiler on a known workload,
but they may not represent your particular situation.
Before drawing conclusions, there are additional questions I would investigate:
- What exactly are your hardware configurations, and are they the same as those reported on the SPEC result page (size of memory, size of caches, etc.)?
- Is the number of processors in your system the same as reported on the SPEC pages (remember that some vendors report results using parallelizing compilers for FP tasks on MP machines, etc.)?
- What exactly are the versions of the OS and compiler, and are they the same (particularly the version number) as those reported on the SPEC page?
- Do you use a preprocessor where the SPEC result uses a preprocessor?
- What optimization flags were used for your application vs. those used for the SPEC benchmarks? Did you use feedback-directed optimization? Did you use a parallelizing compiler?
- What are the characteristics of your workload? Programming language? Memory usage?
It may turn out to be the case that your workload is nothing like SPEC's.
SPEC considers the programs in CPU95 to be "every day" compute-intensive applications in the technical arena (by nature, the FP benchmarks may be more field-specific than the integer benchmarks; e.g., gcc, perl, and compress are probably more commonly used than tomcatv and su2cor).
If you have application source code that you feel represents the type of work
you do, would you be willing to donate it to SPEC
(and possibly work with SPEC to turn it into a benchmark)?
SPEC is doing research for
the next version of the CPU benchmarks,
and any input/assistance would be appreciated.
-
Short answer: SPEC does not have a conversion factor for SPECfp95 to "MFLOPS".
Longer answer: "MFLOPS" can be a misleading term; for example:
- what workload is being referred to?
- under what conditions (peak, theoretical, sustained, average, etc.)?
- what counts as a floating point operation (loads, stores, intermediate steps, etc.)?
SPEC does not even calculate or provide FLOP information for the CFP95 codes. Remember, these are full applications, not FP kernel loops; thus there are wide variations in the FLOP count during a run of one of these applications. Overall averages would badly underestimate, sampling would lead to unstable predictions, and the peak rates may or may not be sustainable in other situations.
Because of the vagueness of the term,
SPEC does not encourage users to try to convert measurements to these units.
Instead, SPEC provides points of comparison of a system
(CPU, memory, compiler, etc.) on known, standardized workloads.
-
The SPEC OSG benchmarks are governed by Run and Reporting Rules and a SPEC license agreement. Results generated in a manner that is not consistent with the policies in those documents are considered non-compliant results. Non-compliant results are not comparable to results generated in accordance with the Run and Reporting Rules and the license agreement.
In some cases an issue of compliance may not be discovered until after a result has already been made public on SPEC's web site. If it is discovered that a non-compliant result has been published on the SPEC web site, the following occurs:
- the numeric values/graphs on the web pages for the result are listed as "NC" for non-compliant.
- the web pages are annotated with a statement declaring that the result was found to be non-compliant and the reason for that declaration.
- Additional information (for example, a pointer to a result that addresses the non-compliance) may also be part of that annotation.
If it is discovered that a non-compliant result has been published someplace other than the SPEC web site, SPEC will take action appropriate for enforcing the applicable SPEC license agreement. Suspected issues with non-compliant results may be brought to SPEC's attention by contacting SPEC at info@spec.org.
-
"CD" stands for "code defect".
Results marked with a "CD" designation are not
comparable to any other results for that benchmark due to
identified issues within the benchmark code itself. If
results using this code have been published on the SPEC web
site, the following occurs:
- The numeric values/graphs on the web pages for the result are
listed as "CD" for "code defect".
- The web pages are annotated with a statement declaring that the
result was generated with the defective code.
- Additional information (for example, a pointer to a result
generated with updated code) may also be part of that annotation.
The "CD" designation does not imply any non-compliance with benchmark's
run and reporting rules.
-
Please see the SPEC/OSG Fair Use Rule
for the guidelines that must be observed when making public claims using
SPEC benchmark results.
-
In the interest of providing full disclosure information, SPEC's Open Systems
Group has adapted the #CPU field to contain additional information for systems
using multi-core and multi-threaded microprocessors. In this new format the
#CPUs field identifies:
# cores, # chips, # cores/chip [(CPU configuration details)]
where:
- The term "core" is used to identify the core set of architectural,
computational processing elements that provide the functionality of a CPU.
- The term "chip" identifies the actual microprocessor, the physical
package containing one or more "cores".
- If applicable, a comment may be appended to identify any additional CPU
configuration details such as enabling or disabling on-chip threading or the number
of on-chip threads present.
The intent of using these new terms is to recognize that the terms "CPU"
and "processor" have become overloaded terms as microprocessor architectures
have evolved. A simple integer value may no longer provide sufficient or consistent
accounting of the available processing units in a system.
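As a purely illustrative sketch (this helper and its example values are hypothetical and not part of any SPEC tool), the new field could be assembled like this:

    def format_cpu_field(cores, chips, details=None):
        # Build a "#CPUs" string in the "N cores, N chips, N cores/chip" format,
        # with an optional parenthesized CPU-configuration comment.
        field = f"{cores} cores, {chips} chips, {cores // chips} cores/chip"
        if details:
            field += f" ({details})"
        return field

    # Hypothetical system: 8 dual-core chips with on-chip threading disabled.
    print(format_cpu_field(16, 8, "on-chip threading disabled"))
    # prints: 16 cores, 8 chips, 2 cores/chip (on-chip threading disabled)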
A few recent results are already using the new format and previously
published results for multi-core and multi-threaded processors are in the
process of being converted to the new format by their submitters. SPEC's
search and sorting tools will also be updated to support this new format.
This style of reporting will apply to all currently released SPEC OSG
benchmarks. We expect that in future benchmarks, such as the next CPU suite,
we will have additional disclosure categories that will better describe current
and future microprocessor architectures and their performance critical features.
-
SPEC CPU2000 is a software benchmark product produced by the Standard
Performance Evaluation Corp. (SPEC), a non-profit group that includes
computer vendors, systems integrators, universities, research organizations,
publishers and consultants from around the world. It is designed to provide
performance measurements that can be used to compare compute-intensive workloads
on different computer systems.
SPEC CPU2000 contains two benchmark suites: CINT2000 for measuring and comparing
compute-intensive integer performance, and CFP2000 for measuring and comparing
compute-intensive floating point performance.
-
Being compute-intensive benchmarks, they measure performance
of the computer's processor, memory architecture and compiler.
It is important to remember the contribution of the latter two
components -- performance is more than just the processor.
-
There is no formula for converting CPU95 results to CPU2000 results and vice versa; they
are different products. There probably will be some correlation between CPU95 and CPU2000
results (i.e., machines with higher CPU95 results often will have higher CPU2000 results),
but there is no universal formula for all systems.
SPEC strongly encourages SPEC licensees to publish CPU2000 numbers on older platforms to provide
a historical perspective on performance.
-
Information about CPU2000, including a FAQ, benchmark
descriptions, and user documentation, is available at
http://www.spec.org/cpu2000/.
-
SPECjAppServer2002 is an industry standard benchmark
designed to measure the performance of J2EE application
servers.
-
SPECjAppServer2001 adheres to the EJB 1.1 specification while
SPECjAppServer2002 adheres to the EJB 2.0 specification. There
are four main differences in the implementation of the
SPECjAppServer2002 benchmark:
- SPECjAppServer2002 has been converted to use the EJB 2.0
style CMP (Container Managed Persistence) entity beans.
- SPECjAppServer2002 takes advantage of the local interface
features in EJB 2.0.
- SPECjAppServer2002 utilizes CMR (Container Managed Relationships)
between the entity beans.
- SPECjAppServer2002 uses EJB-QL in the deployment descriptors.
-
SPECjAppServer2002 expresses performance in
terms of two metrics:
- TOPS (Total Operations Per Second), which is the number of order transactions plus the number of manufacturing work orders divided by the measurement period in seconds (see the sketch after this list).
- price/TOPS, which is the price of the System Under Test (including hardware, software, and support) divided by the TOPS.
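To make the arithmetic behind the two metrics concrete, here is a minimal sketch; the run data and system price below are invented for illustration and do not come from any published SPECjAppServer2002 result:

    def tops(order_transactions, manufacturing_work_orders, measurement_period_seconds):
        # Total Operations Per Second: order transactions plus manufacturing
        # work orders, divided by the measurement period in seconds.
        return (order_transactions + manufacturing_work_orders) / measurement_period_seconds

    def price_per_tops(system_price, tops_value):
        # Price of the System Under Test divided by TOPS.
        return system_price / tops_value

    # Invented example: 90,000 orders and 54,000 work orders over a
    # 1,800-second measurement period on a $200,000 configuration.
    t = tops(90_000, 54_000, 1_800)        # 80.0 TOPS
    print(t, price_per_tops(200_000, t))   # 80.0 TOPS at 2500.0 $/TOPS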
-
SPECjAppServer2002 traces its lineage to ECperf, which was developed under the JCP process. SPEC committees debated the inclusion of this metric for the SPECjAppServer2001 and SPECjAppServer2002 benchmarks. It was decided that SPEC would do this on an experimental basis, that the experiment would expire at the conclusion of the review cycle ending 5/03/2003, and that SPECjAppServer2002 would be retired. The cost/benefit and success of this experiment will determine whether this benchmark will be continued.
-
SPECjAppServer2002 results may not be compared with SPECjAppServer2001
results, as there are different optimization opportunities and constraints
in the two EJB specifications (SPECjAppServer2001 adheres to the EJB 1.1
specification while SPECjAppServer2002 adheres to the EJB 2.0 specification).
Additionally, SPECjAppServer2002 results cannot be compared with other
benchmark results, such as ECperf 1.0, ECperf 1.1, TPC-C,
TPC-W, or other SPEC benchmarks. See the
SPECjAppServer2002 FAQ
for further explanation.
-
SPECjAppServer2002 documentation consists mainly of four documents:
User Guide,
Design Document,
Run and Reporting Rules,
and the FAQ. The documents
can be found in the benchmark kit or at
http://www.spec.org/jAppServer2001/.
-
SPECjAppServer2001 is an industry standard benchmark designed to
measure the performance of J2EE application servers. This benchmark
was derived from ECperf
which was developed under the
Java Community Process (JCP).
-
The two benchmarks will be very similar except for the EJB specification
used. While the EJB 2.0 specification is complete, there are vendors who are not
able to publish benchmark results using the EJB 2.0 specification yet. There are
also vendors who will not be able to publish benchmark results using the EJB 1.1
specification. To allow vendors in either situation to publish results it was decided
to release two benchmarks, one supporting each specification. The two benchmarks are incomparable because there are different optimization opportunities and constraints in the two EJB specifications.
-
SPECjAppServer2001 expresses performance in
terms of two metrics:
- BOPS (Business Operations Per Second), which is the number of order transactions plus the number of manufacturing work orders divided by the measurement period in seconds.
- price/BOPS, which is the price of the System Under Test (including hardware, software, and support) divided by the BOPS.
-
SPECjAppServer2001 traces its lineage to ECperf, which was developed under the JCP process. SPEC committees debated the inclusion of this metric for the SPECjAppServer2001 benchmark. It was decided that SPEC would do this on an experimental basis, that the experiment would expire at the conclusion of the review cycle ending 5/03/2003, and that SPECjAppServer2001 would be retired. The cost/benefit and success of this experiment will determine whether this benchmark will be continued.
-
SPECjAppServer2001 results cannot be compared with other
benchmark results, such as ECperf 1.0, ECperf 1.1, TPC-C,
TPC-W, or other SPEC benchmarks. See the
SPECjAppServer2001 FAQ
for further explanation.
-
SPECjAppServer2001 documentation consists mainly of four documents:
User
Guide,
Design Document, Run and Reporting Rules, and
the FAQ. The documents can be found in the
benchmark kit or at http://www.spec.org/jAppServer2001/.
-
SPECjbb2000 is a Java program emulating a 3-tier system with emphasis on the
middle tier. Random input selection represents the 1st tier user interface.
SPECjbb2000 fully implements the middle tier business logic. The 3rd tier
database is replaced by binary trees.
SPECjbb2000 is inspired by the TPC-C benchmark and loosely follows the TPC-C
specification for its schema, input generation, and transaction profile.
SPECjbb2000 replaces database tables with Java classes and replaces data records
with Java objects. The objects are held by either binary trees (also Java objects)
or other data objects.
SPECjbb2000 runs in a single JVM in which threads represent terminals in a warehouse.
Each thread independently generates random input (tier 1 emulation) before calling
transaction-specific business logic. The business logic operates on the data held in
the binary trees (tier 3 emulation). The benchmark does no disk I/O or network I/O.
-
SPECjbb2000 measures the performance of the Java Virtual Machine (JVM) implementation, the just-in-time (JIT) compiler, garbage collection, threads, and some aspects of the operating system. It also measures the performance of CPUs, caches, the memory hierarchy, and the scalability of shared-memory multiprocessor (SMP) platforms on the specified commercial workload.
-
SPECjbb2000 ops/second is a composite throughput measurement representing
the averaged throughput over a range of points. It is described in detail
in the document "SPECjbb2000 Run
and Reporting Rules."
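As a rough illustration of what "averaged throughput over a range of points" means, here is a sketch; the per-point throughputs and the averaging range below are invented, and the actual range used for the metric is defined in the Run and Reporting Rules:

    def composite_ops_per_second(throughput_by_point, averaging_points):
        # Average the measured ops/second over the points included in the metric.
        selected = [throughput_by_point[p] for p in averaging_points]
        return sum(selected) / len(selected)

    # Invented per-point throughputs (ops/second), keyed by warehouse count.
    throughput = {1: 4_000, 2: 7_500, 3: 10_200, 4: 9_800, 5: 9_100, 6: 8_400}
    # Invented averaging range; see the Run and Reporting Rules for the real one.
    print(composite_ops_per_second(throughput, averaging_points=[3, 4, 5, 6]))  # 9375.0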
-
Yes, but you are required to run with the jar files provided
with the benchmark. Recompilation is forbidden in the run rules
and will invalidate your results.
-
Documentation, the design document, a FAQ page, and submitted results may all be found at http://www.spec.org/jbb2000/.
-
SPECjvm98 is a benchmark suite that measures performance for Java
virtual machine (JVM) client platforms. It contains eight different tests,
five of which are real applications or are derived from real applications.
Seven tests are used for computing performance metrics. One test validates
some of the features of Java, such as testing for loop bounds.
-
SPECjvm98 measures the time it takes to load the program, verify the
class files, compile on the fly if a just-in-time (JIT) compiler is used,
and execute the test. From the software perspective, these tests measure
the efficiency of JVM, JIT compiler and operating system implementations on
a given hardware platform. From the hardware perspective, the benchmark
measures CPU (integer and floating-point), cache, memory, and other
platform-specific hardware performance.
-
The following eight programs make up the JVM98 test suite:
- _200_check - checks JVM and Java features
- _201_compress - a popular utility used to compress/uncompress files
- _202_jess - a Java expert system shell
- _209_db - a small data management program
- _213_javac - the Java compiler, compiling 225,000 lines of code
- _222_mpegaudio - an MPEG-3 audio stream decoder
- _227_mtrt - a dual-threaded program that ray traces an image file
- _228_jack - a parser generator with lexical analysis
-
- _202_jess - a Java version of NASA's popular CLIPS rule-based expert
system; it is distributed freely by Sandia National Labs at
http://herzberg.ca.sandia.gov/jess/
- _201_compress - a Java version of the LZW file compression utilities in
wide distribution as freeware.
- _222_mpegaudio - an MPEG-3 audio stream decoder from Fraunhofer Institut
fuer Integrierte Schaltungen, a leading international research lab involved
in multimedia standards. More information is available at
http://www.iis.fhg.de/audio
- _228_jack - a parser generator from Sun Microsystems, now named the Java
Compiler Compiler; it is distributed freely at:
http://www.suntest.com/JavaCC/
- _213_javac - a Java compiler from Sun Microsystems that is distributed
freely with the Java Development Kit at:
http://java.sun.com/products
-
Documentation, a FAQ page, and submitted results may be found at
http://www.spec.org/jvm98/.
-
SPECmail2001 is an industry standard benchmark designed to measure a system's ability
to act as a mail server compliant with the Internet standards Simple Mail Transfer
Protocol (SMTP) and Post Office Protocol Version 3 (POP3). The benchmark models consumer
users of an Internet Service Provider (ISP) by simulating a real world workload. The goal
of SPECmail2001 is to enable objective comparisons of mail server products.
-
SPECmail2001 expresses performance in terms of SPECmail2001 messages
per minute (MPM). For example:
- Messages per minute = ((messages sent per day * number of users) * peak hour percentage) / 60 minutes (see the sketch after this list).
- MPM can be translated into a user count as follows: one MPM equals 200 SPECmail users. For example: 1,000 users = 5 MPM, 10,000 users = 50 MPM, 100,000 users = 500 MPM, 1,000,000 users = 5,000 MPM, etc.
- SPECmail2001 requires the reported throughput number (MPM) to meet the following Quality of Service (QoS)
criteria:
- for each mail operation, 95% of all response times recorded must be under 5 seconds
- 95% of all messages transferred to/from the mail server must transfer at a minimum
rate of half the modem speed plus 5 seconds
- 95% of all messages sent to remote users must be received by the remote server during
the measurement period
- 95% of all messages to local users must be delivered in 60 seconds
- the mail server can produce an error rate of no more than 1% during the measurement period
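Here is a minimal sketch of the MPM formula and the user-count conversion given above. The per-user message count and peak-hour percentage used in the example are placeholders, not the benchmark's defined workload parameters; the one-MPM-equals-200-users conversion is the one stated above:

    def messages_per_minute(users, messages_per_user_per_day, peak_hour_fraction):
        # MPM formula from the list above:
        # (messages sent per day * number of users * peak hour percentage) / 60 minutes.
        return (messages_per_user_per_day * users) * peak_hour_fraction / 60

    def users_from_mpm(mpm):
        # Documented conversion: one MPM equals 200 SPECmail users.
        return mpm * 200

    # Placeholder workload values, chosen only so the example is self-consistent
    # with the documented conversion (10,000 users = 50 MPM).
    print(messages_per_minute(10_000, messages_per_user_per_day=3, peak_hour_fraction=0.10))  # 50.0
    print(users_from_mpm(50))   # 10000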
-
The first release of SPECmail2001 does not support Internet Message Access
Protocol (IMAP) or Webmail (email accessible via a browser). The plan is to
include IMAP in a future SPEC mail server benchmark. Currently, there are no
plans to address Webmail.
-
SPECmail2001 cannot be used to size a mail server configuration, because it
is based on a specific workload. There are numerous assumptions made about the
workload, which may or may not apply to other user models. SPECmail2001 is a
tool that provides a level playing field for comparing mail server products that
are targeted for an ISP POP consumer environment. Expert users of the tool can use
the benchmark for internal stress testing, with the understanding that the test
results are for internal use only.
-
SPECmail2001 documentation consists of four documents: User Guide,
Run Rules, Architecture White Paper and the FAQ. The documents can be
found at http://www.spec.org/mail2001/.
-
SPEC SFS 3.0 (SFS97_R1) is the latest version of the Standard Performance Evaluation Corp.'s benchmark that measures NFS file server throughput and response time. It provides a standardized method for comparing performance across different vendor platforms. This is an incremental release based upon the design of SFS 2.0 (SFS97); it addresses several critical problems uncovered in that release, as well as several tools issues, and includes revisions to the run and reporting rules.
-
Yes. SFS 2.0 was withdrawn by SPEC in June 2001 because many of its results
could not be compared accurately. In particular, the set of distinct data
files - called the "working set" - accessed by the SFS 2.0 benchmark
was often smaller than its designers intended. Also, the distribution of accesses
among files was sensitive to the number of load-generating processes specified by
the tester. Technical analysis of these defects is available online at
http://www.spec.org/sfs97/sfs97_defects.html.
In addition to correcting problems, SPEC SFS 3.0 includes better validation of servers
and clients, reduced client memory requirements, portability to Linux and FreeBSD clients,
new submission tools, and revised documentation. Further details about these changes can be
found in the SPEC SFS 3.0 User's Guide included with the software.
-
The results for SFS 3.0 are not comparable to results from SFS 2.0 or SFS 1.1. SFS 3.0 contains changes in the working-set selection algorithm that fix errors present in the previous versions. The selection algorithm in SFS 3.0 accurately enforces the originally defined working set for SFS 2.0. Also, enhancements to the workload mechanism improve the benchmark's ability to maintain a more even load on the SUT during the benchmark. These enhancements affect the workload and the results. Results from SFS 3.0 should only be compared with other results from SFS 3.0.
-
Documentation and submitted results may be found at
http://www.spec.org/sfs97r1/.
-
SPECweb99 is a software benchmark product
developed by the Standard Performance Evaluation Corporation (SPEC), a
non-profit group of computer vendors, systems integrators, universities,
research organizations, publishers and consultants. It is designed to
measure a system's ability to act as a web server for static and dynamic
pages.
SPECweb99 is the successor to SPECweb96, and continues the tradition
of giving Web users the most objective, most representative benchmark for
measuring web server performance. SPECweb99 disclosures are governed by
an extensive set of run rules
to ensure fairness of results.
The benchmark runs a multi-threaded HTTP load generator on a number
of driving "client" systems that will do static and dynamic GETs
of a variety of pages from, and also do POSTs to, the SUT (System Under
Test).
SPECweb99 provides the source code for an HTTP 1.0/1.1 load generator
that will make random selections from a predetermined distribution. The
benchmark defines a particular set of files to be used as the static files
that will be obtained by GETs from the server, thus defining a particular
benchmark workload.
The benchmark does not provide any of the web server software. That
is left up to the tester. Any web server software that supports HTTP 1.0
and/or HTTP 1.1 can be used. However, it should be noted that variations
in implementations may lead to differences in observed performance.
To make a run of the benchmark, the tester must first set up one or
more networks connecting a number of the driving "clients" to
the server under test. The benchmark code is distributed to each of the
drivers and the necessary fileset is created for the server. Then a test
control file is configured for the specific test conditions and the benchmark
is invoked with that control file.
SPECweb99 is a generalized test, but it does make a good effort at stressing
the most basic functions of a web server in a manner that has been standardized
so that cross-comparisons are meaningful across similar test configurations.
-
This is a minor release that fixes several issues found in the client test harness after the initial release. The code changes are confined to the module HTTP/HT.c and the manager. These changes include:
- Change the code so that the 30% of requests issued that don't require the use of Keep-Alive or Persistent connections use the HTTP 1.0 protocol. This will result in the server closing the connection and accruing the TCP TIME_WAIT (as sketched after this list).
- Ensure that for all HTTP 1.0 requests that are keep_alive, the Connection: Keep-Alive header is included so that they better parallel HTTP 1.1 persistent connections.
- Ensure that HTTP 1.0 responses include the Connection: Keep-Alive header if the request included the Keep-Alive header, and close the connection if the header was not present.
- Ensure that the correct version information is in both HT.c and the manager so that the manager can detect that the updated Release 1.01 sources are in use.
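The request-side intent of these changes can be sketched as follows. This is not the benchmark's actual HT.c code; the function names and the 70/30 split parameter are illustrative, based only on the description above:

    import random

    def build_http10_request(uri, keep_alive_fraction=0.70):
        # Roughly 70% of requests ask for a persistent connection; on HTTP 1.0
        # they must carry an explicit "Connection: Keep-Alive" header so that
        # they parallel HTTP 1.1 persistent connections. The remaining 30% are
        # plain HTTP 1.0 requests, which the server closes (accruing TCP TIME_WAIT).
        headers = {}
        if random.random() < keep_alive_fraction:
            headers["Connection"] = "Keep-Alive"
        return f"GET {uri} HTTP/1.0", headers

    def server_should_close(request_headers):
        # Server-side rule from the list above: keep an HTTP 1.0 connection open
        # only if the request carried the Keep-Alive header.
        return request_headers.get("Connection", "").lower() != "keep-alive"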
Minor documentation edits have been made to clarify the benchmark's operation in light of the changes shown above, as well as to make a few corrective edits that were missed in the initial document review. Users should review these modified sections:
- Run and Reporting Rules: 2.1.1 Protocols
- User_Guide: 3.1.1 Changeable Benchmark parameters - HTTP_PROTOCOL
- SPECweb99 Design Document: 5.0 Keep-Alive/Persistent Connection Requests
SPEC requires that all SPECweb99 results published after Nov 1, 1999 use the new Release 1.01. All licensees of SPECweb99 will be provided with an update kit that includes the source for HTTP/HT.c and all updated documentation.
-
SPECweb99 measures the maximum number of simultaneous connections, requesting the
predefined benchmark workload, that a web
server is able to support while still meeting specific throughput and error
rate requirements. The connections are made and sustained at a specified
maximum bit rate with a maximum segment size intended to more realistically
model conditions that will be seen on the Internet during the lifetime
of this benchmark.
-
Some issues may have been raised about the benchmark since it was released. We maintain a SPECweb99 issues repository. If your issue is not among the known issues, please bring it to SPEC's attention.
-
The SPECweb99 Design Overview contains design information
on the benchmark and workload. The Run and Reporting Rules and
the User Guide with instructions for installing and running the
benchmark are also available. See:
http://www.spec.org/web99
for the available information on SPECweb99.
-
SPECweb99_SSL is a software benchmark product developed by the Standard Performance
Evaluation Corporation (SPEC), a non-profit group of computer vendors, systems integrators,
universities, research organizations, publishers and consultants. It is designed to measure
a system's ability to act as a secure web server for static and dynamic pages, testing secure web server performance using HTTP over the Secure Sockets Layer protocol (HTTPS). The benchmark is built upon the SPECweb99 test
harness and uses the same workload and file set (see:
http://www.spec.org/web99/).
SPECweb99_SSL continues the tradition of giving Web users the most objective, most representative
benchmark for measuring secure web server performance. SPECweb99_SSL disclosures are governed by an
extensive set of run rules to ensure fairness of results.
The benchmark runs a multi-threaded HTTPS load generator on a number of driving "client" systems
that will do static and dynamic GETs of a variety of pages from, and also do POSTs to, the SUT (System
Under Test) over SSL.
SPECweb99_SSL provides the source code for an HTTP 1.0/1.1 over SSL (Secure Socket Layer Protocol) load
generator that will make random selections from a predetermined distribution. The benchmark defines a
particular set of files to be used as the static files that will be obtained by GETs from the server,
thus defining a particular benchmark workload.
The benchmark does not provide any of the web server software. That is left up to the tester. Any web
server software that supports HTTP 1.0 and/or HTTP 1.1 over SSL (HTTPS) can be used. However, it should
be noted that variations in implementations may lead to differences in observed performance.
To make a run of the benchmark, the tester must first set up one or more networks connecting a number of
the driving "clients" to the server under test. The benchmark code is distributed to each of the
drivers and the necessary fileset is created for the server. Then a test control file is configured for the
specific test conditions and the benchmark is invoked with that control file.
SPECweb99_SSL is a generalized test, but it does make a good effort at stressing the most basic functions of
a secure web server in a manner that has been standardized so that cross-comparisons are meaningful across
similar test configurations.
-
SPECweb99_SSL measures the maximum number of simultaneous connections, requesting the
predefined benchmark workload, that a secure web server is able to support while still
meeting specific throughput and error rate requirements. The connections are made and
sustained at a specified maximum bit rate with a maximum segment size intended to more
realistically model conditions that will be seen on the Internet during the lifetime of
this benchmark.
-
No. Although the benchmarks have similarities, results from one cannot be compared
to the other, since SPECweb99_SSL adds SSL encryption to the SPECweb99 workload.
However for the same system, the differences between the SPECweb99 and SPECweb99_SSL
results could be used to help evaluate the performance impact of SSL encryption on
the server. Public comparisons of SPECweb99 and SPECweb99_SSL results would be
considered a violation of the run and reporting rules.
-
The SPECweb99_SSL Design Overview contains design information
on the benchmark and workload. The Run and Reporting Rules and
the User Guide with instructions for installing and running the
benchmark are also available. See:
http://www.spec.org/web99ssl
for the available information on SPECweb99_SSL.
-
Unfortunately, no -- and a cross-compiling environment, while necessary, would not be sufficient. The SPEC CPU benchmarks all require at least a POSIX interface to an I/O system over a hierarchical file system (each benchmark reads in its starting parameters, and all write out significant amounts of data for validation).
Admittedly, it would be great if there were something more than Dhrystone to measure embedded systems, but the problems of widely varying architectures, minimal process support resources, and even ROM-based memory speeds make any reasonable benchmarking of embedded systems a very difficult problem.
There is an industry consortium working on benchmarks for embedded processors. For further information, contact the EDN Embedded Microprocessor Benchmark Consortium.