Benchmark Descriptions
DIGITAL Servers for Windows NT Family
This document is a companion piece to the
DIGITAL Servers for Windows NT Performance Flash.
It provides summary descriptions (written by the
sponsoring organization) and significance of the
industry- or de facto-standard benchmark results
for the following tests:
Application: Exchange LoadSim and Lotus
NotesBench
Compute & Throughput: AIM Server for
NT (Domain Server Mix, File Server Mix) and
AIM Suite VII for UNIX; Ziff-Davis (NetBench,
ServerBench, and WebBench); and the SPEC
Suite.
Database: Transaction Processing Performance Council
(TPC-C)
Metrics: Benchmarks can be used
for measurement in various ways:
Speed measurement: how fast does
this test run, based only on speed of
computation?
Throughput measurement: how does the
system do on an overall test score,
considering a representative mix of the
computational and I/O tasks involved in this
application?
Workload measurement: some benchmarks determine a
"high-water mark," the maximum workload
supported by this system for a particular
application or test.
Each benchmark offers different
metrics, capturing either processor compute speed
and/or maximum workload supported, or a
quantification of overall system throughput.
For each test, we offer a summary description
of the test (extracted from the website of the
sponsoring vendor or consortium), along with
commentary regarding DIGITAL's experience and
differentiation. URL hyperlinks (for Word97) are
provided to each sponsoring organization's
web page, where detailed information is
available.
Application Benchmarks
I. Microsoft
Exchange (LoadSim) http://www.backoffice.microsoft.com/downtrial/moreinfo/loadsimulator.asp
Vendor
Summary: Microsoft Exchange Load Simulator
(LoadSim) Version 5.5 is a multi-client messaging
emulation tool for the MAPI protocol. It is used
to test the performance of Microsoft Exchange
Server under varying message loads. Specific
workloads can be defined in Load Simulator to
exercise servers in a controlled manner. This
information can then be used to help determine
the maximum and optimum number of users per
server, identify performance bottlenecks and
evaluate server hardware performance.
This test measures the overall performance of
Exchange Server when used as a messaging
platform. Typical client actions that are
simulated by LoadSim include creating, reading,
deleting, and replying to messages of varying
sizes. The test involves applying a standard load
and measuring response time. This test used the
standard LoadSim "medium user" profile,
which is similar to the typical business user. A
valid LoadSim test is recorded when 95
percent of the LoadSim test cycles had a response
time of less than one second and there were no
leftover work items in the Exchange Server
queues.
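For illustration, the validity criterion just described can be expressed as a small Python check. This is not part of LoadSim itself; the function name and response-time figures below are hypothetical.

# Illustrative sketch only: checks the LoadSim validity criterion described
# above -- at least 95 percent of test cycles must complete in under one
# second and no work items may remain in the Exchange Server queues.

def loadsim_run_is_valid(response_times_ms, leftover_queue_items):
    """Return True if 95% of cycles finished in < 1000 ms and the queues are empty."""
    within_one_second = sum(1 for t in response_times_ms if t < 1000)
    pct = within_one_second / len(response_times_ms) * 100
    return pct >= 95.0 and leftover_queue_items == 0

# Hypothetical example: 1,000 cycles, 96 percent under one second.
sample_times = [250] * 960 + [1200] * 40
print(loadsim_run_is_valid(sample_times, leftover_queue_items=0))  # True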
Metrics: The Exchange LoadSim
tests yield metrics for maximum number of users
supported (LoadSim Exchange Users) under
different types of workload mixes.
DIGITAL Commentary: The LoadSim
benchmark is a comprehensive test, measuring
overall throughput of the system. The benchmark
simulates a certain amount of mail load (quantity
and size of mail) over a certain period to be
delivered ("send queue" emptied) to
another server within a certain time frame. By
using different LoadSim settings, you may be able
to increase the number of users supported, but
actual delivery of that mail may be allowed to
take more time. To perform these tests, DIGITAL
used the medium workload setting.
The DIGITAL Server Models 7105 and 7310 offer
the best performance and scalability for
enterprise Exchange installations in the
industry. Where server consolidation is a
customer goal, the additional scalability of the
Alpha-based 7310 will contribute to lower cost of
ownership. The Intel-based DIGITAL Server 7105
beats competitors' offerings on
price/performance, based on a comparison of
results that are publicly available. Independent
consultants have written about the choice of
Alpha for Exchange; please refer to the following
reports for additional information. http://www.digital.com/messaging
"Rightsizing Microsoft
Exchange: When to Choose an ALPHA
Solution" (Sept 97) Creative
Networks, Inc. "Bringing NT
and Exchange to the Enterprise---Digital
Takes a Commanding Lead", Aberdeen
Group (Nov 1997)
An interactive
Cost of Ownership calculation tool for Microsoft
Exchange developed by Creative Networks, Inc. can
be downloaded from http://www.partner.digital.com/sbu/9markets/mailmessage/salestools/serveredge/cooserver1.html
II. Lotus Domino NotesBench http://www.notesbench.org/
Vendor Summary: NotesBench is a tool developed by Lotus
Development Corporation that enables hardware
vendors and distribution partners to directly
provide Lotus Notes customers with relative
capacity and performance information on various
platforms and configurations. NotesBench is a
performance characterization tool that simulates
client load using remote emulators, executing
transactions against the server under test. It is
a closed tool; test results can be disclosed only
after a successful audit by an independent
company. In this way, each vendor's result can be
directly compared.
There are five test suites that comprise
NotesBench; however, the most commonly measured
suite is the active mail users test. In this
workload, users perform mail and simple shared
database operations, and the reporting metric is
the maximum number of users that can be supported
before response time becomes unacceptable.
Metrics: Results are provided in
number of users supported, and the throughput
measure is NotesMarks.
DIGITAL Commentary: DIGITAL continues
to hold the high watermark in the battle for
Windows NT NotesBench supremacy, achieving record
results in the number of concurrent mail users
supported. DIGITAL tested a range of Intel- and
Alpha-based servers. At 6,000 Mail users, a quad
processor DIGITAL Server 7305R (533 MHz Alpha
processors) exceeded all NotesBench results to
date on any Windows NT Server! The DIGITAL Server
5305 (Alpha-based dual processor) supports 4,000
Mail users. Both results surpassed their nearest
competitor by almost 17%! The DIGITAL Server 3305
(Alpha-based single processor) supports 2,000
Mail users. On the Intel side, the DIGITAL Server
7105 (four-processor Intel Pentium Pro) delivers
5,160 Mail users, while a DIGITAL Server 3200
(single processor Pentium Pro) delivers 1,950
Mail users.
DIGITAL has tested its servers using three
NotesBench workloads - Mail Only, MailDB, and
Groupware - roughly corresponding to Light,
Medium, and Heavy usage. For more information on
DIGITAL Servers with Lotus Domino solutions, see
the FAQs at URL http://www.digital.com/messaging/lotus/lotusfaq.html
Based upon audited results available to date,
DIGITAL Servers offer the best scalability and
server consolidation benefits in the industry, as
documented in the following Seybold Group paper.
"DIGITAL Servers in a
Notes/Domino Environment: Meeting
Enterprise Customer Needs," Patricia
Seybold Group (Sept. 1997)
Compute
& Throughput Benchmarks
I. AIM
Benchmarks http://www.aim.com/
The AIM Server Benchmark for Windows NT http://www.aim.com/NT_server.html
Sponsor Summary: AIM Technology's Server Benchmark
for Windows NT is a system-level
WIN32-compliant Benchmark for the Microsoft
Windows NT operating system. This benchmark
utilizes AIM's proven load-mix modeling
technology in a multi-threading and
multi-processing environment. It is designed to
test overall system performance of
standard Windows NT Server configurations on
Alpha and Intel platforms.
AIM Technology uses Load/Mix Modeling to test
how well servers perform under different
application loads. The role of Load/Mix modeling
is to allow AIM to apply any type of load to a
system running the Windows NT operating system.
The benchmark includes a pre-defined set of
application mixes to model the most general uses
of server systems. Two initial application mixes
for the Server Benchmark are: Domain Server Mix
and File Server Mix.
Domain Server Mix v2.0/Windows NT
The AIM Domain Server Mix/Windows NT is
composed of 50 different tests from all subsystem
categories. The Domain Server Mix represents a
balanced usage of subsystems that are configured
as a typical enterprise shared server. The major
tasks performed by the typical domain server
include light file transfers, network routing and
packet forwarding, email, shared applications
such as spreadsheets and word processors, and
network maintenance.
File Server Mix v2.0/Windows NT
The AIM File Server Mix/Windows NT is composed
of 37 different tests from all major subsystem
categories. The File Server Mix represents a
balanced usage of subsystems that are configured
as a gateway file server. The major tasks
performed by these file servers include file
transfers of various sizes (both synchronous and
asynchronous), network routing and packet
forwarding, system security and access permission
checking, heavy memory usage and IPC calls.
The AIM MultiUser Suite VII for UNIX
Servers
Multiuser systems are used for a wide variety
of reasons. The AIM multiuser benchmark was
designed to test the performance of systems
ranging from compute servers to file servers, as
well as multiuser systems that are used primarily
to maintain databases. This benchmark runs on the
most advanced systems and tests features required
by modern Open Systems multiuser environments.
The benchmark includes two standard mixes of
tests that cover the "standard" uses of
large computer systems. If the system is heavily
used as a shared application server or a file
server, you can use one of the standard mixes to
test the system. The standard mixes for the
multiuser benchmark are the Multiuser/Shared
Application Server Mix and the File Server Mix.
The Multiuser/Shared
Application Server Mix models a multiuser
environment emphasizing office automation: word
processing, spreadsheet, email, database,
payroll, and data processing. This mix represents
a broad use of different applications, as opposed
to a great deal of emphasis on one type of
application. This mix of tests models the wide
variety of operations that are commonly found on
shared multiuser systems. The mix includes
substantial testing of calculations, file system
interaction, shell operations, and program execution. There
is also some emphasis placed on Interprocess
Communications (IPC).
The File Server Mix models many integer
compute and file system operations in heavy
concentration. This mix helps users measure the
machine's I/O capabilities. Some emphasis is
placed upon non-I/O issues including integer
calculations, data searches and system
interactions. All tests are run locally on the
system and do not require a network connection.
DIGITAL Commentary: The performance of
the DIGITAL Server line on the AIM benchmarks is
well established by the AIM Hot Iron Awards (See
URL: http://www.aim.com/pm_awards.html).
Digital won 10 Awards in April 1998, more than
any other vendor, continuing its performance
sweep.
DIGITAL runs the MultiUser Suite VII tests
under the SCO UnixWare 2.1.1 operating system on
Intel-based DIGITAL Servers.
II. Ziff-Davis Inc. Benchmarks http://www1.zdnet.com/zdbop/zdbop2.html
Sponsor Summary:
NetBench 5.01 -- measures the
performance of a file server by measuring how
well it handles file I/O requests from as many as
four different client types: DOS, 32-bit
Windows, 16-bit Windows, and/or Mac OS
systems. The clients pelt the server with
requests for network file operations. Each client
tallies how many bytes of data it moves to and
from the server and how long the process takes.
The client uses this information to calculate its
throughput for that test mix. NetBench adds all
the client throughputs together to produce the
overall throughput for a server. Latest release:
4/21/97.
http://www1.zdnet.com/zdbop/netbench/netbench.html
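As a rough illustration of the arithmetic described above (not ZD's actual implementation), the following Python sketch derives each client's throughput from its byte and time tallies and sums them into the overall server figure; the client names and numbers are made up.

# Illustrative sketch of the NetBench bookkeeping described above: each
# client records bytes moved and elapsed time, computes its own throughput,
# and the overall server score is the sum of the client throughputs.
# All figures below are hypothetical.

clients = [
    {"name": "dos-1",   "bytes_moved": 180_000_000, "seconds": 600},
    {"name": "win32-1", "bytes_moved": 420_000_000, "seconds": 600},
    {"name": "win16-1", "bytes_moved": 260_000_000, "seconds": 600},
]

per_client = {c["name"]: c["bytes_moved"] / c["seconds"] for c in clients}
overall = sum(per_client.values())

for name, tput in per_client.items():
    print(f"{name}: {tput / 1_000_000:.2f} MB/s")
print(f"Overall server throughput: {overall / 1_000_000:.2f} MB/s")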
ServerBench
4.01 -- measures the performance of application
servers in a client/server environment by running
tests that produce different types of load on the
server. The ServerBench test environment includes
the server you're testing, its PC clients, and a
PC designated as the controller (you execute
and monitor test suites from the controller).
The clients and the controller must run either
Windows 95 or Windows NT. The server may run any
one of a number of operating systems. Latest
release: 12/19/97.
http://www1.zdnet.com/zdbop/svrbench/svrbench.html
WebBench
™ 2.0 -- measures the performance of Web
server software by returning two overall server
results: the total requests per second the Web
server handled for all the clients in the test
and the server throughput (in bytes per second).
WebBench provides both static standard test
suites and dynamic standard test suites. The
static test suites access only HTML, GIF, and a
few sample executable
files. They do not run any programs on the
server. The dynamic test suites execute
applications that actually run on the server.
They use CGI applications created for several
server platforms.
http://www1.zdnet.com/zdbop/webbench/webbench.html
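As a rough illustration (hypothetical figures, not ZD code), the two overall WebBench results reduce to simple totals divided by the test duration:

# Hypothetical WebBench-style aggregation: total requests per second handled
# for all clients, plus overall throughput in bytes per second.
test_seconds = 300
clients = [
    {"requests": 45_000, "bytes_received": 520_000_000},  # static-suite client
    {"requests": 12_000, "bytes_received": 95_000_000},   # dynamic (CGI) client
]
requests_per_second = sum(c["requests"] for c in clients) / test_seconds
bytes_per_second = sum(c["bytes_received"] for c in clients) / test_seconds
print(f"{requests_per_second:.1f} requests/sec, {bytes_per_second:,.0f} bytes/sec")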
The following
server evaluation is from the November 1997
Personal Computer Magazine, published by
Ziff-Davis: Pentium Pro and Pentium II Server
Review: DIGITAL Servers (formerly Prioris) shine
across the board! http://www.zdnet.com/products/content/pccg/1011/pccg0102.html
NetBench: The
NetBench test showcases a server's ability to
traffic files from different types and numbers of
clients. In other words, it measures I/O
throughput. The Digital Prioris delivered speedy
performance even when more than 32 clients were
making requests. The other servers tended to lose
steam under heavy loads, with the Polywell Poly
2X266TD2 bringing up the rear.
WebBench: If your Web site is underperforming
at bringing in revenues, it's quite possible that
your server is just plain underperforming. In Web
server benchmarks, Pentium Pro systems like the
SAG STF QuadPro strained when more than 32
requests were made. The Digital Prioris shone
across the board, even when handling multiple
requests.
ServerBench: Forget I/O for a minute and ask
yourself how fast a server delivers applications.
The Xi NetRAIDer soared with up to 4 clients
attached, but stumbled with 60. The 266 MHz
Pentiums outpaced the 200 MHz SAG STF QuadPro when
loads were low, but SAG's server pulled ahead
when network traffic resembled the Santa Monica
Freeway at rush hour. The Digital Prioris HX 6266
was the only system that held up nicely across
the board.
DIGITAL Commentary: We appreciate
ZD's complimentary evaluation of the
Intel-based DIGITAL Server (formerly Prioris)
family, and would note only that, since this lab
evaluation was conducted, DIGITAL has introduced
two upgrades of its Intel-based DIGITAL Server
models.
III. The Standard Performance Evaluation
Corporation (SPEC Suite) http://www.specbench.org/
SPEC CPU95: Metrics include SPECint95,
SPECfp95, SPECint_base95, SPECint_rate95, etc.
for Integer and Floating Point compute speed, and
SPECrates for throughput. The benchmark was
announced in August '95.
Sponsor Summary: SPEC is a non-profit
corporation formed to establish and maintain
computer benchmarks for measuring component- and
system-level computer performance. SPEC95
is a software benchmark product produced by SPEC.
It was designed to provide comparable measures of
performance for comparing compute-intensive
workloads on different computer systems. SPEC95
contains two suites of benchmarks:
CINT95: for measuring/comparing
compute-intensive integer performance (commercial
applications).
CFP95: for measuring/comparing
compute-intensive floating point performance
(scientific/numeric).
Being compute-intensive benchmarks, these
benchmarks emphasize the performance of the
computer's processor, the memory architecture and
the compiler. It is important to remember the
contribution of the latter two components;
performance is more than just the processor. The
CINT95 and CFP95 benchmarks do not stress other
computer components such as I/O (disk drives),
networking or graphics, as the percentage of time
spent in operating system and I/O functions is
generally negligible. Note that it may be
possible to configure a system in such a way that
one or more of these components impact the
performance of CINT95 and CFP95. However, that is
not the intent of the suites.
Peak (optimized) vs.
'Baseline' (conservative) measurements
In 1994, the SPEC Open Systems Steering
Committee decided to introduce "baseline
results." The results (for both speed and
throughput measurements) have to be measured with
more restrictive run rules, regulating the use of
compiler/linker optimization options
("flags"). As a general guideline, a
system vendor is expected to endorse the general
use of the baseline options by customers who seek
to achieve good application performance.
The intention is that baseline results
represent the performance a not-so-sophisticated
user would achieve, whereas the traditional
"peak" rules allow a selection of
optimization flags that is more typical for
sophisticated users. When SPEC's CPU benchmark
results are reported, the reports must include
baseline results. Baseline-only reporting is
allowed. A test sponsor is free to mention only
peak results in marketing literature, but
baseline results must be available and provided
upon request.
The base metrics (i.e.,
"SPECint_base95") are required for all
reported results and have set guidelines for
compilation (i.e., the same flags must be used in
the same order for all benchmarks). The non-base
metrics (i.e., "SPECint95") are
optional and have less strict requirements (i.e.,
different compiler options may be used on each
benchmark).
1. Speed Measurement
There are several different ways to measure
computer performance. One way is to measure how
fast the computer completes a single task; this
is a speed measure. The SPEC speed metrics (i.e.,
SPECint95) are used for comparing the ability of
a computer to complete single tasks. The result
("SPEC Ratio" for each individual
benchmark) is expressed as the ratio of a fixed
"SPEC reference time" to the wall clock
time needed to execute one single copy of the
benchmark, so a larger ratio means a faster
system. For the CPU95 benchmarks, a
Sun SPARCstation 10/40 was chosen as the
reference machine.
The following metrics ("weighted
averages" which are geometric means of 8-10
individual tests) have been defined for speed
measurements with the CPU95 benchmarks:
SPECint_base95
SPECfp_base95
SPECint95
SPECfp95
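For illustration only, the following Python sketch shows the arithmetic behind these speed metrics as described above: each benchmark's SPEC Ratio is the reference time divided by the measured run time, and the reported metric is the geometric mean of the individual ratios. The timing figures are invented.

# Illustrative sketch of the SPEC CPU95 speed arithmetic described above.
# Each pair is (reference_time_seconds, measured_time_seconds) for one
# benchmark; the values are hypothetical.

import math

benchmarks = [(3700, 420), (2100, 250), (4600, 510), (1900, 230)]

ratios = [ref / measured for ref, measured in benchmarks]          # SPEC Ratios
speed_metric = math.exp(sum(math.log(r) for r in ratios) / len(ratios))  # geometric mean

print(f"Individual SPEC Ratios: {[round(r, 2) for r in ratios]}")
print(f"Geometric mean (speed metric): {speed_metric:.2f}")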
2. Throughput (Rate) Measurement
Another way to measure performance is to
determine how many tasks a computer can
accomplish in a certain amount of time; this is
called a throughput, capacity or rate measure.
The SPEC rate metrics (e.g., SPECint_rate95) measure the
throughput or rate of a machine carrying out a
number of tasks. The results express how many
jobs of a particular type (characterized by the
individual benchmark) can be executed in a given
time. (The SPEC reference time happens to be one
24-hour day, with the execution times normalized
with respect to the SPEC reference machine). The SPEC
rates therefore characterize the capacity of a
system for compute-intensive jobs of similar
characteristics. Similar to the speed metric,
SPEC has defined averages for throughput metrics:
SPECint_rate_base95
SPECfp_rate_base95
SPECint_rate95
SPECfp_rate95
Note: Because of the different units, the
values SPECint95/SPECfp95 and
SPECrate_int95/SPECrate_fp95 cannot be compared
directly.
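The rate idea can be illustrated with a simplified Python sketch. This is a loose reading of the description above, not the exact SPEC rate formula, and all figures are hypothetical.

# Simplified illustration of a throughput (rate) measure: run several copies
# of a benchmark concurrently and express the result as the number of such
# jobs the system could complete in a 24-hour day, normalized to the
# reference machine.  All numbers below are invented.

SECONDS_PER_DAY = 24 * 60 * 60

def jobs_per_day(copies_run, elapsed_seconds):
    """Copies completed concurrently, scaled up to a 24-hour day."""
    return copies_run * SECONDS_PER_DAY / elapsed_seconds

# Hypothetical: a 4-processor system runs 4 copies in 600 seconds,
# while the reference machine runs 1 copy in 3700 seconds.
system_rate = jobs_per_day(copies_run=4, elapsed_seconds=600)
reference_rate = jobs_per_day(copies_run=1, elapsed_seconds=3700)

print(f"System:    {system_rate:.0f} jobs/day")
print(f"Reference: {reference_rate:.0f} jobs/day")
print(f"Normalized rate: {system_rate / reference_rate:.1f}x the reference")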
The appropriate SPEC benchmark or metrics to
use will depend on the customer's
performance requirements. For example, a single
user running a compute-intensive integer program
may only be interested in SPECint95 or
SPECint_base95. On the other hand, a person who
maintains a machine used by multiple scientists
running floating-point simulations may be more
concerned with SPECfp_rate95 or
SPECfp_rate_base95.
DIGITAL Commentary: Because SPEC
measures computational performance, DIGITAL runs
these tests for Alpha-based models of the DIGITAL
Server line. The superior performance results for
NT environments (generally a factor of 2-3 times
better than the results attainable using 32-bit
hardware) are attributable to Alpha's 64-bit
hardware and floating point computational
capabilities. DIGITAL believes that communities
and markets benefiting from fast computational
performance may wish to avail themselves of the
benefits of the Windows NT operating environment.
SPECweb96 http://www.specbench.org/osg/web96/
A standardized
benchmark for WWW servers, announced in July '96.
It measures basic GET performance of static
pages. The benchmark runs an HTTP engine on a
number of driving "client" systems that
will GET a variety of pages from the server that
is being tested.
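For illustration, a driving client of the kind described above amounts to a loop that issues GET requests and counts completions. The Python sketch below is not the SPECweb96 harness; the server URL, page list, and run length are placeholders.

# Illustrative sketch only -- not the SPECweb96 test harness.  A driving
# client repeatedly issues HTTP GET requests for static pages and reports
# how many operations per second the server sustained.

import time
import urllib.request

BASE_URL = "http://server-under-test.example.com"      # placeholder server
PAGES = ["/file1.html", "/file2.html", "/file3.html"]  # placeholder static pages
DURATION_SECONDS = 30

def drive_gets():
    completed = 0
    start = time.time()
    while time.time() - start < DURATION_SECONDS:
        page = PAGES[completed % len(PAGES)]
        with urllib.request.urlopen(BASE_URL + page) as resp:
            resp.read()            # fetch the full static page
        completed += 1
    elapsed = time.time() - start
    print(f"{completed / elapsed:.1f} GET operations/second")

if __name__ == "__main__":
    drive_gets()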
DIGITAL runs the SPECweb suite on the Alpha
platform for the UNIX environment, with
industry-leading results in each processor class
(1P, 2P, 4P). We believe that as Windows NT
becomes commonly used within the scientific
computing community over the next few years,
Alpha NT will continue to lead the pack. DIGITAL
will consider an NT-specific Intranet or Internet
test suite when a de facto standard emerges for
NT.
Database Benchmarks
I. TPC-C http://www.tpc.org/
Vendor Summary: TPC-C is a de facto
industry standard for On-Line Transaction
Processing (OLTP). The test includes 5 different
types of transactions:
New-order: enter a new order from a customer
Payment: update customer balance to reflect a
payment
Delivery: deliver orders (done as a batch
transaction)
Order-status: retrieve status of
a customer's most recent order
Stock-level: monitor warehouse inventory
Metrics generated include tpm-C
(transactions per minute) and $/tpm-C.
An extract from the Transaction Processing
Performance Council's Web site (FAQs):
Q: What do TPC throughput numbers mean?
A: Throughput, in TPC terms, is a measure of
maximum sustained system performance. In TPC-C,
throughput is defined as how many New-Order
transactions per minute a system generates while
the system is executing four other transactions
types (Payment, Order-Status, Delivery,
Stock-Level). All five TPC-C transactions have a
certain user response time requirement, with the
New-Order transaction response time set at 5
seconds. Therefore, for a 710 tpmC number, a
system is generating 710 New-Order transactions
per minute while fulfilling the rest of the TPC-C
transaction mix workload.
Q: What do the TPC's price/performance numbers
mean?
A: TPC's price/performance numbers (e.g., $550
per tpmC) include much more than just the initial
cost of the computer or host machine. In general,
TPC benchmarks are system-wide benchmarks,
encompassing almost all cost dimensions of an
entire system environment the user might
purchase, including terminals, communications
equipment, software (transaction monitors and
database software), computer system or host,
backup storage, and three years' maintenance cost.
Therefore, if the total system cost is $859,100
and the throughput is 1,562 tpmC, the
price/performance is derived by taking the price
of the entire system ($859,100) divided by the
performance (1,562 tpmC), which equals $550 per
tpmC.
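The same arithmetic, written out as a trivial Python calculation using the figures from the example above:

# Price/performance as described above: total priced-system cost divided by
# measured throughput in tpmC (figures taken from the example in the text).
total_system_cost_usd = 859_100   # entire priced configuration
throughput_tpmc = 1_562           # New-Order transactions per minute

price_performance = total_system_cost_usd / throughput_tpmc
print(f"${price_performance:.0f} per tpmC")   # $550 per tpmC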
Q: There are two ways to look at TPC results:
performance and price/performance. Is one more
important and how do I know which system has the
best TPC result?
A: Either performance or price/performance may
be more important, depending on your application.
If your application environment demands very
high, mission-critical performance, then
obviously you may want to give more weight to the
TPC's throughput metric. On the other hand, most
users are shopping within a given price range and
any throughput number must be balanced against
the cost of the system. Generally, the best TPC
results combine high throughput with low
price/performance.
DIGITAL Commentary: DIGITAL runs TPC-C
tests for most models of the DIGITAL Server line.
In general, we find that the Intel-based models
of the DIGITAL Server line are among the
price/performance leaders, and the Alpha-based
models provide additional scalability and faster
compute performance (but not necessarily faster
I/O performance) than the Intel models. Both of
these factors should be considered specific to
the customer's application environment when
making a choice among DIGITAL Server Models.
II.
TPC-D http://www.tpc.org/
DIGITAL
Commentary: TPC-D is a test suite used for
database queries in a data warehousing
environment. DIGITAL currently does not run these
tests under Windows NT; however, results are
posted for data warehousing tests on the AlphaServer
platform running DIGITAL UNIX. DIGITAL has run
tests for a demo database performing sales &
marketing queries on Oracle in an NT environment;
these results compare Intel and Alpha platform
performance and will be published at URL: http://www.digital.com/info/performance.dir.html
DIGITAL evaluated the performance of the
DIGITAL Server family using industry-standard
benchmarks. These benchmarks allow comparisons
across vendors' systems. Performance
characterization is just one "data
point" to be used in conjunction with other
purchase criteria such as features, service, and
price.
For more information on DIGITAL Servers for
Windows NT, visit our web site at
http://www.windows.digital.com/ or contact your
local DIGITAL sales representative. Please send
questions and comments about the information
presented in this Performance Flash to Internet
address: csgperf@zko.dec.com.
DIGITAL believes the information in this
publication is accurate as of its publication
date; such information is subject to change
without notice. DIGITAL is not responsible for
any inadvertent errors. DIGITAL conducts its
business in a manner that conserves the
environment and protects the safety and health of
its employees, customers, and the community.
DIGITAL, the DIGITAL logo, the
AlphaPowered logo, DIGITAL Servers and
AlphaServer are trademarks of Digital
Equipment Corporation.
The Intel Inside logo is a registered
trademark of Intel Corporation.
AIM is a trademark of AIM Technology, Inc.
NetBench, ServerBench, and WebBench are
trademarks of Ziff-Davis Inc.
SPEC, SPECint95, SPECfp95, SPECrate_int95,
and SPECrate_fp95 are trademarks of the
Standard Performance Evaluation Corporation.
TPC-C and TPC-D Benchmarks and tpm-C are
trademarks of the Transaction Processing
Performance Council.
Windows NT and Exchange are trademarks of
Microsoft Corporation.
Lotus and Domino are trademarks of Lotus
Development Corporation.
UNIX is a registered trademark in the
United States and other countries,
exclusively licensed through X/Open Company
Ltd.
SCO UnixWare is a trademark of the Santa
Cruz Operation, Inc.
Copyright 1998 Digital Equipment
Corporation. All rights reserved.