HP AlphaServer technology helps Commerzbank tolerate disaster on September 11

testing disaster tolerance

While most large organizations today have plans for Disaster Tolerance (DT), few ever have to put them to the test. The North American headquarters of Commerzbank, located less than 100 yards from the World Trade Center in New York City, put its DT plan into action on September 11, 2001. Because Commerzbank relies on OpenVMS wide-area clustering, volume shadowing and AlphaServer GS160 systems from HP, its critical banking applications continued to run at the primary site and were available from the bank’s remote site, so the bank was able to keep functioning that day.

Foresight has long been an asset of Commerzbank AG, the parent bank, which was founded in 1870. Frankfurt-based Commerzbank has experienced and helped shape 130 years of German economic history — from empire to monetary union, from the gold mark to the euro. With consolidated total assets of roughly 500 billion euros, Commerzbank is one of Germany’s — and Europe’s — leading banks. The Commerzbank AG Group includes numerous subsidiaries in Germany and in 45 countries around the world. The New York branch, set up in 1971, was the first to be established in the U.S. by a German bank.

Commerzbank, North America is a wholesale bank that serves approximately 500 clients, many of whom are Fortune 500 companies. The bank specializes in areas such as corporate banking, syndications, real estate financing and public financing.

zero tolerance for downtime

According to Werner Boensch, Executive Vice President of Commerzbank, North America, “Our tolerance for downtime is zero.”

That imperative creates a challenge for the bank. Gene Batan, Vice President of the Systems and Information Technology Department of Commerzbank, North America, explains, “My primary concern is to minimize downtime to the point of zero — and ascertain that there is a redundancy of data in several locations. We need to ensure that there is no downtime on any critical production system at any point in time.”

Zero downtime is why the bank has run its critical systems on the OpenVMS operating system since the 1980s. According to Batan, “OpenVMS is the most secure and reliable operating system we have ever experienced.”

Like most enterprises today, Commerzbank has a multi-vendor computing environment. However, the bank runs its most critical banking applications on AlphaServer GS160 platforms. These applications include a money transfer system responsible for the bank’s connection to the Federal Reserve and the New York Clearing House, a trading system, a banking system that handles internal banking requirements, a letter of credit system, a futures and options system, and much more, all running under OpenVMS. The bank uses StorageWorks systems to store an impressive 2 terabytes of data using RAID 0, RAID 1 and RAID 5 technology in a SAN environment. HP Fibre Channel switches form the SANs.

To ensure constant uptime, the bank has one AlphaServer GS160 system at its primary site in downtown Manhattan and another at its remote site, which is 30 miles away in Rye, New York. Also at the remote site is a pair of AlphaServer 4100 systems. These servers are members of the OpenVMS cluster at the primary site, and their only role is to serve the remote drives to the primary location using the Mass Storage Control Protocol (MSCP) server, a part of the OpenVMS operating system. The disk drives are either RAID-configured or mirrored at the local and remote sites, and are volume shadowed as well.

Batan says, “Because of OpenVMS wide-area clustering, the storage at our remote site is always available and updated in real time.”
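The arrangement described above, in which every write reaches both the local and the remote drives, can be illustrated with a small Python sketch. This is a toy model only, not OpenVMS code: the class names, disk names and the rule of preferring a local member for reads are assumptions made for illustration, and the copy operation that a real shadow set performs when a member returns is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    """One disk in the toy shadow set (all names here are hypothetical)."""
    name: str
    site: str
    available: bool = True
    blocks: dict = field(default_factory=dict)


class ShadowSet:
    """Toy two-site shadow set: each write goes to every reachable member."""

    def __init__(self, members, local_site):
        self.members = list(members)
        self.local_site = local_site

    def _live(self):
        live = [m for m in self.members if m.available]
        if not live:
            raise RuntimeError("no shadow set members available")
        return live

    def write(self, block, data):
        # Replicate the write to every member that is still reachable.
        for member in self._live():
            member.blocks[block] = data

    def read(self, block):
        # Prefer a member at the local site, otherwise fall back to a remote one.
        live = sorted(self._live(), key=lambda m: m.site != self.local_site)
        chosen = live[0]
        return chosen.blocks[block], chosen.name


def demo():
    local = Member("manhattan-disk", "Manhattan")
    remote = Member("rye-disk", "Rye")
    shadow = ShadowSet([local, remote], local_site="Manhattan")

    shadow.write(0, "payment-1")
    before = shadow.read(0)          # served by the local member

    local.available = False          # local drives become unavailable
    shadow.write(1, "payment-2")     # still accepted, lands on the remote member
    after = (shadow.read(0), shadow.read(1))
    return before, after


if __name__ == "__main__":
    print(demo())
```

Because the remote member already holds every earlier write, the data written before the local drives were lost remains readable from Rye in this model, and new writes continue to be accepted.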

surviving the meltdown

Boensch explains what happened to the bank on September 11. “From a technology point of view, the first thing we lost was our communication link to the Federal Reserve and the New York Clearing House. One of our staff switched the links to our remote site. Since our AlphaServer GS160 system at our primary site was running, we started to receive payment messages from the Federal Reserve Bank of New York and the New York Clearing House.”

Commerzbank is located on floors 31 to 34 of the World Financial Center, which is west of the World Trade Center and across the West Side Highway. When the second jet hit, bank personnel evacuated the area immediately.

“Our main challenge was to get people from downtown to Rye, because the subways, trains and bridges were closed,” states Boensch. “The bank has a staff of over 400 people, but if we have about 10 people we can run the bank for about two days — that’s how we’re organized. We were able to get 16 people out to Rye that day, and then, as transportation became available, we had more people.”

For the next eight months, approximately two-thirds of the bank’s staff worked in Rye, and the other third worked at a subsidiary in midtown Manhattan until the primary site was ready for reoccupancy in mid-May 2002.

In addition to having a remote operation site, the bank’s DT plan includes comprehensive protection of its primary site. Boensch explains, “In our primary site, we have our own generator, fuel storage tank, cooling tower, uninterrupted power supply, battery backup system and fire suppression system — as well as extra CPUs and redundant drives. As a result, when the World Trade Center area lost power, our generator and cooling tower kicked in, so none of our systems were down initially. However, dust and debris from the collapse of the World Trade Center towers caused our AC units to fail during the day.”

From the remote site in Rye, the Systems team was able to monitor its primary data center back near Ground Zero. Boensch describes what happened after the staff evacuated the Manhattan site. “Because of the intense heat in our data center, all systems crashed except for our AlphaServer GS160. We lost one partition in this system due to the heat condition, which was 104 degrees in the QBBs (Quad Building Blocks). The other partition kept on running with remote drives only, since the local drives became unavailable as well. OpenVMS wide-area clustering and volume-shadowing technology kept our primary system running off the drives at our remote site 30 miles away.”

Batan describes the OpenVMS Galaxy Software, the technology that allows the bank to run multiple instances of OpenVMS in each of the AlphaServer system’s hard partitions. The OpenVMS instances simultaneously run different applications. “One hard partition failed, but the other — whose OpenVMS instances run the more critical applications — kept on running. With GS Series systems, one hard partition can fail without bringing down the whole system. So while most computers were having difficulties in the data center, OpenVMS Galaxy and the AlphaServer GS160 were so robust that even though one of the hard partitions, housed in the upper two QBBs, crashed due to heat, the other hard partition, housed in the lower two QBBs, kept on running multiple instances of OpenVMS. The money transfer system never went down and we actually remained operational that day.”
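The fault isolation Batan describes can be pictured with a short Python sketch. It is a simplified model under stated assumptions, not a representation of Galaxy internals: the QBB numbering and the instance name "less-critical-apps" are hypothetical, and only the placement of the money transfer system in the surviving lower partition follows the account above.

```python
from dataclasses import dataclass, field


@dataclass
class HardPartition:
    """Toy hard partition: a group of QBBs hosting OpenVMS instances."""
    name: str
    qbbs: tuple
    instances: list = field(default_factory=list)
    up: bool = True

    def fail(self):
        # A hardware fault stops this partition and every instance inside it,
        # but has no effect on any other partition object.
        self.up = False


def running_instances(partitions):
    """Return the instances hosted on partitions that are still up."""
    return [inst for part in partitions if part.up for inst in part.instances]


def build_primary_gs160():
    # QBB numbers and the first instance name are assumptions for illustration.
    upper = HardPartition("upper", qbbs=(2, 3), instances=["less-critical-apps"])
    lower = HardPartition("lower", qbbs=(0, 1), instances=["money-transfer"])
    return upper, lower


if __name__ == "__main__":
    upper, lower = build_primary_gs160()
    upper.fail()  # the upper partition crashes due to heat
    print(running_instances([upper, lower]))
```

In this model the failure of the upper partition removes only its own instances from the running set, which mirrors how the money transfer system stayed up on the lower partition.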

Boensch said that all of Commerzbank’s vendors extended their help. “Compaq, now part of the new HP, was there for us, asking if there was anything they could do. They were willing to offer hardware, but because of our remote site, we were OK.”

lessons learned

The bank’s DT environment is part of a larger business continuity plan, which was designed and maintained by the Systems and IT Department. “We not only test our systems and applications on a regular basis involving business users,” explains Boensch, “but we also have a call tree system to make sure that every member of Commerzbank, North America is aware of the process we have to go through in case there is a contingency condition. This plan was in place before September 11.”

According to Boensch, a survey from the Federal Reserve showed that banks that had their own DT sites handled September 11 much better than those that didn’t. “OpenVMS wide-area clustering and remote shadowing, which allow you to locate a remote site up to 500 miles away, greatly reduce your risk in case of a disaster.”

Batan maintains that the OpenVMS AlphaServer platform is the ideal way for Commerzbank to run its critical applications. “OpenVMS, with its clustering, volume shadowing, security and resilience, ensures high availability. The AlphaServer GS160 system, which we divided into two hard partitions and clustered the OpenVMS instances with our remote site 30 miles away, is a very robust machine and the Galaxy software provides the ability to share CPUs among the OpenVMS instances within each hard partition. The combination provides unbeatable tolerance, reliability and redundancy.”

Boensch describes the activities of Commerzbank’s Disaster Recovery (DR) site in non-disaster mode. “Our DR site is really dual purpose. The AlphaServer GS160 system is a standby production site in case of a disaster. But on a regular day-to-day basis, it’s up and running as a test and development system. Actually, the only things that are redundant in an active/active configuration are the StorageWorks data disks — they are truly dedicated both locally and remotely. We also use the site for training.”

As things return to normal for businesses around the World Trade Center area, and indeed around the world, OpenVMS on AlphaServer systems will continue to be the gold standard of availability, reliability and security, backed by people who will be there whenever they are needed.

additional information

For more information on how working with Hewlett-Packard can benefit you, contact your local HP service representative, or visit us on the Web at http://www.hp.com

Technical information contained in this document is subject to change without notice.

Compaq Computer Corporation is a wholly owned subsidiary of Hewlett-Packard Company. All other names and products are property of their respective companies.

© 2002 Hewlett-Packard Company. All Rights Reserved. Reproduction, adaptation or translation without prior written permission is prohibited, except as allowed under the copyright laws.

July 2002