
Wednesday, June 18, 2008

One Quadrillion Floating Point Operations per Second: "Roadrunner" Tops 31st TOP500 Supercomputer List


MANNHEIM, Germany; BERKELEY, Calif. & KNOXVILLE, Tenn.—With the publication of the latest edition of the TOP500 list of the world’s most powerful supercomputers today (Wednesday, June 18), the global high performance computing community has officially entered a new realm—a supercomputer with a peak performance of more than 1 petaflop/s (one quadrillion floating point operations per second).

The new No. 1 system, built by IBM for the U.S. Department of Energy’s Los Alamos National Laboratory and named “Roadrunner” by LANL after the state bird of New Mexico, achieved a performance of 1.026 petaflop/s, becoming the first supercomputer ever to reach this milestone. At the same time, Roadrunner is also one of the most energy-efficient systems on the TOP500.

The 31st edition of the TOP500 list was released at the International Supercomputing Conference in Dresden, Germany. Since 1993, the list has been produced twice a year and is the most extensive survey of trends and changes in the global supercomputing arena.

“Over the past few months, there were a number of rumors going around about whether Roadrunner would be ready in time to make the list, as well as whether other high-profile systems would submit performance numbers,” said Erich Strohmaier, a computer scientist at Lawrence Berkeley National Laboratory and a founding editor of the TOP500 list. “So, as the reports came in during recent weeks, it’s been both exciting and challenging to compile this edition.”

The Roadrunner system, based on IBM QS22 blades built with advanced versions of the processor found in the Sony PlayStation 3, displaces the reigning IBM BlueGene/L system at DOE’s Lawrence Livermore National Laboratory. BlueGene/L, with a performance of 478.2 teraflop/s (trillions of floating point operations per second), is now ranked No. 2 after holding the top position since November 2004.

Rounding out the top five positions, all of which are in the U.S., are the new IBM BlueGene/P (450.3 teraflop/s) at DOE’s Argonne National Laboratory, the new Sun SunBlade x6420 “Ranger” system (326 teraflop/s) at the Texas Advanced Computing Center at the University of Texas – Austin, and the upgraded Cray XT4 “Jaguar” (205 teraflop/s) at DOE’s Oak Ridge National Laboratory.

Intel continues to power an increasing number of systems, with Intel processors now found in 75 percent of the TOP500 supercomputers, up from 70.8 percent on the 30th list released last November.

Other highlights from the latest list include:

  • Quad-core processor-based systems have taken over the TOP500 quite rapidly: 283 systems already use them, 203 systems use dual-core processors, only eleven still use single-core processors, and three systems use IBM’s advanced Sony PlayStation 3 processor with 9 cores.
  • The top industrial customer, at No. 10, is the French oil company Total Exploration Production.
  • IBM held on to its lead in systems with 210 systems (42 percent), ahead of Hewlett-Packard with 183 systems (36.6 percent). Six months ago IBM had 232 systems (46.4 percent), compared to HP’s 166 systems (33.2 percent).
  • IBM remains the clear leader in the TOP500 list in performance, with 48 percent of installed total performance (up from 45 percent), compared to HP with 22.4 percent (down from 23.9 percent). In the systems category, Dell, SGI and Cray follow with 5.4 percent, 4.4 percent and 3.2 percent respectively.
  • The last system on the list would have been listed at position 200 in the previous TOP500 just six months ago. This is the largest turnover rate in the 16-year history of the TOP500 project.

For the first time, the TOP500 list will also provide energy efficiency calculations for many of the computing systems and will continue tracking them in a consistent manner; a minimal sketch of the calculation appears after the list below.

  • The most energy-efficient supercomputers are based on:
    • IBM QS22 Cell processor blades (up to 488 Mflop/s/Watt),
    • IBM BlueGene/P systems (up to 376 Mflop/s/Watt).
  • Intel Harpertown quad-core blades are catching up fast:
    • IBM BladeCenter HS21 with low-power processors (up to 265 Mflop/s/Watt),
    • SGI Altix ICE 8200EX Xeon quad-core nodes (up to 240 Mflop/s/Watt),
    • Hewlett-Packard Cluster Platform 3000 BL2x220 with double-density blades (up to 227 Mflop/s/Watt).
  • These systems are already ahead of BlueGene/L (up to 210 Mflop/s/Watt).
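
The efficiency figures above are simply sustained Linpack performance divided by total power draw. The sketch below shows that calculation; the numbers in the example are hypothetical and are not taken from any specific TOP500 entry.

```python
def mflops_per_watt(linpack_gflops: float, power_watts: float) -> float:
    """TOP500-style efficiency: sustained Linpack Mflop/s per watt of power."""
    return linpack_gflops * 1_000.0 / power_watts

# Hypothetical example: a machine sustaining 100 Tflop/s on Linpack while
# drawing 250 kW would score 400 Mflop/s/Watt.
print(mflops_per_watt(linpack_gflops=100_000.0, power_watts=250_000.0))  # 400.0
```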

Rounding out the Top 10 systems are:

  • The No. 6 system is the top system outside the U.S., installed in Germany at the Forschungszentrum Juelich (FZJ). It is an IBM BlueGene/P system and was measured at 180 Tflop/s.
  • The No. 7 system is installed at a new center, the New Mexico Computing Applications Center (NMCAC) in Rio Rancho, NM. It is built by SGI and based on the Altix ICE 8200 model. It was measured at 133.2 Tflop/s.
  • For the second time, India placed a system in the top 10. The Computational Research Laboratories, a wholly owned subsidiary of Tata Sons Ltd. in Pune, India, installed a Hewlett-Packard Cluster Platform 3000 BL460c system. They integrated this system with their own innovative routing technology and achieved a performance of 132.8 Tflop/s, which was sufficient for the No. 8 spot.
  • The No. 9 system is a new BlueGene/P system installed at the Institut du Développement et des Ressources en Informatique Scientifique (IDRIS) in France, which was measured at 112.5 Tflop/s.
  • The last new system in the top 10, at No. 10, is also an SGI Altix ICE 8200 system. It is the biggest system installed at an industrial customer, Total Exploration Production, and was ranked based on a Linpack performance of 106.1 Tflop/s.

The U.S. is clearly the leading consumer of HPC systems with 257 of the 500 systems. The European share (184 systems, up from 149) is still rising and is again larger than the Asian share (48 systems, down from 58).

Dominant countries in Asia are Japan with 22 systems (up from 20), China with 12 systems (up from 10), India with 6 systems (down from 9), and Taiwan with 3 (down from 11).

In Europe, the UK remains No. 1 with 53 systems (48 six months ago). Germany improved but is still in the No. 2 spot with 46 systems (31 six months ago).

The TOP500 list is compiled by Hans Meuer of the University of Mannheim, Germany; Erich Strohmaier and Horst Simon of NERSC/Lawrence Berkeley National Laboratory; and Jack Dongarra of the University of Tennessee, Knoxville.

Thursday, May 08, 2008

Climate Computer To Consume Less Than 4 Megawatts Of Power And Achieve A Peak Performance Of 200 Petaflops


BERKELEY, Calif. — Three researchers from the U.S. Department of Energy’s Lawrence Berkeley National Laboratory (Berkeley Lab) have proposed an innovative way to improve global climate change predictions by using a supercomputer with low-power embedded microprocessors, an approach that would overcome limitations posed by today’s conventional supercomputers.

In a paper published in the May issue of the International Journal of High Performance Computing Applications, Michael Wehner and Lenny Oliker of Berkeley Lab’s Computational Research Division, and John Shalf of the National Energy Research Scientific Computing Center (NERSC), lay out the benefits of a new class of supercomputers for modeling climate conditions and understanding climate change. Using the embedded microprocessor technology found in cell phones, iPods, toaster ovens and most other modern electronic conveniences, they propose designing a cost-effective machine for running these models and improving climate predictions.

In April, Berkeley Lab signed a collaboration agreement with Tensilica®, Inc. to explore such new design concepts for energy-efficient high-performance scientific computer systems. The joint effort is focused on novel processor and systems architectures using large numbers of small processor cores, connected together with optimized links, and tuned to the requirements of highly-parallel applications such as climate modeling.

Understanding how human activity is changing global climate is one of the great scientific challenges of our time. Scientists have tackled this issue by developing climate models that use the historical data of factors that shape the earth’s climate, such as rainfall, hurricanes, sea surface temperatures and carbon dioxide in the atmosphere. One of the greatest challenges in creating these models, however, is to develop accurate cloud simulations.

Although cloud systems have been included in climate models in the past, they lack the details that could improve the accuracy of climate predictions. Wehner, Oliker and Shalf set out to establish a practical estimate for building a supercomputer capable of creating climate models at 1-kilometer (km) scale. A cloud system model at the 1-km scale would provide rich details that are not available from existing models.

To develop a 1-km cloud model, scientists would need a supercomputer that is 1,000 times more powerful than what is available today, the researchers say. But building a supercomputer powerful enough to tackle this problem is a huge challenge.
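
One way to see where a factor like 1,000 comes from is the standard grid-refinement argument: finer horizontal spacing multiplies the number of grid columns and also forces a shorter timestep. The sketch below illustrates that scaling using a purely hypothetical 10 km baseline resolution; it is an illustration of the general argument, not the paper's own cost model.

```python
def cost_multiplier(baseline_km: float, target_km: float) -> float:
    """Rough compute-cost growth when a climate model's horizontal grid is
    refined: grid columns grow with the square of the refinement factor and
    the stable timestep shrinks roughly linearly with it (CFL condition),
    so total cost grows roughly with the cube of the refinement factor."""
    refinement = baseline_km / target_km
    return refinement ** 3

# Purely illustrative: refining from a hypothetical 10 km grid to 1 km
# implies on the order of 1,000x more computation per simulated year.
print(cost_multiplier(baseline_km=10.0, target_km=1.0))  # 1000.0
```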

Historically, supercomputer makers have built larger and more powerful systems by increasing the number of conventional microprocessors, usually the same kinds of microprocessors used to build personal computers. Although this approach is feasible for building computers large enough to solve many scientific problems, using it to build a system capable of modeling clouds at a 1-km scale would cost about $1 billion. The system would also require 200 megawatts of electricity to operate, enough energy to power a small city of 100,000 residents.

In their paper, “Towards Ultra-High Resolution models of Climate and Weather,” the researchers present a radical alternative that would cost less to build and require less electricity to operate. They conclude that a supercomputer using about 20 million embedded microprocessors would deliver the results and cost $75 million to construct. This “climate computer” would consume less than 4 megawatts of power and achieve a peak performance of 200 petaflops.
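
Dividing the quoted totals by the quoted processor count gives a rough sense of the per-processor targets behind the design. The per-core figures below are back-of-envelope derivations from the article's numbers, not values stated in the paper, and the efficiency figure is based on peak (not Linpack) performance.

```python
# Back-of-envelope reading of the proposed "climate computer", using only
# the totals quoted in the article (derived per-core figures are illustrative).
peak_flops     = 200e15   # 200 petaflop/s peak performance
power_watts    = 4e6      # less than 4 megawatts of power
num_processors = 20e6     # about 20 million embedded microprocessors

flops_per_core = peak_flops / num_processors     # ~10 Gflop/s per core
watts_per_core = power_watts / num_processors    # ~0.2 W per core
peak_mflops_per_watt = (peak_flops / 1e6) / power_watts  # ~50,000 (peak, not Linpack)

print(f"{flops_per_core / 1e9:.0f} Gflop/s per core, "
      f"{watts_per_core * 1e3:.0f} mW per core, "
      f"{peak_mflops_per_watt:,.0f} Mflop/s per watt at peak")
```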

“Without such a paradigm shift, power will ultimately limit the scale and performance of future supercomputing systems, and therefore fail to meet the demanding computational needs of important scientific challenges like climate modeling,” Shalf said.

The researchers arrive at their findings by extrapolating performance data from the Community Atmospheric Model (CAM). CAM, developed at the National Center for Atmospheric Research in Boulder, Colorado, is a series of global atmosphere models commonly used by weather and climate researchers.

The “climate computer” is not merely a concept. Wehner, Oliker and Shalf, along with researchers from UC Berkeley, are working with scientists from Colorado State University to build a prototype system in order to run a new global atmospheric model developed at Colorado State.

“What we have demonstrated is that in the exascale computing regime, it makes more sense to target machine design for specific applications,” Wehner said. “It will be impractical from a cost and power perspective to build general-purpose machines like today’s supercomputers.”

Under the agreement with Tensilica, the team will use Tensilica’s Xtensa LX extensible processor cores as the basic building blocks in a massively parallel system design. Each processor will dissipate a few hundred milliwatts of power, yet deliver billions of floating point operations per second and be programmable using standard programming languages and tools. This equates to an order-of-magnitude improvement in floating point operations per watt, compared to conventional desktop and server processor chips. The small size and low power of these processors allows tight integration at the chip, board and rack level and scaling to millions of processors within a power budget of a few megawatts.

Berkeley Lab is a U.S. Department of Energy national laboratory located in Berkeley, California. It conducts unclassified scientific research and is managed by the University of California. Visit our Website at www.lbl.gov.


Monday, October 29, 2007

NEC Announces the SX-9, the New World's Most Powerful Vector Supercomputer


Fujitsu and Hitachi in Japan, and IBM and Cray in the US, have been the teraflop giants, always competing to be the most powerful computer manufacturers.
Keeping up with that tradition, NEC of Japan on Thursday announced the launch of what it called the world's most powerful supercomputer on the market, the SX-9.
The SX-9 is the fastest vector supercomputer, with a peak processing performance of 839 TFLOPS(*1), and it features the world's first CPU capable of a peak vector performance of 102.4 GFLOPS(*2) per single core.

In addition to the newly developed CPU, the SX-9 combines large-scale shared memory of up to 1TB and ultra high-speed interconnects achieving speeds up to 128GB/second. Through these enhanced features, the SX-9 closes in on the PFLOPS(*3) range by realizing a processing performance of 839 TFLOPS. The SX-9 also achieves an approximate three-quarter reduction in space and power consumption over conventional models. This was achieved by applying advanced LSI design and high-density packaging technology.
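
The headline 839 TFLOPS figure is consistent with the per-CPU vector peak multiplied out to the largest configuration. A quick check using only numbers quoted in this post (512 nodes of 16 CPUs each is taken from the specifications below):

```python
# Consistency check of the quoted SX-9 peak vector performance.
peak_vector_gflops_per_cpu = 102.4   # per-CPU peak vector performance
max_cpus = 512 * 16                  # largest configuration: 512 nodes x 16 CPUs

system_peak_tflops = peak_vector_gflops_per_cpu * max_cpus / 1_000.0
print(system_peak_tflops)  # 838.8608 -> the ~839 TFLOPS quoted by NEC
```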

In comparison to scalar parallel servers(*5) incorporating multiple general-purpose CPUs, the vector supercomputer(*4) offers superior operating performance for high-speed scientific computation and ultra high-speed processing of large-volume data. The enhanced effectiveness of the new product will be clearly demonstrated in fields such as weather forecasting, fluid dynamics and environmental simulation, as well as simulations for as-yet-unknown materials in nanotechnology and polymeric design. NEC has already sold more than 1,000 units of the SX series worldwide to organizations within these scientific fields.
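
To make the vector-versus-scalar contrast concrete, the sketch below compares an element-by-element loop with a single whole-array operation. It uses Python with NumPy on an ordinary CPU purely as an analogy for the programming model; it is not a representation of how the SX-9's hardware vector pipelines are actually programmed.

```python
import numpy as np

n = 1_000_000
a = np.random.rand(n)
b = np.random.rand(n)

# Scalar style: one element at a time, a separate operation per element.
c_scalar = np.empty(n)
for i in range(n):
    c_scalar[i] = 2.0 * a[i] + b[i]

# Vector style: a single whole-array expression, analogous to issuing one
# vector instruction over an entire array of operands.
c_vector = 2.0 * a + b

assert np.allclose(c_scalar, c_vector)
```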

The SX-9 is loaded with "SUPER-UX," basic software compliant with the UNIX System V operating system that can extract maximum performance from the SX series. SUPER-UX is equipped with flexible functions that can deliver more effective operational management compatible with large-scale multiple node systems.
Powerful compiler and library groups and program-development support functions that maximize SX performance make the SX-9 a developer-friendly system. Application assets developed by users can also be integrated without modification, enabling full leverage of the ultra high-speed computing performance of the SX-9.

"The SX-9 has been developed to meet the need for ultra-fast simulations of advanced and complex large-capacity scientific computing," Yoshikazu Maruyama, senior vice president of NEC Corp., said in a statement.
NEC's supercomputers are used in fields including advanced weather forecasting, aerospace and in large research institutes and companies. The SX-9 will first go on display at a supercomputing convention next month in Reno, Nevada.

Specifications

Multi-node SX-9 (2 - 512 nodes):
  • Number of CPUs: 32 - 8,192
  • Logical peak performance: 3.8 TFLOPS - 969.9 TFLOPS
  • Peak vector performance: 3.3 TFLOPS - 838.9 TFLOPS
  • Memory architecture: shared and distributed memory
  • Main memory capacity: 1 TB - 512 TB
  • Peak memory data transfer rate: 2,048 TB/s
  • Internode crossbar switch (IXS) peak data transfer rate: 128 GB/s × 2, bidirectional, per node

Single-node SX-9/A (1 node):
  • Number of CPUs: 8 - 16
  • Logical peak performance: 947.2 GFLOPS - 1,894.4 GFLOPS
  • Peak vector performance: 819.2 GFLOPS - 1,638.4 GFLOPS
  • Memory architecture: shared memory
  • Main memory capacity: 512 GB or 1 TB
  • Peak memory data transfer rate: 4 TB/s

Single-node SX-9/B (1 node):
  • Number of CPUs: 4 - 8
  • Logical peak performance: 473.6 GFLOPS - 947.2 GFLOPS
  • Peak vector performance: 409.6 GFLOPS - 819.2 GFLOPS
  • Memory architecture: shared memory
  • Main memory capacity: 256 GB or 512 GB
  • Peak memory data transfer rate: 2 TB/s


(*1) TFLOPS: one trillion floating point operations per second.
(*2) GFLOPS: one billion floating point operations per second.
(*3) PFLOPS: one quadrillion floating point operations per second.
(*4) Vector supercomputer: a supercomputer with high-speed processors called "vector processors," used for scientific and technical computation. Vector supercomputers deliver high performance in complex, large-scale computations, such as climate, aerospace, environmental simulation and fluid dynamics, by processing whole arrays with a single vector instruction.
(*5) Scalar parallel supercomputer: a supercomputer with multiple general-purpose processors, suited to the simultaneous processing of many workloads, such as genomic analysis, or easily parallelized computations such as particle simulation. They deliver high performance by connecting many processors (of the kind also used for business applications) in parallel.

Saturday, August 11, 2007

U.S. National Science Board (NSB) Approves Supercomputers!

If nothing else, this should push the USA to the forefront of supercomputing power in the world. Perhaps the good old days of real American power will return once the projects are completed.
The U.S. National Science Board has authorized funding for two of the world's most powerful supercomputers, one of them capable of petaflop-speed operations.

The Times reported that documents inadvertently published on NSF's Web site identified IBM as the leading candidate to build a supercomputer called Blue Waters, which would be about 500 times more powerful than most current supercomputers. Blue Waters is expected to go live in 2011, and the National Science Board's decision Wednesday approves funding of US$208 million over four and a half years.

Blue Waters is expected to be able to make arithmetic calculations at a sustained rate in excess of 1,000 trillion operations per second, or one petaflop/s.

The National Science Board, which oversees NSF policies, also approved funding for a second, smaller supercomputer, intended to bridge the gap between current high-performance computers and more advanced petascale systems under development. The $65 million, five-year project will be located at the University of Tennessee at Knoxville's Joint Institute for Computational Science.

It would have a peak performance of just under one petaflop, almost four times the capacity of the current NSF-supported TeraGrid, the world's largest and most powerful distributed computing system for open scientific research. The TeraGrid currently supports more than 1,000 projects and more than 4,000 U.S. researchers.

IDG News via Networkworld