Monday, October 29, 2007

New World's Most Powerful Vector Supercomputer From NEC, SX-9


Fujitsu and Hitachi in Japan, and IBM and Cray in the US, have long been the teraflop giants, competing with one another to build the most powerful computers.
Keeping up with that tradition, NEC of Japan on Thursday announced the launch of the SX-9, which it called the world's most powerful supercomputer on the market.
The SX-9 is the fastest vector supercomputer, with a peak processing performance of 839 TFLOPS(*1). It features the world's first CPU capable of a peak vector performance of 102.4 GFLOPS(*2) per single core.

In addition to the newly developed CPU, the SX-9 combines large-scale shared memory of up to 1TB and ultra high-speed interconnects achieving speeds up to 128GB/second. Through these enhanced features, the SX-9 closes in on the PFLOPS(*3) range by realizing a processing performance of 839 TFLOPS. The SX-9 also achieves an approximate three-quarter reduction in space and power consumption over conventional models. This was achieved by applying advanced LSI design and high-density packaging technology.
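As a rough check on the headline figure, the peak vector performance can be reproduced by multiplying the per-CPU peak by the CPU count of the largest configuration. The sketch below assumes the 8,192-CPU maximum listed in the specifications; the function and constant names are illustrative only, and this is back-of-the-envelope arithmetic rather than NEC's published method.

```python
# Back-of-the-envelope check of the quoted peak vector performance.
# Figures come from the article; names are illustrative, not NEC's.

PEAK_GFLOPS_PER_CPU = 102.4   # peak vector performance of one SX-9 CPU
MAX_CPUS = 8192               # largest multi-node configuration in the specifications


def system_peak_tflops(cpus: int, gflops_per_cpu: float = PEAK_GFLOPS_PER_CPU) -> float:
    """Return aggregate peak performance in TFLOPS (1 TFLOPS = 1,000 GFLOPS)."""
    return cpus * gflops_per_cpu / 1000.0


if __name__ == "__main__":
    print(f"{system_peak_tflops(MAX_CPUS):.1f} TFLOPS")
```

For 8,192 CPUs this gives 838.9 TFLOPS, which rounds to the 839 TFLOPS quoted above.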

In comparison to scalar parallel servers(*5) incorporating multiple general-purpose CPUs, the vector supercomputer(*4) offers superior operating performance for high-speed scientific computation and ultra high-speed processing of large-volume data. The enhanced effectiveness of the new product will be clearly demonstrated in fields such as weather forecasting, fluid dynamics and environmental simulation, as well as simulations for as-yet-unknown materials in nanotechnology and polymeric design. NEC has already sold more than 1,000 units of the SX series worldwide to organizations within these scientific fields.

The SX-9 is loaded with "SUPER-UX," basic software compliant with the UNIX System V operating system that can extract maximum performance from the SX series. SUPER-UX is equipped with flexible functions that can deliver more effective operational management compatible with large-scale multiple node systems.
The use of powerful compiler library groups and program development support functions to maximize SX performance makes the SX-9 a developer-friendly system. Application assets developed by users can also be integrated without modification, enabling full leverage of the ultra high-speed computing performance of the SX-9.

"The SX-9 has been developed to meet the need for ultra-fast simulations of advanced and complex large-capacity scientific computing," Yoshikazu Maruyama, senior vice president of NEC Corp., said in a statement.
NEC's supercomputers are used in fields including advanced weather forecasting, aerospace and in large research institutes and companies. The SX-9 will first go on display at a supercomputing convention next month in Reno, Nevada.

Specifications

                                  Multi-node                            Single-node
                                  2 - 512 nodes*3                       1 node
                                  SX-9                                  SX-9/A                    SX-9/B
Central Processing Unit (CPU)
  Number of CPUs                  32 - 8,192                            8 - 16                    4 - 8
  Logical Peak Performance*1      3.8T - 969.9TFLOPS                    947.2G - 1,894.4GFLOPS    473.6G - 947.2GFLOPS
  Peak Vector Performance*2       3.3T - 838.9TFLOPS                    819.2G - 1,638.4GFLOPS    409.6G - 819.2GFLOPS
Main Memory Unit (MMU)
  Memory Architecture             Shared and distributed memory         Shared memory             Shared memory
  Capacity                        1T - 512TB                            512GB, 1TB                256GB, 512GB
  Peak Data Transfer Rate         2048TB/s                              4TB/s                     2TB/s
Internode Crossbar Switch (IXS)
  Peak Data Transfer Rate         128GB/s×2 bidirectional (per node)    -                         -
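The multi-node maxima in the specifications appear to be the single-node SX-9/A maxima multiplied by 512 nodes. The sketch below checks that reading. It assumes every node is an identical, fully populated SX-9/A-class node, which the source does not state explicitly, and its names are hypothetical.

```python
# Scale per-node maxima (SX-9/A column) to the largest multi-node system.
# Assumption: all 512 nodes are identical, fully populated SX-9/A-class nodes.

NODES = 512

PER_NODE = {
    "cpus": 16,
    "logical_peak_gflops": 1894.4,
    "memory_tb": 1,
    "memory_bandwidth_tb_s": 4,
}


def aggregate(per_node: dict, nodes: int = NODES) -> dict:
    """Multiply every per-node quantity by the node count."""
    return {name: value * nodes for name, value in per_node.items()}


if __name__ == "__main__":
    for name, total in aggregate(PER_NODE).items():
        print(f"{name}: {total}")
```

Under that assumption the totals come to 8,192 CPUs, about 969.9 TFLOPS logical peak, 512TB of memory and 2048TB/s of memory bandwidth, matching the multi-node column.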


(*1) TFLOPS:
one trillion floating point operations per second
(*2) GFLOPS:
one billion floating point operations per second
(*3) PFLOPS:
one quadrillion floating point operations per second
(*4) Vector supercomputer:
A supercomputer with high-speed processor(s) called "vector processor(s)" that is used for scientific and technical computation. Vector supercomputers deliver high performance in complex, large-scale computation, such as climate, aerospace and environmental simulations and fluid dynamics, by processing whole arrays with a single vector instruction.
(*5) Scalar parallel supercomputer:
A supercomputer with multiple general-purpose processors, suitable for simultaneous processing of multiple workloads such as genomic analysis, or for easily parallelized computations such as particle computation. They deliver high performance by connecting many processors (of the kind also used for business applications) in parallel.
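The idea in note (*4), that one vector instruction handles a whole array segment, can be illustrated with a toy model that counts how many add "instructions" are issued for the same work. This is a conceptual sketch only: VECTOR_LENGTH is a hypothetical register length rather than an SX-9 specification, and the functions do not model real hardware timing.

```python
# Toy model: count the "instructions" needed to add two arrays.
# VECTOR_LENGTH is a hypothetical register length, not an SX-9 figure.

VECTOR_LENGTH = 256


def scalar_add(a, b):
    """Scalar style: one add instruction per element pair."""
    out, issued = [], 0
    for x, y in zip(a, b):
        out.append(x + y)
        issued += 1
    return out, issued


def vector_add(a, b, vlen=VECTOR_LENGTH):
    """Vector style: one add instruction per chunk of up to vlen elements."""
    out, issued = [], 0
    for start in range(0, len(a), vlen):
        chunk_a = a[start:start + vlen]
        chunk_b = b[start:start + vlen]
        # The whole chunk is treated as the work of a single vector instruction.
        out.extend(x + y for x, y in zip(chunk_a, chunk_b))
        issued += 1
    return out, issued


if __name__ == "__main__":
    a = list(range(1000))
    b = list(range(1000))
    _, n_scalar = scalar_add(a, b)
    _, n_vector = vector_add(a, b)
    print(n_scalar, n_vector)
```

For 1,000-element arrays the scalar version issues 1,000 adds while the vector version issues 4, one per 256-element chunk. Both produce the same result.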
