LEXINGTON, Ky. -- Researchers at the University of Kentucky have built a clustered supercomputer out of groups of PCs that, they claim, brings the cost of supercomputing to a new low.
The clusters, known as "Beowulfs," have helped bring down the cost of supercomputing while giving scientific researchers a more hands-on relationship with their machines.
Called the Kentucky Linux Athlon Testbed 2 (Klat2), the scalable clustering system uses new network concepts and inexpensive Ethernet hardware to achieve a price/performance of $650 per Gflop.
Commercial supercomputers cost around $10,000 per Gflop, a figure the Beowulf approach has reduced to $3,000 per Gflop. The University of Kentucky team cut costs still further by using 100-Mbit/second Ethernet hardware in a new configuration called a "flat neighborhood" network, where previous clustered designs have relied on costlier gigabit networking hardware.
The brainchild of Hank Dietz, a professor of electrical engineering at the University of Kentucky, the system evolved out of a project started in 1994 to build computing clusters using Linux and off-the-shelf hardware. The goal is to use new networking technologies to make inexpensive, "personalized turnkey superclusters" that will allow scientists and engineers to economically solve computational problems.
Klat2 uses the 3DNow! 32-bit floating-point vector extensions to Advanced Micro Devices Inc.'s Athlon processor to improve performance. Sixty-four 700-MHz Athlons were used to achieve more than 64 Gflops on a benchmarking problem involving 40,960 simultaneous linear equations. The benchmark results ranked 150th in Jack Dongarra's list of the 500 fastest supercomputers in the world.
The most important cost-saving feature is the flat-neighborhood network topology. Ideally, every processor in the cluster would be directly connected to every other processor, but that is where the networking costs come in. The researchers eventually realized that something less complete in the way of interconnectivity would do just as well: at minimum, every pair of PCs in the cluster needs to share at least one switch. The flat-neighborhood topology thus allows any PC to reach any other PC in the cluster in a single switch hop, using only the network interface cards in the PCs.
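The flat-neighborhood property can be checked mechanically: given which PCs plug into which switch, verify that every pair of PCs shares at least one switch. The sketch below uses a hypothetical 8-PC, six-switch wiring (not the actual Klat2 layout, which used 64 nodes) to illustrate the idea.

```python
from itertools import combinations

# Hypothetical 8-PC cluster with six 4-port switches (illustrative only,
# not the actual KLAT2 wiring): each switch lists the PCs plugged into it.
# Each PC has 3 network interface cards, so it appears on 3 switches.
switches = {
    "sw0": {0, 1, 2, 3},
    "sw1": {4, 5, 6, 7},
    "sw2": {0, 1, 4, 5},
    "sw3": {0, 1, 6, 7},
    "sw4": {2, 3, 4, 5},
    "sw5": {2, 3, 6, 7},
}

def is_flat_neighborhood(switches, num_pcs):
    """True if every pair of PCs shares at least one switch, i.e.
    any PC can reach any other in a single switch hop."""
    for a, b in combinations(range(num_pcs), 2):
        if not any(a in pcs and b in pcs for pcs in switches.values()):
            return False
    return True

print(is_flat_neighborhood(switches, 8))  # True for this wiring
```

Removing any one of the cross-half switches (say "sw5") breaks the property, since pairs such as PC 2 and PC 6 would then share no switch.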
The concept works in part because the shared-switch arrangement provides many parallel single-hop paths rather than a single serial connection mediated by network protocols. Designing an effective topology for 64 nodes proved to be a daunting task, however, and the researchers turned to a design system based on genetic algorithms to find a high-performance configuration.
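A genetic-algorithm search of this kind can be sketched as follows. This is a toy illustration, not the actual design tool the Kentucky team built: the cluster size, fitness function (here, simply the number of PC pairs covered by a shared switch), and mutation scheme are all simplifying assumptions.

```python
import random
from itertools import combinations

NUM_PCS = 8        # toy cluster size (Klat2 itself used 64 nodes)
PORTS = 4          # ports per switch (assumption for this sketch)
NUM_SWITCHES = 6
PAIRS = list(combinations(range(NUM_PCS), 2))

def random_design():
    # A candidate wiring: each switch gets a random set of PCs.
    return [set(random.sample(range(NUM_PCS), PORTS))
            for _ in range(NUM_SWITCHES)]

def fitness(design):
    # Count distinct PC pairs that share at least one switch;
    # a perfect flat neighborhood covers all of PAIRS.
    covered = {p for sw in design for p in combinations(sorted(sw), 2)}
    return len(covered)

def mutate(design):
    # Swap one PC off a randomly chosen switch for one not on it.
    child = [set(sw) for sw in design]
    sw = random.choice(child)
    out_pc = random.choice(sorted(sw))
    new_pc = random.choice([p for p in range(NUM_PCS) if p not in sw])
    sw.remove(out_pc)
    sw.add(new_pc)
    return child

random.seed(1)
pop = [random_design() for _ in range(30)]
for _ in range(400):
    pop.sort(key=fitness, reverse=True)
    # Keep the 10 fittest designs, refill with mutated copies of them.
    pop = pop[:10] + [mutate(random.choice(pop[:10])) for _ in range(20)]
best = max(pop, key=fitness)
print(fitness(best), "of", len(PAIRS), "pairs covered")
```

Even this crude evolve-and-select loop converges toward high pair coverage on a small cluster; the real design problem at 64 nodes, with asymmetric switch sizes and bandwidth constraints, is far harder, which is why a search-based tool was needed at all.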