G E E K   P A G E    Issue 2.02 - February 1996

Petaflops Computing

By Jarrett S. Cohen



Last summer, the United States Department of Energy and the Intel Corporation announced a US$46 million agreement to build the world's first teraflops computer, able to crunch 1 trillion floating-point operations per second. While six times faster than existing machines, a teraflops computer is still not powerful enough to adequately model important problems like Earth's climate patterns or the AIDS virus, observers say. For such applications, 1,000 times more processing power is necessary - a petaflops.

A petaflops is equal to 10 times the speed of all the networked computers in the US combined. Sound far-fetched? Scientists and engineers in a NASA-led "constructive lunatic fringe group," as participant Seymour Cray calls it, are already planning computers with this capability. No one is sure exactly how these machines will work, but they will require fundamental changes in the way computers are designed.

Intel's system reaches a teraflops by linking more than 9,000 Pentium Pro chips. A continuation of this semiconductor-based massively parallel approach is one option for reaching a petaflops. Projections show that by 2020, 40,000 chips working together would perform at that level. But this scheme would require 10 megawatts of continuous power, enough to supply a small town! The result? Instantaneous meltdown.
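
The arithmetic behind the meltdown is easy to check. Here is a minimal sketch in Python; the per-chip figures are simply inferred from the article's totals:

    # Back-of-the-envelope check of the semiconductor scaling numbers.
    PETAFLOPS = 1e15            # floating-point operations per second
    chips = 40_000              # projected chip count for 2020
    total_power_watts = 10e6    # 10 megawatts of continuous power

    flops_per_chip = PETAFLOPS / chips          # 25 gigaflops each
    watts_per_chip = total_power_watts / chips  # 250 watts each

    print(f"Each chip must sustain {flops_per_chip / 1e9:.0f} gigaflops")
    print(f"...while dissipating {watts_per_chip:.0f} watts")  # hence the meltdown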

A way around this dilemma is to merge processing and memory onto the same piece of silicon. Processor-in-memory (PIM) chips improve performance by reducing the signal delays inherent in fetching data from memory. When you're operating at high speeds, distance really matters: light travels only one foot in a nanosecond. Approximately 10,000 future-generation PIM chips could achieve a petaflops using just 12 kilowatts of electricity for processing - a far more efficient design, but one that still generates too much heat.
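
To see how unforgiving distance becomes, consider how far a signal can travel in one clock tick. The sketch below assumes the vacuum speed of light, which is an upper bound; signals on real wires are slower:

    # Maximum distance a signal can cover in a single clock period.
    C = 3.0e8  # speed of light in a vacuum, meters per second

    for clock_hz in (100e6, 1e9, 10e9):
        cm_per_cycle = C / clock_hz * 100
        print(f"{clock_hz / 1e9:>4.1f} GHz clock: {cm_per_cycle:.0f} cm per cycle")
    # At 1 GHz, light covers only about 30 cm (a foot) per cycle, so
    # memory placed even inches from the processor costs whole cycles.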

A more exotic low-power alternative is very-high-speed logic built from superconductors. Cooled by liquid helium to 4.2 kelvins (-452 degrees Fahrenheit), superconductors lose all electrical resistance. That means low-power operation and no fear of overheating. Superconductor logic chips are also easy to manufacture: they are usually composed of just a few layers of the superconducting element niobium, one layer of a resistive film, and two or three layers of insulation.

Of course, the idea of using superconducting chips in computers isn't new. Projects sponsored by IBM and the Japanese government during the 1970s and 1980s developed chips that used low and high DC voltage levels to represent binary information, much as semiconductors do. The Japanese effort clocked its chips at 1 GHz, but today's fastest semiconductor chips already reach 300 MHz. Such a small speedup does not warrant the cost of cryogenic cooling.

Konstantin Likharev, professor of physics at the State University of New York at Stony Brook, says these earlier efforts were unsuccessful because researchers tried to force superconductors to work like semiconductors. He and a group of SUNY colleagues tread a different path - rapid single-flux-quantum (RSFQ) circuits.

These circuits take advantage of the natural interaction between superconductors and magnetism. The magnetic flux passing through a ring of superconducting material is quantized, or restricted to whole-number multiples of a tiny fundamental unit. These "flux quanta," so named because physicists characterize the field in terms of magnetic flux, can represent the 1s and 0s a computer needs to operate.
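
For the curious, the size of that fundamental unit follows directly from two physical constants. A sketch with rounded values:

    # One flux quantum is Planck's constant divided by twice the
    # elementary charge (the charge carriers are paired electrons).
    h = 6.626e-34   # Planck's constant, joule-seconds
    e = 1.602e-19   # elementary charge, coulombs

    phi_0 = h / (2 * e)
    print(f"One flux quantum = {phi_0:.3e} webers")  # about 2.07e-15 Wb
    # A ring holding n quanta versus n + 1 quanta can encode a 0 or a 1.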

The information passes from one ring to another through Josephson junctions, aluminum-oxide tunnel barriers that separate the superconducting materials. Each time a flux quantum enters or leaves a ring, a picosecond (one-trillionth of a second) pulse is generated across a junction. After passing through a converter that translates it into a voltage signal, this single-flux-quantum pulse switches subsequent circuits.
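
A handy consequence, sketched below with rounded constants and an assumed 2-picosecond pulse width: the voltage-time area of every such pulse equals exactly one flux quantum, so a picosecond-scale pulse can stand only about a millivolt tall.

    # Pulse area (voltage integrated over time) equals one flux quantum,
    # so the pulse width fixes the pulse height.
    phi_0 = 2.07e-15        # flux quantum, volt-seconds (webers)
    pulse_width_s = 2e-12   # assumed ~2 picosecond pulse

    avg_voltage = phi_0 / pulse_width_s
    print(f"Average pulse height: {avg_voltage * 1e3:.1f} mV")  # about 1 mV
    # Millivolt-scale pulses are why these circuits sip so little power.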

With these techniques, signals propagate between logic gates at nearly the speed of light. RSFQ chips now operate at 30 GHz, and 300 GHz is theoretically possible. Rates like these translate into chips 100 times faster than their semiconductor counterparts. By 2015, a petaflops computer should be attainable using 10,000 superconductor chips, each performing at 100 gigaflops.
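
Both claims are simple to verify - a quick sketch:

    # Checking the RSFQ speedup and the petaflops arithmetic.
    rsfq_clock_hz = 30e9     # demonstrated RSFQ operating rate
    semi_clock_hz = 300e6    # fastest semiconductor chips cited above

    print(f"Speedup: {rsfq_clock_hz / semi_clock_hz:.0f}x")  # 100x

    chips = 10_000
    flops = chips * 100e9    # 100 gigaflops per chip
    print(f"Total: {flops:.0e} flops")  # 1e+15 - one petaflops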

Since superconductor circuits can work with millivolt signals, extending today's technology would result in a single chip consuming only 0.1 watt. Taking into account the extra energy required for helium cooling, such advances should yield a 10,000-chip computer that consumes a total of 100 kilowatts.
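
A final power sketch; the 100-fold cooling overhead below is inferred from the article's totals rather than a measured figure:

    # Power budget for the 10,000-chip superconductor design.
    chips = 10_000
    watts_per_chip = 0.1

    raw_power = chips * watts_per_chip   # 1 kilowatt for the logic itself
    total_power = 100e3                  # 100 kilowatts including cooling

    print(f"Logic power: {raw_power / 1e3:.0f} kW")
    print(f"Cooling multiplies this by ~{total_power / raw_power:.0f}x")
    # Even so, 100 kW is one-hundredth of the 10 MW semiconductor projection.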

Petaflops computers most likely will use some combination of these future technologies. Thomas Sterling, senior scientist at the NASA/Goddard Space Flight Center, believes a hybrid architecture will provide the best overall price/performance. Semiconductors would supply random-access memory, while optics would provide mass storage and fast communication between cryogenic logic and room-temperature memory.

Leveraging mass production where possible, Sterling believes a petaflops computer will cost between $100 and $200 million. With the potential for widespread socioeconomic benefits, an idea that seems lunatic today begins to look like the mainstay of the early 21st century.

For more information, see cesdis.gsfc.nasa.gov/petaflops/peta.html.

Jarrett S. Cohen (jarrett.cohen@gsfc.nasa.gov) is a science writer with Hughes STX Corp.