ASCR Monthly Computing News Report - January 2010
In this issue...
PNNL Computational Science Programming Model Crosses the Petaflop Barrier
Researchers at Pacific Northwest National Laboratory (PNNL) and Oak Ridge National Laboratory (ORNL) demonstrated that the PNNL-developed Global Arrays computational programming model can perform at the petascale level. The demonstration sustained 1.3 petaflops using more than 200,000 processors, about 50 percent of the processors’ theoretical peak capacity. Global Arrays is one of only two programming models that have achieved this level of performance.
Global Arrays enables researchers to access global data more efficiently by allowing data stored on multiple compute nodes to be read and written directly, rather than requiring coordination between a sender and a receiver. This also lets researchers run bigger models and simulate larger systems, resulting in a better understanding of the data and processes being evaluated. See the Global Arrays Toolkit website.
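The one-sided access model described above can be sketched in a few lines of Python. The toy class below is purely illustrative (the real Global Arrays toolkit is a C/Fortran library running across many physical nodes, and none of these names come from its API): a logically global array is partitioned into per-node blocks, and any process may read or write a remote element directly, with no matching receive call on the owning side.

```python
# Toy sketch of one-sided access to a partitioned global array.
# Illustrative only -- not the actual Global Arrays API.

class ToyGlobalArray:
    def __init__(self, size, num_nodes):
        self.block = size // num_nodes          # elements per simulated node
        # one storage block per simulated node
        self.nodes = [[0] * self.block for _ in range(num_nodes)]

    def put(self, index, value):
        # one-sided write: the caller computes the owner and writes directly
        self.nodes[index // self.block][index % self.block] = value

    def get(self, index):
        # one-sided read: no coordination with the owning node is required
        return self.nodes[index // self.block][index % self.block]

ga = ToyGlobalArray(size=8, num_nodes=4)
ga.put(5, 42)                 # lands in node 2's block
print(ga.get(5))              # -> 42
print(ga.nodes[2])            # -> [0, 42]
```

The point of the sketch is the absence of a paired send/receive: the writer alone decides where the data goes, which is what lets Global Arrays avoid two-sided coordination.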
ORNL Supercomputer Shines Light on Thermoelectric Material
A research team led by General Motors physicist Jihui Yang has used the Oak Ridge Leadership Computing Facility's (OLCF's) Jaguar supercomputer to nail down the arrangement of atoms within a promising thermoelectric material. The material, known by the acronym LAST, is a mixture of lead and tellurium speckled with small clumps of silver and antimony atoms. These clumps - known as nanoprecipitates - subtly alter the flow of electrons and phonons (units of vibrational energy) through the material, allowing it to convert heat energy directly into electricity. Thermoelectric materials promise to boost vehicle fuel efficiency by converting the waste heat otherwise lost through a tailpipe into electricity.
The team conducted an unprecedentedly large simulation of a system containing more than 1,700 atoms, calculating the structure of the material directly from the principles of quantum mechanics. It discovered that silver atoms in the nanoprecipitates do not replace lead atoms, as previously assumed, but rather sit between atomic positions. Transmission electron microscopy data from Brookhaven National Laboratory validated the results. The team reported its findings in the October 2, 2009, issue of the journal Physical Review Letters.
PERI-Related Research at LLNL Will Be Highlighted at the IPDPS Conference
Three Performance Evaluation Research Institute (PERI)-related papers involving Lawrence Livermore National Laboratory (LLNL) researchers Bronis de Supinski and Martin Schulz will be presented at the 24th IEEE International Parallel and Distributed Processing Symposium (IPDPS) to be held April 19-23, 2010, in Atlanta, Georgia. IPDPS is an international forum for engineers and scientists from around the world to present their latest research findings in all aspects of parallel computation. The three papers are entitled "Using Focused Regression for Accurate Time-Constrained Scaling of Scientific Applications," "Power-Aware MPI Task Aggregation Prediction for High-End Computing Systems" and "Hybrid MPI/OpenMP Power-Aware Computing," and involve collaborators at the University of Crete, University of Arizona, University of Georgia and Virginia Tech.
Worm’s Eye View: Molecular Worm Algorithm Navigates inside Chemical Labyrinth
James Sethian and Maciej Haranczyk of Lawrence Berkeley National Laboratory’s (LBNL’s) Computational Research Division have developed a “molecular worm” algorithm that makes it easier and faster to simulate the passage of a molecule through the labyrinth of a chemical system. Such progressions are critical to catalysis and other important chemical processes, but current computer simulations of these events are expensive and time-consuming to carry out. Their paper entitled “Navigating molecular worms inside chemical labyrinths” appears in the Proceedings of the National Academy of Sciences.
A key to the success of this new algorithm was its departure from the traditional treatment of molecules as hard spheres with fixed radii. Instead, Haranczyk and Sethian constructed “molecular worms” from blocks connected by flexible links. These molecular worms provide a more realistic depiction of a molecule’s geometry, thereby providing a more accurate picture of how that molecule will navigate through a given chemical labyrinth.
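The geometric intuition can be illustrated with a hypothetical toy model (this is not the authors' actual algorithm, just a sketch of why flexible linked blocks behave differently from a rigid sphere): a worm threads a narrow channel segment by segment, so only its widest individual block has to fit, while a hard sphere must fit its entire diameter at once.

```python
# Toy contrast between the hard-sphere picture and a "molecular worm"
# built from linked blocks. Purely illustrative numbers and functions.

def sphere_passes(diameter, channel_width):
    # A rigid sphere fits only if its full diameter fits the channel.
    return diameter <= channel_width

def worm_passes(segment_diameters, channel_width):
    # A flexible worm threads through one block at a time, so only the
    # widest individual block has to fit.
    return max(segment_diameters) <= channel_width

channel = 1.0
# The same molecule modeled two ways: as one sphere of diameter 2.4, or
# as a worm of three linked blocks of diameter 0.8 (same overall extent).
print(sphere_passes(2.4, channel))            # -> False
print(worm_passes([0.8, 0.8, 0.8], channel))  # -> True
```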
Trilinos Enables Impact for Both ASCR and ASC Applications
The Trilinos toolkit, a major software product supported by both the ASC and ASCR programs at Sandia National Laboratories, is being extended to further strengthen connections between ASCR and ASC through stronger software integration. The Mesquite mesh quality improvement library will soon become a package within Trilinos, increasing the mesh and geometry capabilities of Trilinos. In addition, a new release (v3.2) of the Zoltan dynamic load balancing library was included in Trilinos v10 and it provides improved integration with Trilinos as well as new graph partitioning capabilities. Inclusion of Mesquite and Zoltan in Trilinos allows new algorithmic research, in this case funded by ASCR (and, specifically, the SciDAC ITAPS Center and CSCAPES Institute), to more quickly and easily impact not only ASCR applications but also ASC applications that build on Trilinos. It also allows these libraries to leverage the high-performance software infrastructure from Trilinos to more easily implement new algorithms and test them in large-scale applications.
PNNL’s Moe Khaleel Elected AAAS Fellow
Moe Khaleel, director of the Computational Sciences and Mathematics Division at PNNL and a PNNL Laboratory Fellow, was elected a Fellow of the American Association for the Advancement of Science (AAAS). He was recognized for his leadership in computational engineering, which involves designing and developing computational tools to solve engineering and scientific problems. He focuses on computational models for solid oxide fuel cells and advanced lightweight materials. He also develops methods and computational tools that allow scientists and engineers to build and test fuel cells and their material components, speeding up the development of energy technologies. Additionally, he created a cost-effective process for forming aluminum sheet materials that are now used to make lightweight vehicles.
As director of PNNL’s Computational Sciences and Mathematics Division, Khaleel leads the effort to provide scientific and technological solutions through the integration of high-performance computing, data-intensive computing, computational sciences, mathematics, scalable data management, and bioinformatics to advance the Laboratory’s mission areas.
Berkeley Lab’s Juan Meza Elected Fellow of AAAS
Juan Meza of LBNL’s Computational Research Division was elected a Fellow of the American Association for the Advancement of Science (AAAS). Meza was cited “for exemplary service to the federal energy laboratories and professional societies in enhancing research and research participation.”
In addition to his role as head of the High Performance Computing Research Department, Meza is heavily involved in outreach to math and science students, particularly those in under-represented groups. He has received numerous awards for his efforts in this area.
LLNL Scientist Greg Bronevetsky Wins an Early Career Research Award
Greg Bronevetsky was awarded an early career research grant in computer science for his proposal entitled “Reliable High Performance Peta- and Exa-Scale Computing.” The overall goal of the project is to improve our understanding of the effect of failures on real computing systems and applications. The proposed approach centers on developing statistical models that describe fault propagation through individual system modules and abstraction layers. These models will be able to predict the effects of any set of system faults, making it possible to develop vulnerability profiles of application and system software that will allow developers to protect them against the most likely or most critical errors. The proposed methodology will be applied in two model systems: (1) the effect of component failures and performance degradations on MPI applications on large HPC systems, and (2) the impact of soft fault-induced data corruptions on numerical applications. Bronevetsky graduated with a computer science Ph.D. from Cornell University in 2007 and was awarded the Lawrence Post-Doctoral Fellowship at LLNL in 2006. He became a staff scientist in the Center for Applied Scientific Computing (CASC) in early 2009.
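The fault-injection idea behind such vulnerability profiles can be illustrated with a minimal sketch (hypothetical toy code, far simpler than the statistical models the project proposes): flip one random bit in one input, rerun the computation, and measure what fraction of the injected faults actually change the final answer. Many faults are masked before they reach the output, and that masking rate is exactly what a vulnerability profile captures.

```python
# Minimal fault-injection sketch: measure how often a random single-bit
# flip in the input corrupts an application's final decision.
# Illustrative only -- not the project's actual methodology.
import random

def app_result(data, fault=None):
    total = 0
    for i, x in enumerate(data):
        if fault is not None and i == fault[0]:
            x ^= 1 << fault[1]          # inject a single-bit flip
        total += x
    return total > 10                   # the application's final decision

def vulnerability(data, trials=2000, seed=1):
    rng = random.Random(seed)
    good = app_result(data)
    corrupted = sum(
        app_result(data, (rng.randrange(len(data)), rng.randrange(8))) != good
        for _ in range(trials)
    )
    return corrupted / trials           # fraction of faults that propagate

# Most random bit flips leave the thresholded answer unchanged (masked),
# so the measured vulnerability is well below 1.
print(vulnerability([3, 1, 4, 1, 5]))
```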
LBNL’s Kamesh Madduri Wins First SIAG/SC Junior Scientist Prize
Kamesh Madduri, an LBNL Alvarez Fellow working in the Computational Research Division, has been selected as the first winner of the Junior Scientist Prize established by the SIAM Activity Group on Supercomputing (SIAG/SC). This prize is awarded to an outstanding junior researcher in the field of algorithms research and development for parallel scientific and engineering computing, for distinguished contributions to the field in the three calendar years prior to the year of the award.
Madduri’s research interests include parallel graph algorithm design for complex network analysis, scientific data analysis query optimization using bitmap indexes, and performance tuning of particle-in-cell simulations on multicore architectures. He is one of the principal developers of SNAP (Small-World Network Analysis and Partitioning), an extensible parallel framework for exploratory analysis and partitioning of large-scale networks. The prize will be awarded at the 2010 SIAM Parallel Processing Conference in Seattle, February 24–26. At the conference, Madduri will give a 15-minute presentation on his research. For more information, please visit:
Sandia’s Bochev Named Editor in Chief of SIAM Journal on Numerical Analysis
Pavel Bochev started a three-year term as editor in chief of SIAM Journal on Numerical Analysis. He succeeded Tom Manteuffel, who edited the journal from May 2006 to the end of 2009. SIAM Journal on Numerical Analysis, established in 1964, is one of the most respected journals in its field, with high impact across scientific and engineering computing.
ORNL Computing’s Suzy Tichenor Named to HPCwire’s “People to Watch” List
HPCwire has named Suzy Tichenor of Oak Ridge National Laboratory (ORNL) to its “People to Watch in 2010” list. Tichenor launched an industrial partnerships program in the computing and computational sciences directorate that provides American companies access to the two leadership-class high-performance computing (HPC) centers at Oak Ridge. The list was published in the Jan. 4 issue of HPCwire.
Tichenor’s program provides companies access to the Department of Energy-funded Jaguar, the most powerful supercomputer in the world, and Kraken, the most powerful academic supercomputer, funded by the National Science Foundation through a partnership with the University of Tennessee. Companies are able to use these HPC systems to tackle their toughest computational problems. These are the challenges whose solutions provide breakthrough scientific understanding and competitive opportunity. In the process, companies also advance their abilities to use leadership computing.
LBNL’s John Bell Named Co-Organizer of AMS Von Neumann Symposium
John Bell, head of the Center for Computational Sciences and Engineering at LBNL, and longtime collaborator Alejandro Garcia of San Jose State University have been selected as organizers of the 2011 von Neumann Symposium on Multimodel and Multialgorithm Approaches to Multiscale Problems. The symposium is sponsored by the American Mathematical Society and held every four years. The symposium will bring groups together in four key areas (fluids, solids, earth sciences, and molecular dynamics), and will enable applied mathematicians and scientists to discuss current practices and future research directions in the development of hybrid methodologies for multiscale phenomena. The AMS von Neumann Symposia are made possible by the generous support of a fund established by Dr. and Mrs. Carrol V. Newsom in honor of the memory of John von Neumann.
ESnet Gets a Jump on Implementing DNS Security
The Department of Energy has finished implementing Domain Name System Security Extensions (DNSSEC) on its high-performance Energy Sciences Network (ESnet), using a commercial appliance to digitally sign Domain Name System records and manage cryptographic keys. The signed records were published in December 2009, ahead of any mandate from the U.S. Office of Management and Budget (OMB) requiring government networks outside of the .gov domain to do so. In August 2008 the OMB required that all top-level .gov domains be signed by February 2009, while those immediately under the .gov domain had until the end of 2009 to implement DNSSEC. Because ESnet (www.es.net) uses the .net top-level domain, it was not obliged to sign by the OMB mandate. Nevertheless, ESnet decided to be in compliance should OMB expand its mandate.
Proposals Invited for ALCF’s Early Science Program
The Argonne Leadership Computing Facility (ALCF) is now inviting proposals for time allocations on its next-generation, 10-petaflops IBM Blue Gene system. Allocations through the Early Science Program (ESP) are for preproduction hours (between system installation and full production) beginning in early 2012. More than four billion core hours are available. The early science phase of the project encompasses a period of several months between when the machine is first installed at the ALCF and when the system moves into full production. This period will provide projects with a significant head start for adapting to the new machine and access to substantial computational time. Given that this is a shakedown period, users will need to be ready to diagnose a possibly unstable environment and collaborate with ALCF staff to identify the root causes of problems and help develop better solutions.
Proposals for Argonne’s ESP are due April 29, 2010 and must include a detailed plan for the science to be accomplished and a description of what application development will occur throughout the duration of the award. In addition, each selected project’s home institution must pursue a Non-disclosure Agreement (NDA) with IBM for access to needed information on the next-generation architecture. For more details or to submit your proposal, visit the ALCF website at the following URL:
Fan Upgrade Latest Green Effort at OLCF
A recent fan upgrade will save Oak Ridge National Laboratory’s computing complex $150,000 a year in energy costs, helping the center to operate more efficiently and reduce its carbon footprint. The cooling improvement will allow the laboratory’s Computer Science Building’s (CSB’s) twenty 30-ton air conditioning units to operate at peak efficiency and is just the latest in a series of steps by the laboratory to reduce its energy consumption while maintaining two of the world’s fastest computers.
For example, the CSB was among the first Leadership in Energy and Environmental Design (LEED)-certified computing facilities in the country and has one of the best power usage effectiveness ratings of any large-scale data center. Furthermore, a new cooling system dubbed ECOphlex, used by the Cray supercomputers known as Jaguar and Kraken (the world’s fastest and third-fastest supercomputers, respectively), allows the laboratory to reduce the amount of chilled water used to cool the systems. Because thousands of gallons of water per minute are needed to keep Jaguar cool, a reduction in the volume of chilled water required means a proportionate reduction in the energy used to cool it.
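For reference, power usage effectiveness (PUE) is simply a facility's total energy draw divided by the energy delivered to the IT equipment itself, with 1.0 as the theoretical ideal. The figures in this sketch are invented for illustration and are not ORNL measurements.

```python
# PUE = total facility power / IT equipment power (1.0 is ideal).
# The kilowatt figures below are made-up illustrative numbers.
def pue(it_kw, cooling_kw, other_kw):
    return (it_kw + cooling_kw + other_kw) / it_kw

print(round(pue(it_kw=5000, cooling_kw=1000, other_kw=250), 2))  # -> 1.25
```

Improvements like the fan upgrade show up directly in this ratio: cutting cooling power lowers the numerator while the IT load stays fixed.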
NEOS Hits 2 Million Submissions Milestone
NEOS, the Network-Enabled Optimization System developed by researchers at Argonne National Laboratory in conjunction with Northwestern University, has reached a new milestone — 2 million submissions to its optimization software. NEOS has been used extensively for a variety of applications, including modeling electricity markets, predicting global protein folding, and training artificial neural networks. NEOS users include industry, financial institutions, engineering firms, and research laboratories.
The 2 million milestone also reflects the growing use of the NEOS server by students and faculty in both undergraduate and graduate classes. Argonne researchers, supported by ASCR, continue to add new solvers to the server.
SuperLU Library Downloads Hit a New Record
SuperLU, a general-purpose library for the direct solution of large, sparse, nonsymmetric systems of linear equations on high performance machines, has hit a new record for downloads. Between Oct. 1, 2008, and Oct. 31, 2009, the software was downloaded 9,983 times: 5,719 downloads of sequential SuperLU, 2,485 of SuperLU_DIST, and 1,779 of SuperLU_MT.
According to developer Sherry Li of Berkeley Lab’s Scientific Computing Group, SuperLU_MT (the shared-memory multithreaded version) saw relatively more downloads than in previous years because multicore systems are now widely used. SuperLU was developed by Li, Jim Demmel of UC Berkeley and CRD, John Gilbert of UC Santa Barbara, Laura Grigori of INRIA in France, and Meiyue Shao of Umeå University in Sweden. For more information about SuperLU, see the following link:
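At its core, a direct solver like SuperLU factors the matrix as A = LU with pivoting and then solves by forward and back substitution. The dense, sequential Python sketch below shows only that idea; the real library's value lies in exploiting sparsity, supernodal structure, and parallelism, none of which this toy attempts.

```python
# Dense toy version of the direct-solve idea behind SuperLU:
# LU factorization with partial pivoting, then triangular solves.
# Illustrative only -- the real library is sparse and parallel.

def lu_solve(A, b):
    n = len(A)
    A = [row[:] for row in A]           # work on copies
    b = b[:]
    # elimination with partial pivoting; A is overwritten with L\U
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(A[i][k]))
        A[k], A[p] = A[p], A[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]
            A[i][k] = m                 # store the multiplier (L factor)
            for j in range(k + 1, n):
                A[i][j] -= m * A[k][j]
            b[i] -= m * b[k]            # apply the forward solve on the fly
    # back substitution with the U factor
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = b[i] - sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / A[i][i]
    return x

# Solve 2x + y = 3, x + 3y = 5.
x = lu_solve([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0])
print([round(v, 10) for v in x])        # -> [0.8, 1.4]
```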
OUTREACH & EDUCATION:
ESnet Staff to Share Expertise at Joint Techs Meeting
ESnet projects and expertise will play a central role at the Winter 2010 Joint Techs meeting being held Jan. 31–Feb. 4 in Salt Lake City. The meeting, an international conference of network engineers, is being hosted by the University of Utah. ESnet staff will present an overview of the planned 100 Gbps Advanced Networking Initiative, an update on the network monitoring tool perfSONAR, a talk on domain name security, and a report on the network’s deployment of IPv6 SNMP (Simple Network Management Protocol) management as a production application. ESnet head Steve Cotter will also give an overview of ESnet.
Established in 1998, the conference is co-sponsored by the ESnet Site Coordinating Committee (ESCC) and Internet2 and brings together networking experts to share their know-how, present innovative strategies, and work toward common solutions to some of the biggest challenges facing research and education networks. These efforts are dedicated to ensuring that researchers have the networking resources they need to support the large data flows and distributed collaborations that make up today’s science research environment. For more information about the workshop, go to:
Workshop Focuses on Manycore, Accelerator-Based Computing for Science
Online, at conferences, and in theory, manycore processors and accelerators such as GPUs and FPGAs are being viewed as the next big revolution in high performance computing (HPC). If they live up to their potential, these accelerators could someday transform how computational science is performed, providing far more computing power and energy efficiency. In fact, they are already helping to drive significant scientific research projects: not bundled together in large systems, but rather one server at a time. In early December, a group of astronomers, physicists, and HPC experts gathered at the SLAC National Accelerator Laboratory near San Francisco to discuss how GPUs and FPGAs are meeting their unique needs. The three-day workshop was co-organized by Lawrence Berkeley National Laboratory, NERSC, SLAC, and Stanford’s Kavli Institute for Particle Astrophysics and Cosmology.
The workshop was organized as a part of an ongoing effort to develop infrastructure for enabling physics and astronomy data problems by utilizing these emerging technologies.
OLCF to Host SciApps-10 Workshop
From August 3–6, the Oak Ridge Leadership Computing Facility (OLCF) will host current and potential leadership computational researchers as they share knowledge and best practices on the implementation of a wide range of computationally intensive applications at the SciApps-10 workshop. The theme of the workshop is “Challenges and Opportunities for Scientific Applications: Learning to sustain the Petaflop with eyes on the Exaflop horizon.”
In addition to near- and medium-term application requirements, leading computational scientists will discuss leadership computing allocation programs, high-performance computing architecture, scientific mission goals, software engineering practices, and case studies for nearly a dozen leading-edge applications in climate science, astrophysics, materials and nanoscience, chemistry, biology, and nuclear energy. For more information, see
ALCF Getting Started Workshop Offers Hands-on Help to Users
The Argonne Leadership Computing Facility (ALCF) hosted an INCITE Getting Started Workshop at Argonne National Laboratory for 2010 INCITE awardees on January 27–29, 2010. The workshop provided researchers who are conducting both new and renewed INCITE projects with valuable information and technical details. Nine of 13 new INCITE teams were represented. Topics covered at the workshop included ALCF service offerings, the view from Germantown, the INCITE program, infrastructure at the ALCF, Blue Gene/P architecture and software environment, visualization, and performance evaluation. Throughout the workshop, ALCF staff provided hands-on assistance in porting and tuning applications on the Blue Gene/P. Awardees also had the opportunity to have their questions answered by ALCF staff. The final day of the workshop featured a tour of Argonne’s supercomputing facility, which houses the 557-teraflops Blue Gene/P system.
NERSC, OLCF to Co-Host Cray XT5 Workshop in February
NERSC, along with the Oak Ridge Leadership Computing Facility (OLCF) and the National Institute for Computational Science (NICS) at the University of Tennessee and Oak Ridge National Laboratory, will present a joint Cray XT5 Workshop February 1–3, 2010, at Sutardja Dai Hall on the UC Berkeley campus.
The workshop is designed to provide an in-depth introduction to using the world’s newest and largest Cray XT5 systems. Representatives and staff from NERSC, OLCF, NICS, Cray, and AMD will explain how to use XT5 systems productively. Katie Antypas and Richard Gerber from NERSC are among the presenters. The workshop is aimed at both new and intermediate users of the Cray XT5 who already have some high performance computing experience — knowledge of Linux, a programming language such as C/C++ or Fortran, and some exposure to parallel programming concepts with the Message Passing Interface (MPI). Hands-on sessions will use Cray XT systems at NERSC, OLCF, and NICS.