Conclusions
The push to achieve the largest and most complex scientific discoveries
using high-performance computing requires heroic efforts from
computational scientists, computing system designers, and software
developers. Critically, these tremendous efforts have proven to flow
downstream, making equally important but less computationally
demanding scientific discoveries tractable. By design, a calculation
that was entirely heroic a decade ago can now be achieved by a handful
of highly motivated graduate students. To usher in this
same downstream effect for data-driven science, sustained and heroic
efforts are needed to build and operate storage systems that can
support highly concurrent, low-latency access to massive volumes of
scientific data. Once this key underpinning has been developed and put
into use, it will enable the additional efforts needed to extract new
insight and invent new methods for accelerating data-driven scientific
discovery. And in several years, as the benefits of new methods for
analyzing data are realized and made commonplace, small teams of highly
motivated graduate students will perform data-driven searches for
discovery that cannot even be imagined within contemporary HPC data
centers. The road ahead, with its inevitable roadblocks and detours,
will be difficult and surprising, but the rewards at the end of this
journey are too great to resist.
Acknowledgment
This work was supported by the U.S. Department of Energy, Office of
Science, Advanced Scientific Computing Research, under Contract
DE-AC02-06CH11357.
Author Bios
Bradley Settlemyer is a senior scientist in Los Alamos National Laboratory’s HPC Design group. He received his Ph.D. degree in computer engineering from Clemson University in 2009 with a research focus on the design of parallel file systems. He currently leads the storage systems research efforts within Los Alamos’ Ultrascale Research Center, and his team is responsible for designing and deploying state-of-the-art storage systems for enabling scientific discovery. He is the Primary Investigator on projects ranging from ephemeral file system design to archival storage systems using molecular information technology, and he has published papers on emerging storage systems, long-distance data movement, system modeling, and storage system algorithms. Contact him at bws@lanl.gov.