TensorFlow, Buddy Compression, Intel Optane & More

In this bimonthly feature, HPCwire highlights newly published research in the high-performance computing community and related domains. From parallel programming to exascale to quantum computing, the details are here.

Evaluating TensorFlow’s performance in HPC applications

TensorFlow – an emerging open-source framework that supports distributed applications on heterogeneous hardware – is gaining popularity for ML applications. In this paper – written by a team from KTH Royal Institute of Technology, South Park Commons, and Oak Ridge National Laboratory – the authors discuss the viability of TensorFlow for running HPC workloads on supercomputers. They design four traditional HPC benchmark applications and demonstrate that TensorFlow can take full advantage of high-performance networks and accelerators.

Authors: Steven W.D. Chien, Stefano Markidis, Vyacheslav Olshevsky, Yaroslav Bulatov, Erwin Laure and Jeffrey S. Vetter.

Achieving scalability and performance portability in high-performance weather and climate models

In this paper, Eike Müller from the University of Bath discusses LFRic – the new high-performance weather and climate model being developed by the UK Met Office to replace the Unified Model. Müller focuses on the “separation of concerns” between scientific code and parallel code, enabled by an application called PSyclone. The paper provides an overview of the scientific requirements, the design of the software infrastructure, and examples of PSyclone usage.

Author: Eike Müller

Using “buddy compression” to enable greater memory for deep learning and HPC workloads on GPUs

GPUs offer higher memory bandwidth than CPU-only systems, but their memory is relatively small and non-expandable. In this paper, written by a team from the University of Texas at Austin and Nvidia, the authors propose “buddy compression,” a system to increase the effective GPU memory capacity and bandwidth while avoiding the typical pitfalls of memory expansion techniques. They discuss initial results, which show performance comparable to a more powerful system.

Authors: Esha Choukse, Michael Sullivan, Mike O’Connor, Mattan Erez, Jeff Pool, David Nellans and Steve Keckler.

Optimizing the I/O performance of HPC applications with autotuning

Obtaining quality I/O performance for diverse applications on disparate platforms can be a major challenge – not least because of the interdependencies between I/O middleware and hardware. Optimizing across these interdependencies is complicated by the vast number of possible parameter combinations. In this paper, researchers from the University of Illinois and Lawrence Berkeley National Laboratory present an autotuning solution that can optimize I/O performance, demonstrating its benefits across several HPC platforms.

Authors: Babak Behzad, Surendra Byna, Prabhat and Marc Snir.

Developing a framework for lifecycle enrichment of HPC applications

As the exascale era draws nigh, researchers are faced with the daunting task of porting their existing applications and algorithms to new architectures and programming languages. In this dissertation, written by Karan Sapra of Clemson University, the author focuses on enriching the lifecycle of applications by providing an application to optimize architecture mapping and a framework to assist in optimizing heterogeneous environments.

Author: Karan Sapra

Assessing the impact of soft errors on large-scale FPGA cloud computing

FPGAs are often used to accelerate HPC applications – but they’re susceptible to “soft errors,” which can lead to silent data corruption or system instability. In this paper, written by researchers from the NSF Center for Space, High-Performance, and Resilient Computing, the authors investigate the failure rate of a handful of FPGA applications by performing fault injection experiments. They conclude that soft-error detection and mitigation may be necessary in certain systems.

Authors: Andrew M. Keller and Michael J. Wirthlin.

Measuring the performance of the Intel Optane DC Persistent Memory Module

This paper, written by a team of University of California San Diego computer scientists, presents the first comprehensive evaluation of Intel’s new Optane DC Persistent Memory Modules. The authors find that the Optane DIMMs can make key storage applications 17 times faster and that they can substantially expand main memory capacity without a significant loss of performance.

Authors: Joseph Izraelevitz, Jian Yang, Lu Zhang, Juno Kim, Xiao Liu, Amirsaman Memaripour, Yun Joon Soh, Zixuan Wang, Yi Xu, Subramanya R. Dulloor, Jishen Zhao and Steven Swanson.

Do you know about research that should be included in next month’s list? If so, send us an email at [email protected]. We look forward to hearing from you.

(Excerpt) Read more Here | 2019-03-20 22:44:21