10:00 - 10:10
PMBS Introduction and Welcome

Steven Wright
University of York, United Kingdom

Session 1: Best Papers

10:10 - 10:30
Performance Modeling of Streaming Kernels and Sparse Matrix-Vector Multiplication on A64FX

Christie Alappat, Jan Laukemann, Thomas Gruber, Georg Hager, Gerhard Wellein
Erlangen Regional Computing Center, Germany

Nils Meyer, Tilo Wettig
University of Regensburg, Germany

10:30 - 11:00
The Performance and Energy Efficiency Potential of FPGAs in Scientific Computing

Tan Nguyen, Samuel Williams, Colin MacLean, Douglas Doerfler, Nicholas J. Wright
Lawrence Berkeley National Laboratory, CA

Marco Siracusa
Politecnico di Milano, Italy

11:00 - 11:10 Break

Session 2: Benchmarking

11:10 - 11:30
Benchmarking Julia’s Communication Performance: Is Julia HPC ready or Full HPC?

Sascha Hunold, Sebastian Steiner
TU Wien, Austria

11:30 - 12:00
Evaluating the Performance of NVIDIA’s A100 Ampere GPU for Sparse and Batched Computations

Hartwig Anzt, Yuhsiang M. Tsai, Terry Cojean
Karlsruhe Institute of Technology, Germany

Ahmad Abdelfattah
University of Tennessee, TN

Jack Dongarra
University of Tennessee, TN
Oak Ridge National Laboratory, TN
University of Manchester, United Kingdom

12:00 - 12:30
Exploiting the Potentials of the Second Generation SX-Aurora TSUBASA

Ryusuke Egawa
Tokyo Denki University, Japan

Tsuyoshi Yamashita, Daisuke Sasaki, Hiroyuki Takizawa
Tohoku University, Japan

Souya Fujimoto, Yoko Isobe, Yoichi Shimomura
NEC Corporation, Japan

12:30 - 13:00
Lightweight Measurement and Analysis of HPC Performance Variability

Jered Dominguez-Trujillo, Keira Haskins, Soheila Jafari Khouzhani, Christopher Leap, Sahba Tashakkori, Quincy Wofford, Trilce Estrada, Patrick G. Bridges
University of New Mexico, NM

Patrick M. Widener
Sandia National Laboratories, NM

13:00 - 14:30 Lunch Break

Session 3: Performance Portability and Optimization

14:30 - 15:00
Autotuning PolyBench Benchmarks with LLVM Clang/Polly Loop Optimization Pragmas Using Bayesian Optimization

Xingfu Wu, Michael Kruse, Prasanna Balaprakash, Hal Finkel, Valerie Taylor, Paul Hovland
Argonne National Laboratory, IL

Mary Hall
University of Utah, UT

15:00 - 15:30
Warwick Data Store: A Data Structure Abstraction Library

Richard O. Kirk, Gihan R. Mudalige
University of Warwick, Coventry, United Kingdom

Martin Nolten, Robert Kevis, Timothy R. Law, Satheesh Maheswaran, Seimon Powell
AWE, United Kingdom

Steven A. Wright
University of York, United Kingdom

Stephen A. Jarvis
University of Birmingham, United Kingdom

15:30 - 16:00
Accelerating High-Order Stencils on GPUs

Ryuichi Sai, John Mellor-Crummey, Xiaozhu Meng
Rice University, TX

Mauricio Araya-Polo, Jie Meng
Total E&P Research and Technology, TX

16:00 - 16:15 Break

Session 4: Modelling and Simulation

16:15 - 16:45
Developing Models for the Runtime of Programs With Exponential Runtime Behavior

Michael Burger, Giang Nam Nguyen, Christian Bischof
Technical University of Darmstadt, Germany

16:45 - 17:15
Performance Trade-offs in GPU Communication: A Study of Host and Device-initiated Approaches

Taylor Groves, Khaled Ibrahim, Lenny Oliker, Nicholas J. Wright, Samuel Williams, Katherine Yelick
Lawrence Berkeley National Laboratory, CA

Ben Brock
University of California, Berkeley, CA

Yuxin Chen
University of California, Davis, CA

17:15 - 17:45
Evaluation of the Communication Motif for a Distributed Eigensolver using the SST Network Simulation Tool

Md Afibuzzaman, Hasan Metin Aktulga
Michigan State University, MI

Pieter Maris
Iowa State University, IA

Taylor Groves, Dossay Oryspayev, Brandon Cook, Chao Yang
Lawrence Berkeley National Laboratory, CA

17:45 - 17:50

Simon Hammond
Sandia National Laboratories, NM