The Dashboard: HPC I/O Analysis Made Easy

Huong Luu, Amirhossein Aleyasen, Marianne Winslett, Yasha Mostofi, Kaidong Peng

Analyzing the I/O performance of high-performance computing applications can provide valuable insights for application developers, users, and platform administrators. However, such analysis is difficult and requires parallel I/O expertise that few users possess. Analyzing an entire platform's I/O workload is even harder, as it requires large-scale collection, cleaning, and exploration of data. To address this problem, we created a web-based dashboard for interactive analysis and visualization of application I/O behavior, based on data collected by a lightweight I/O profiler that can observe all jobs on a platform at low cost. The dashboard's target audience includes application users and developers who are starting to analyze their applications' I/O performance; system administrators who want to examine how their storage system is used and identify applications that are candidates for improvement; and parallel I/O experts who want to understand the behavior of an application or set of applications. The dashboard leverages relational database technology, a portable graphing library, and lightweight I/O profiling to provide insights into I/O behavior that were previously available only with great effort.