14:00 - 14:30
Adaptive Stopping Rule for Performance Measurements

Viyom Mittal, Pedro Bruel, Dejan Milojicic, Eitan Frachtenberg
Hewlett Packard Labs, USA

Performance variability in complex computer systems is a major challenge for accurate benchmarking and performance characterization, especially for tightly coupled, large-scale high-performance computing systems. Point summaries of performance may be both uninformative, if they do not capture the full richness of the system's behavior, and inaccurate, if they are derived from an inadequate sample set of measurements. Determining the correct sample set, and in particular its size, requires balancing trade-offs of computation, methodology, and statistical power.

In this paper, we treat the performance distribution as the primary target of the performance evaluation, from which all other metrics can be derived. We propose a meta-heuristic that characterizes the performance distribution as it is being measured, dynamically determining when enough samples have been collected to approximate the true distribution. Compared to predetermined fixed stopping criteria, this dynamic and adaptive method can be more efficient in resource use, since it can stop as soon as the desired certainty level is reached, and more accurate, since it does not stop prematurely. Importantly, it requires no advance knowledge or assumptions about the system under test or its performance characteristics.
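The abstract does not specify the heuristic's internals, so the following is a minimal sketch of one plausible formulation of a distribution-based stopping rule, not the authors' method: samples are collected in batches, and sampling stops once the empirical distribution stabilizes, measured here by the two-sample Kolmogorov-Smirnov distance between the estimate before and after each batch. The function adaptive_sample and the batch, eps, and patience parameters are all illustrative assumptions.

```python
# Illustrative sketch of a distribution-based adaptive stopping rule.
# NOTE: an assumed formulation for exposition, not the paper's heuristic.
import random


def ks_distance(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between ECDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            i += 1
        else:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def adaptive_sample(measure, batch=30, eps=0.02, patience=3, max_samples=10_000):
    """Collect measurements in batches; stop once the ECDF has stabilized.

    measure: zero-argument callable returning one performance sample.
    Stops when the KS distance between the empirical distribution before
    and after a batch stays below `eps` for `patience` consecutive batches
    (all parameter values here are illustrative, not from the paper).
    """
    samples = [measure() for _ in range(batch)]
    stable = 0
    while len(samples) < max_samples:
        previous = list(samples)
        samples.extend(measure() for _ in range(batch))
        if ks_distance(previous, samples) < eps:
            stable += 1
            if stable >= patience:
                break  # distribution estimate has converged
        else:
            stable = 0
    return samples


if __name__ == "__main__":
    # Stand-in for a real benchmark: a noisy bimodal latency distribution.
    noisy = lambda: (random.gauss(1.0, 0.1) if random.random() < 0.8
                     else random.gauss(2.0, 0.3))
    data = adaptive_sample(noisy)
    print(f"stopped after {len(data)} samples")
```

In this sketch, batched checking keeps per-sample overhead low, and the patience counter guards against stopping on a transiently stable estimate; how the actual heuristic quantifies distributional convergence is left unspecified by the abstract.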

We evaluate a prototype of our proposal using a mix of synthetic and real benchmarks. For synthetic distributions, the approach closely matches the true distribution. For actual benchmarks, the heuristic is overly conservative for some applications and overly lax for others, especially those using GPUs. Even so, it still matches the overall shape of the distribution across benchmarks with very diverse distributions, suggesting that it is a viable approach for an adaptive stopping rule.