Evaluating Risk and Confidence in Performance Bounds of Configuration Sampling Strategies

doi:10.1145/3808147

DOI: 10.1145/3808147 ISSN: 2994-970X

Evaluating Risk and Confidence in Performance Bounds of Configuration Sampling Strategies

Kallistos Weis, Martina Maggio, Norbert Siegmund, Sven Apel

Modern software usually exposes a large number of configuration options to the user, giving rise to enormous configuration spaces in practice. Appropriate choices for these options dramatically influence the performance of the software (throughput, memory consumption, execution time, etc.). However, due to the sheer size of the configuration space, systematically identifying the worst- or best-performing configurations is computationally infeasible through exhaustive exploration. Instead, practitioners rely on budgeted sampling strategies, such as uniform random sampling or statistical recursive search, to explore the configuration space under fixed measurement budgets in an attempt to find the worst- or best-performing configuration. Even worse, a fundamental limitation of existing sampling strategies is the lack of quantifiable guarantees that the selected configuration truly reflects worst-case (or best-case) performance. In this paper, we define the basic concepts of posterior risk and posterior confidence and present a probabilistic framework to evaluate how well sampling strategies identify the worst- or best-performing configuration of a software system. We evaluate our framework by comparing five representative sampling strategies on seven real-world configurable software systems. We find that statistical recursive search yields consistently tighter best-case guarantees—higher posterior confidence and lower posterior risk—than the alternatives at the same budget. Our results demonstrate the applicability of our framework as a principled basis for reporting, comparing, and refining sampling strategies, and as a tool for practitioners to select strategies and budgets with quantified guarantees across systems and sample sizes.

Outline

Evaluating Risk and Confidence in Performance Bounds of Configuration Sampling Strategies

More from our Archive