R. Sambharya and B. Stellato, Data-Driven Performance Guarantees for Classical and Learned Optimizers, arXiv e-prints:2404.13831,2024. (Python code)
We introduce a data-driven approach to analyze the performance of continuous optimization algorithms using generalization guarantees from statistical learning theory. We study classical and learned optimizers to solve families of parametric optimization problems. We build generalization guarantees for classical optimizers, using a sample convergence bound, and for learned optimizers, using the Probably Approximately Correct (PAC)-Bayes framework. To train learned optimizers, we use a gradient-based algorithm to directly minimize the PAC-Bayes upper bound. Numerical experiments in signal processing, control, and meta-learning showcase the ability of our framework to provide strong generalization guarantees for both classical and learned optimizers given a fixed budget of iterations. For classical optimizers, our bounds are much tighter than those that worst-case guarantees provide. For learned optimizers, our bounds outperform the empirical outcomes observed in their non-learned counterparts.
Contributions:
- We provide probabilistic guarantees for classical optimizers in two steps: first, we run the optimizer for a given number of iterations on each problem instance in a given dataset; then, we apply a sample convergence bound by solving a one-dimensional convex optimization problem.
- We construct generalization bounds for learned optimizers using PAC-Bayes theory. In addition, we develop a framework to learn optimizers by directly minimizing the PAC-Bayes bounds using gradient-based methods. After training, we calibrate the PAC-Bayes bounds by sampling the weights of the learned optimizer, and subsequently running the optimizer for a fixed number of steps for each problem instance in a given dataset. Then, we compute the PAC-Bayes bounds via solving two one-dimensional convex optimization problems.
- We apply our method to compute guarantees for classical optimizers on several examples including image deblurring, robust Kalman filtering, and quadcopter control, significantly outperforming bounds from worst-case analyses. We also showcase our generalization guarantees for several learned optimizers: LISTA (Gregor and LeCun, 2010) and its variants (Liu et al., 2019; Wu et al., 2020), learned warm starts (Sambharya et al., 2023a), and model-agnostic meta-learning (Finn et al., 2017). Our generalization guarantees accurately represent the benefits of learning by outperforming the empirical performance observed in their non-learned counterparts.
沒有留言:
張貼留言