Model developers must carefully choose an appropriate large-scale stochastic optimization method to accelerate training and obtain a well-performing model. However, hundreds of such methods are currently available, each with its own set of tunable hyperparameters. To facilitate this choice, Schmidt et al. present a large-scale benchmark of optimizers in which they evaluate 15 popular optimization methods on 8 deep learning tasks with 4 different schedules. They find that optimizer performance depends heavily on the problem, although some optimizers perform more consistently than others.
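As a minimal illustration of this kind of comparison (not the authors' benchmark), the sketch below runs two common optimizers, plain SGD and a from-scratch Adam update, on a toy ill-conditioned quadratic. All hyperparameter values and the toy objective are illustrative assumptions; a real benchmark would sweep hyperparameters and average over many tasks and seeds.

```python
import numpy as np

# Toy problem: an ill-conditioned quadratic 0.5 * (w1^2 + 100 * w2^2).
def loss(w):
    return 0.5 * (w[0] ** 2 + 100.0 * w[1] ** 2)

def grad(w):
    return np.array([1.0, 100.0]) * w

def sgd(w, steps=200, lr=0.005):
    # Plain gradient descent with a fixed step size (assumed value).
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

def adam(w, steps=200, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    # Standard Adam update with bias-corrected moment estimates.
    m = np.zeros_like(w)
    v = np.zeros_like(w)
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g        # first-moment (mean) estimate
        v = b2 * v + (1 - b2) * g * g    # second-moment (variance) estimate
        m_hat = m / (1 - b1 ** t)        # bias correction
        v_hat = v / (1 - b2 ** t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

w0 = np.array([1.0, 1.0])
print("SGD final loss: ", loss(sgd(w0.copy())))
print("Adam final loss:", loss(adam(w0.copy())))
```

Even on this tiny problem the two methods reduce the loss at different rates, which hints at the benchmark's broader finding: which optimizer "wins" depends on the problem and on how its hyperparameters are tuned.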