In transfer learning, models pre-trained on standard datasets are partially re-trained and adapted to a target dataset or task. Most model developers assume that pre-trained models with higher original accuracy will perform better when adapted to downstream tasks. However, Salman et al. find that adversarially robust models, which may achieve lower accuracy on the original dataset, often perform better in a transfer learning context. Specifically, they find that ImageNet models trained with a robust optimization objective — one that explicitly encourages a model's invariance to small perturbations of its inputs — perform better on a set of 12 downstream classification tasks. They hypothesize that this improved performance is due to the superior feature representations learned under adversarial robustness.
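The robust optimization objective mentioned above has a min-max form: the model minimizes the loss on worst-case perturbations within a small budget, min_w max_{‖δ‖≤ε} L(w, x+δ, y). The sketch below illustrates this idea on a toy linear classifier with a single-step (FGSM-style) inner maximization; all function names and hyperparameters are illustrative assumptions, not Salman et al.'s actual training setup (which uses PGD on ImageNet-scale networks).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_grads(w, x, y):
    """Logistic loss plus gradients w.r.t. the weights and the inputs."""
    p = sigmoid(x @ w)
    loss = -(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12)).mean()
    grad_w = x.T @ (p - y) / len(y)          # gradient for the outer minimization
    grad_x = np.outer(p - y, w) / len(y)     # gradient used by the inner maximization
    return loss, grad_w, grad_x

def robust_step(w, x, y, eps=0.1, lr=0.5):
    """One adversarial-training step: perturb inputs to (approximately)
    maximize the loss, then descend on the perturbed batch."""
    _, _, gx = loss_and_grads(w, x, y)
    x_adv = x + eps * np.sign(gx)            # inner max: one FGSM step, budget eps
    _, gw, _ = loss_and_grads(w, x_adv, y)   # outer min: gradient on adversarial inputs
    return w - lr * gw

# Toy 2-class data: class centers at (-2,-2) and (+2,+2), unit-variance noise.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200).astype(float)
x = rng.normal(size=(200, 2)) + (2.0 * y[:, None] - 1.0) * 2.0

w = np.zeros(2)
for _ in range(100):
    w = robust_step(w, x, y)

acc = float(((sigmoid(x @ w) > 0.5) == y.astype(bool)).mean())
```

The key contrast with standard training is the construction of `x_adv`: gradients are taken on perturbed inputs, so the learned weights must be insensitive to small input shifts, which is the invariance property the text describes.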