Machine learning applications may break once operationalized, when the data distribution shifts away from the training distribution. To better understand machine learning models in production, Garg et al. propose Average Thresholded Confidence (ATC), a method that uses softmax confidence to predict how a model will perform on a target domain for which only unlabeled data is available, given labeled data from the source domain. ATC predicts target-domain accuracy as the fraction of target data points whose model confidence score exceeds a threshold; this threshold is learned from labeled source validation data, chosen so that the fraction of validation points scoring above it matches the model's validation accuracy. The authors evaluate this technique on several datasets and several types of distribution shift, noting that for some types of shift it does not work well.
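The thresholding step can be sketched as follows. This is a minimal illustration of the idea (not the authors' implementation), assuming the confidence score is the max softmax probability and using a quantile to place the threshold so that the fraction of source validation points above it equals the source validation accuracy:

```python
import numpy as np

def learn_threshold(source_scores, source_correct):
    """Pick a threshold on source validation confidences.

    The threshold is set at the (1 - accuracy) quantile of the scores,
    so the fraction of validation points scoring above it matches the
    model's source validation accuracy.
    """
    acc = source_correct.mean()
    return np.quantile(source_scores, 1.0 - acc)

def atc_predict(target_scores, threshold):
    """Predicted target accuracy: fraction of target points above threshold."""
    return float((target_scores >= threshold).mean())
```

In practice `source_scores` and `target_scores` would be the model's max softmax probabilities on the source validation set and the unlabeled target set, and `source_correct` a 0/1 array marking which validation predictions were correct.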