Standard approaches to training ML models can achieve high accuracy on average while still performing poorly on specific groups of the data (i.e., low "worst-group performance"). Existing methods that address this problem (e.g., group DRO) are expensive to deploy, since they require group annotations for every training example. To improve worst-group performance without this cost, Liu et al. introduce Just Train Twice (JTT), which first trains a standard empirical risk minimization (ERM) model for several epochs, then trains a second model that upweights the examples the first model misclassified. Unlike group DRO, this approach requires only a small number of group annotations (on a validation set, for hyperparameter tuning).
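
The two-stage procedure can be sketched as follows. This is a minimal illustration, not the paper's implementation: it uses scikit-learn logistic regression in place of the neural networks JTT is applied to, the helper name `jtt_train` and the upweighting factor `lam` (the paper's λ_up hyperparameter) are choices made here, and the first-stage model is trained to convergence rather than early-stopped after T epochs as in the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def jtt_train(X, y, lam=5.0, seed=0):
    """Two-stage JTT sketch.

    Stage 1: fit a standard ERM model and collect its "error set"
    (training examples it misclassifies). Stage 2: retrain from
    scratch with the error set upweighted by `lam`.
    """
    # Stage 1: plain ERM identification model.
    erm = LogisticRegression(random_state=seed).fit(X, y)
    error_set = erm.predict(X) != y  # boolean mask of misclassified examples

    # Stage 2: upweight the error set and train the final model.
    weights = np.where(error_set, lam, 1.0)
    final = LogisticRegression(random_state=seed).fit(X, y, sample_weight=weights)
    return erm, final

# Toy usage on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + 0.3 * rng.normal(size=200) > 0).astype(int)
erm_model, final_model = jtt_train(X, y, lam=5.0)
```

In practice, λ_up (here `lam`) and the number of first-stage epochs T are the hyperparameters tuned on the small group-annotated validation set.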