To rigorously evaluate datasets or train ML models, data practitioners often need to select a subset of the data that represents the full dataset. To make this task easier, Jacob Schreiber, Dominik Moritz, Vincent Warmerdam, and Will Fondrie have released apricot, a Python package with an API modeled on scikit-learn. Apricot applies submodular optimization to select a subset of examples that summarizes the dataset. Users can extend the package with their own functions and optimizers, and can accelerate computationally intensive parts using numba.
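To give a feel for what submodular subset selection does, here is a minimal pure-Python sketch of naive greedy selection under a facility-location objective. It is not apricot's code or API: the function names, the exponential similarity, and the toy two-cluster data are all assumptions chosen for illustration, and the library's own optimizers are presumably far more efficient.

```python
import math
import random


def similarity(a, b):
    """Similarity that decays with Euclidean distance (an illustrative choice)."""
    return math.exp(-math.dist(a, b))


def greedy_facility_location(points, k):
    """Greedily pick k indices that maximize sum_i max_{j in S} sim(i, j).

    Hypothetical helper for illustration; a naive O(k * n^2) loop.
    """
    n = len(points)
    sim = [[similarity(p, q) for q in points] for p in points]
    best = [0.0] * n  # best[i]: similarity of point i to its closest selected point
    selected, gains = [], []
    for _ in range(min(k, n)):
        top_gain, top_j = -1.0, None
        for j in range(n):
            if j in selected:
                continue
            # Marginal gain: how much adding j improves every point's coverage.
            gain = sum(max(sim[i][j] - best[i], 0.0) for i in range(n))
            if gain > top_gain:
                top_gain, top_j = gain, j
        selected.append(top_j)
        gains.append(top_gain)
        best = [max(best[i], sim[i][top_j]) for i in range(n)]
    return selected, gains


random.seed(0)
# Two well-separated, made-up clusters of 2-D points (indices 0-19 and 20-39).
points = [(random.gauss(0, 0.3), random.gauss(0, 0.3)) for _ in range(20)]
points += [(random.gauss(5, 0.3), random.gauss(5, 0.3)) for _ in range(20)]

selected, gains = greedy_facility_location(points, 4)
print("selected indices:", selected)
print("marginal gains:", [round(g, 3) for g in gains])
```

On data like this, the first two picks land in different clusters, and each later pick adds less than the one before it. That diminishing-returns behavior is the defining property of a submodular objective, and it is why a small selected subset can summarize a much larger dataset.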