Queries that use target deep neural networks (DNNs) to extract structured information from unstructured data are computationally expensive, and query-specific proxy models reduce this cost only at the price of large amounts of training data and a new training procedure for every query. Kang et al. develop a method for constructing a “task-agnostic semantic trainable index,” TASTI, which produces embeddings that can be reused across a wide range of queries, thereby obviating the need for query-specific proxies. TASTI is built on the observation that a target DNN induces a schema over the data, and that this schema can be used to answer many query types. Accordingly, TASTI generates a semantic embedding for each unstructured data record such that records with close embeddings have similar attributes in the induced schema. The authors prove that low training error on the TASTI embeddings guarantees downstream accuracy for a class of queries, and show that their index is 10x less expensive to construct than query-specific proxy models, which it outperforms by up to 24x at query time.
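The idea that close embeddings share schema attributes can be sketched with a small toy example. This is an illustrative sketch, not the authors' implementation: the Gaussian “embeddings,” the `expensive_target_dnn` stand-in, and the farthest-point selection of representatives are all assumptions for the demo (in the paper, the index is produced by a small trained embedding model, and the expensive target DNN is run only on a budgeted set of records, whose attributes are then propagated through embedding space).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for TASTI embeddings: two well-separated clusters,
# one per schema attribute value. (Assumption for the demo; a real
# index would come from a trained embedding model.)
emb = np.vstack([
    rng.normal(loc=-2.0, scale=0.3, size=(50, 8)),
    rng.normal(loc=+2.0, scale=0.3, size=(50, 8)),
])
true_attr = np.array([0] * 50 + [1] * 50)

def expensive_target_dnn(idx):
    """Hypothetical stand-in for the costly target DNN: returns the
    induced-schema attribute for one record. Called only on the
    small set of representative records."""
    return true_attr[idx]

def farthest_point_sample(x, k):
    """Pick k spread-out representatives (one simple coverage
    heuristic; an assumption, not the paper's exact selection)."""
    chosen = [0]
    d = np.linalg.norm(x - x[0], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(d))
        chosen.append(nxt)
        d = np.minimum(d, np.linalg.norm(x - x[nxt], axis=1))
    return np.array(chosen)

# 1) Select a small budget of representative records.
reps = farthest_point_sample(emb, 10)

# 2) Run the expensive target DNN only on those representatives.
rep_labels = np.array([expensive_target_dnn(i) for i in reps])

# 3) Propagate: each record inherits the attribute of its nearest
#    representative in embedding space.
dists = np.linalg.norm(emb[:, None, :] - emb[reps][None, :, :], axis=2)
pred_attr = rep_labels[np.argmin(dists, axis=1)]

accuracy = (pred_attr == true_attr).mean()
print(f"ran target DNN on {len(reps)}/{len(emb)} records; "
      f"propagated-attribute accuracy = {accuracy:.2f}")
```

Because the toy embeddings satisfy the paper's key property (nearby embeddings share attributes), only 10 of 100 records need the expensive model; the remaining attributes are recovered from the index alone, which is the source of the query-time savings the authors report.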