In a previous blog post, the Pinterest Engineering team presented Manas, their in-house search engine, which generates a set of candidate entities for search products. Here, they describe how they developed a more flexible embedding-based retrieval framework that makes it easier to integrate new approaches to approximate nearest neighbor search, such as HNSW (Hierarchical Navigable Small World graphs). They discuss how their system has enabled them to serve HNSW in real time and at scale. Their approach includes an HNSW index structured as a multi-layered sparse graph and smarter graph compaction strategies. The authors also introduce an online recall tool that they created to validate search quality.
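
To make the idea of validating recall concrete, here is a minimal sketch of how recall@k is commonly measured for an approximate nearest neighbor index: the approximate results are compared against an exact brute-force search over the same vectors. This is not the Pinterest tool itself. The function names (`exact_top_k`, `recall_at_k`), the toy corpus, and the hard-coded "approximate" result list are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def l2_distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_top_k(query: Vector, corpus: Dict[str, Vector], k: int) -> List[str]:
    """Brute-force ground truth: ids of the k vectors closest to the query."""
    ranked = sorted(corpus, key=lambda doc_id: l2_distance(query, corpus[doc_id]))
    return ranked[:k]


def recall_at_k(approx_ids: Sequence[str], exact_ids: Sequence[str]) -> float:
    """Fraction of the exact top-k that the approximate search also returned."""
    if not exact_ids:
        return 1.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


# Toy corpus of 2-D embeddings (hypothetical data).
corpus = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.0, 2.0), "d": (5.0, 5.0)}
query = (0.1, 0.1)

exact = exact_top_k(query, corpus, k=3)
# Stand-in for what an approximate index (e.g. HNSW) might return for the same query.
approx = ["a", "b", "d"]

recall = recall_at_k(approx, exact)
print(f"recall@3 = {recall:.3f}")
```

In a production setting the brute-force pass is typically run on a sample of live queries rather than on every request, since exact search over the full corpus is expensive.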