Introducing Exa 2.0

Breakthroughs in our AI research and engineering have enabled us to build both the fastest search API (<350ms) and the highest-quality search on the market. Product and technical deep dive below:
Exa's sole mission is to build a perfect search engine: one that always returns exactly the information you need, as fast as physically possible, through a seamless API. Exa 2.0 is a big step toward that goal.
To build Exa 2.0, we first needed to expand our index. We now serve tens of billions of webpages and refresh them every minute. Next, we pretrained and finetuned an embedding model for precise semantic search over that index. Exa 2.0 was trained for over a month on our 144x H200 cluster and uses new embedding architectures we've discovered over the past 6 months. Serving these embeddings at the lowest latency in the world required major updates to our in-house vector database, including new clustering algorithms, lexical compression, and assembly optimizations. All in Rust, of course :)
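To make the idea of clustered vector search concrete, here is a minimal sketch in Python. It is not our production code (that lives in Rust and is far more involved); it assumes a plain spherical k-means and cosine similarity, and the names `build_index`, `search`, and `n_probe` are hypothetical, chosen only for illustration. The point is that a query is scored against the vectors in a few nearby clusters instead of against the whole index.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)

def assign(vectors, centroids):
    """Put each vector id into the bucket of its most similar centroid."""
    buckets = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        best = max(range(len(centroids)), key=lambda c: dot(v, centroids[c]))
        buckets[best].append(i)
    return buckets

def build_index(raw_vectors, n_clusters, iters=10, seed=0):
    """Cluster unit-normalized vectors with a simple spherical k-means."""
    rng = random.Random(seed)
    vectors = [normalize(v) for v in raw_vectors]
    centroids = [list(v) for v in rng.sample(vectors, n_clusters)]
    dim = len(vectors[0])
    for _ in range(iters):
        buckets = assign(vectors, centroids)
        for c, members in enumerate(buckets):
            if members:  # keep the old centroid if a cluster ends up empty
                mean = [sum(vectors[i][d] for i in members) / len(members)
                        for d in range(dim)]
                centroids[c] = normalize(mean)
    # Final assignment so the buckets match the final centroids.
    return vectors, centroids, assign(vectors, centroids)

def search(index, query, k=3, n_probe=1):
    """Score only the vectors in the n_probe clusters closest to the query."""
    vectors, centroids, buckets = index
    q = normalize(query)
    probe = sorted(range(len(centroids)),
                   key=lambda c: dot(q, centroids[c]), reverse=True)[:n_probe]
    candidates = [i for c in probe for i in buckets[c]]
    scored = sorted(((dot(q, vectors[i]), i) for i in candidates), reverse=True)
    return [(i, score) for score, i in scored[:k]]
```

With `n_probe` equal to the number of clusters this reduces to exact brute-force search; lowering it trades a little recall for much less work per query, which is the basic lever any clustered index exposes.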
The first update is Exa Fast. Exa Fast now achieves <350ms end-to-end (e2e) P50 latency, 30% lower than the next fastest API. Our customers are using it to power particularly latency-sensitive AI use cases.
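For readers who want to check a P50 figure like this against their own workload, here is a small sketch of how such a number can be computed. It assumes you supply a zero-argument callable that performs one full request; `measure_latencies_ms` and `percentile` are hypothetical helper names, not part of any Exa SDK, and the nearest-rank percentile is just one common convention.

```python
import math
import statistics
import time

def measure_latencies_ms(send_request, n_requests):
    """Time n_requests sequential calls, end to end, in milliseconds."""
    samples = []
    for _ in range(n_requests):
        start = time.perf_counter()
        send_request()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples

def p50(samples):
    """P50 is the median: half of all requests finish at or below it."""
    return statistics.median(samples)

def percentile(samples, q):
    """Nearest-rank percentile for 0 < q <= 100, useful for tails like P95."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]
```

P50 describes the typical request and is insensitive to a few slow outliers, so it is worth looking at a tail percentile alongside it when latency really matters.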
Second is Exa Deep. Exa Deep is designed to find the highest-quality information possible. To do so, it works agentically: it searches, processes the results, then searches again. Exa Deep tops nearly every benchmark we throw at it.
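The search, process, search-again pattern can be sketched as a simple loop. This is an illustration of the general shape only, not how Exa Deep is implemented: `deep_search`, `search`, and `refine` are hypothetical names, and in a real system `refine` would typically be a model that reads the results so far and proposes a follow-up query.

```python
def deep_search(query, search, refine, max_rounds=3):
    """Iteratively search, inspect the results, and search again.

    search(q) returns a list of results for query string q.
    refine(original_query, results_so_far) returns a follow-up query,
    or None when the collected results look sufficient.
    """
    collected = []
    current = query
    for _ in range(max_rounds):
        found_new = False
        for result in search(current):
            if result not in collected:  # skip results we already have
                collected.append(result)
                found_new = True
        if not found_new:
            break  # this round added nothing, so stop early
        follow_up = refine(query, collected)
        if follow_up is None:
            break
        current = follow_up
    return collected
```

The `max_rounds` cap bounds cost and latency, which is the main trade-off against a single-shot search.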
Search is a very diverse problem space. Benchmarks like SimpleQA and FRAMES are helpful, but miss much of what matters for AI search. Here we show evals on some other benchmarks. We have many more internal ones that we'll open source soon.