Analyzing the AI Search Opportunity

Daniel Tunkelang
Apr 2, 2024

In the past several years, AI has disrupted every part of the technology industry. Search applications are no exception: it is not a question of whether to use AI for search, but where to incorporate it into the stack. In this post, I will try to perform a high-level opportunity analysis of where AI can improve search applications.

If it ain’t broke, don’t fix it.

Before we explore where AI can help search, let us consider what is already working well today using a traditional inverted index architecture. After all, while there is always room for incremental improvement, it is important not to break something that already works reasonably well.

The sweet spot for a traditional index architecture is one where the search application’s content and query representations are already aligned around a robust controlled vocabulary. This tends to be the case in domains where content is already categorized and annotated with highly structured data and searchers consistently use standardized terms that directly map to those annotations. There may still be a need for query understanding to perform query rewriting: segmenting the query into spans and matching those spans to entities. This approach might employ modern machine learning models, but it still works well with a traditional inverted index architecture.

Even when the alignment is close but not perfect, an inverted index tends to perform well, especially in combination with a query understanding pipeline that includes query categorization.
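The query rewriting step described above can be sketched concretely. Here is a minimal, hypothetical example of greedy longest-match segmentation against a controlled vocabulary; the vocabulary entries and annotation format are illustrative, not from any particular system.

```python
# Toy controlled vocabulary mapping spans to entity annotations.
# Entries and annotation syntax are hypothetical.
VOCAB = {
    "black t shirt": "product:tshirt|color:black",
    "t shirt": "product:tshirt",
    "black": "color:black",
    "nike": "brand:nike",
}

def segment(query):
    """Split a query into spans, preferring the longest vocabulary match."""
    tokens = query.lower().split()
    spans = []
    i = 0
    while i < len(tokens):
        match = None
        # Try the longest candidate span starting at token i first.
        for j in range(len(tokens), i, -1):
            span = " ".join(tokens[i:j])
            if span in VOCAB:
                match = (span, VOCAB[span])
                i = j
                break
        if match is None:
            match = (tokens[i], None)  # unmatched token
            i += 1
        spans.append(match)
    return spans
```

With a vocabulary like this, `segment("nike black t shirt")` maps the query to a brand entity plus a product/color entity, which an inverted index can then match against structured annotations.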

AI can address the vocabulary mismatch problem.

When there is less alignment between the search application’s content and query representations, AI can help. Specifically, AI can help address the vocabulary mismatch problem.

The vocabulary mismatch problem is exactly what it sounds like: the vocabulary used in the index does not always align with the vocabulary searchers use in their queries. The misalignment can hurt retrieval and relevance, and it can be particularly challenging for recall.

Search application developers have traditionally addressed the vocabulary mismatch problem by increasing recall through query expansion and query relaxation. While these methods do increase recall, they discard holistic query context, which often leads to a loss of precision.

AI can help preserve this holistic query context. Specifically, using AI to compute query similarity allows us to perform whole-query expansion, which is a more principled way to increase recall and is less vulnerable to the context-destroying precision losses of term-level expansion and relaxation.
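A minimal sketch of whole-query expansion: embed whole queries and expand a query with its nearest neighbors above a similarity threshold. The embeddings below are tiny hand-made vectors purely for illustration; a real system would use a trained text-embedding model over a query log.

```python
import math

# Toy query embeddings; in practice these come from an embedding model.
QUERY_EMBEDDINGS = {
    "couch": [0.9, 0.1, 0.0],
    "sofa": [0.88, 0.12, 0.02],
    "loveseat": [0.75, 0.2, 0.1],
    "coffee table": [0.1, 0.9, 0.05],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def expand_query(query, k=2, threshold=0.95):
    """Return the k most similar whole queries above a similarity threshold."""
    vec = QUERY_EMBEDDINGS[query]
    scored = [
        (other, cosine(vec, QUERY_EMBEDDINGS[other]))
        for other in QUERY_EMBEDDINGS
        if other != query
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [q for q, score in scored[:k] if score >= threshold]
```

Because similarity is computed over the entire query rather than individual terms, “couch” expands to “sofa” but not to unrelated furniture queries, preserving the holistic intent.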

AI can allow searchers to express their intent more naturally.

For decades, science fiction has led us to believe that we would interact with machines by talking to them. But when the internet brought us search applications, we found ourselves learning to compress our intents into short noun phrases so that the machines could understand us.

Today, AI offers us the opportunity to unlearn this shorthand. Instead of expecting searchers to learn to communicate in a machine-friendly language, we can use AI-powered search applications to understand searchers in their own language.

In practice, understanding natural language search queries may be as simple as AI-powered query rewriting, especially when the query expresses a simple, common intent. For example, a search application could rewrite “I’m looking for a cable to connect my laptop to my TV” as “hdmi cable”. Not all intents are this simple, but many are. And, if we are going to replace typing with voice as the main way to interact with search applications, we will want searchers to be able to express themselves more naturally.
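To make the rewriting idea concrete, here is a deliberately simplified sketch. A production system would likely use an LLM for this step; this toy version instead matches hypothetical intent patterns against the query’s words, which is enough to show the input/output contract of a query rewriter.

```python
# Hypothetical intent patterns: if all keywords appear in the query,
# rewrite it to the canonical keyword query.
INTENT_PATTERNS = {
    ("cable", "laptop", "tv"): "hdmi cable",
    ("cable", "phone", "charge"): "usb-c charging cable",
}

def rewrite(query):
    """Rewrite a natural-language query to a concise keyword query."""
    words = set(query.lower().replace(",", " ").split())
    for keywords, canonical in INTENT_PATTERNS.items():
        if all(k in words for k in keywords):
            return canonical
    return query  # fall back to the original query
```

The interesting design question is the fallback: when no intent matches, the rewriter should pass the query through unchanged rather than risk mangling an intent it does not understand.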

The other way that AI can help searchers express their intent more naturally is through iterative refinement. Search is often a journey, especially when searchers have complex needs. In today’s search applications, that journey starts with a query but can include a number of additional steps, such as restricting the results to a category, filtering by an aspect, sorting the results, or reformulating the query. Searchers often find these steps tedious and confusing. AI can allow searchers to provide feedback to the search application in natural language, enabling a truly conversational search experience.

AI can enable searchers to bring more signals to the search process.

When queries have a relatively low amount of signal, AI is overkill. For example, if all the search application knows is that I am looking for t-shirts, then the best it can do is to retrieve all the t-shirts in the index and rank them using query-independent factors like popularity, recency, and price.

But what if the search application has more context? Perhaps my previous queries during that search session reveal something useful about the color or style I have in mind — or that I am specifically looking for men’s t-shirts. My location might be helpful too, based on the locally preferred styles or the current weather. Seasonality might come into play too. Or perhaps the search application has signals that it can use for personalization, such as my profile content or my previous search behavior.

It is possible to incorporate all of these signals into a ranking model as factors. But the richness and variety of these signals can be challenging for traditional approaches, putting a burden on search application developers to invest in feature engineering. In contrast, AI-powered approaches are more forgiving when we throw a heterogeneous mix of signals at them. And AI also opens up opportunities around multimodal input, such as combining text with images, voice, or other non-textual signals.
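The hand-engineered baseline that AI would replace looks something like the sketch below: context signals folded into a ranking score as explicit, manually weighted factors. The signal names and weights are illustrative, not tuned; the point is how much per-signal engineering this approach demands compared to models that consume heterogeneous signals directly.

```python
def score(item, context):
    """Combine query-independent factors with hand-crafted context boosts."""
    s = 0.5 * item["popularity"] + 0.2 * item["recency"]
    # Session signals: boost items matching attributes inferred from
    # earlier queries in the session.
    if context.get("inferred_color") == item.get("color"):
        s += 0.3
    if context.get("department") == item.get("department"):
        s += 0.2
    # Personalization: boost brands the searcher has engaged with before.
    if item.get("brand") in context.get("preferred_brands", set()):
        s += 0.25
    return s

def rank(items, context):
    return sorted(items, key=lambda it: score(it, context), reverse=True)
```

Every new signal here means a new weight to tune and a new interaction to reason about by hand, which is exactly the feature-engineering burden that learned models can absorb.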

AI can identify recall gaps that make content hard to find.

So far, we have explored ways that AI can improve the search experience through better query understanding, retrieval, and ranking. But AI can also help us measure how well a search application is performing — specifically, how well it is making the indexed content retrievable.

Consider an entry in the search index. We can measure its retrievability by executing a set of search queries that should retrieve the entry and then counting how many of those queries actually retrieve it. For example, a black t-shirt should be retrievable by queries like “black tshirt”, “black tshirts”, “black t shirt”, “tshirts black”, etc.

This strategy is not as simple as it sounds. For a large search index, measuring every entry’s retrievability is prohibitively expensive. We can address this concern by taking a representative sample. A bigger challenge is obtaining a set of search queries that we expect to retrieve a given entry.

To do so, we treat query generation as a search problem, indexing our query log and then retrieving the most relevant queries for a result from that log. We use AI to perform embedding-based retrieval of queries.
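The measurement itself is straightforward once we have the queries. Here is a minimal sketch of computing retrievability as the fraction of expected queries that actually retrieve an entry; the `search` function is a naive token-matching stand-in for a real retrieval system, and the index contents are hypothetical.

```python
def search(index, query):
    """Return ids of entries containing every query token (toy matcher)."""
    tokens = query.lower().split()
    return {
        item_id
        for item_id, text in index.items()
        if all(t in text.lower().split() for t in tokens)
    }

def retrievability(index, item_id, queries):
    """Fraction of expected queries that actually retrieve the entry."""
    hits = sum(1 for q in queries if item_id in search(index, q))
    return hits / len(queries)
```

Running this over the black t-shirt example would immediately surface a recall gap: queries using the token “tshirt” fail against an entry annotated only with “t shirt”, and that gap is exactly what the measurement is designed to expose.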

Summary: AI-powered search is a great opportunity. But be careful.

It should be clear from the above that AI offers a variety of opportunities to improve search applications, and even to improve how we measure the performance of search applications. But we have to be careful not to harm what is working already. LLMs and RAG are great, but don’t throw away your inverted index. At least not yet.