Search developers tend to focus most of their efforts on the first page of results. As a result, they prioritize investment in ranking models, with the goal of improving quality and business metrics, such as relevance and conversion.

Precision and Recall

In information retrieval terms, this focus on the first page corresponds to an emphasis on precision, the fraction of results that are relevant. To be more precise — no pun intended — it corresponds to an emphasis on position-biased precision measures like discounted cumulative gain (DCG).
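
To make the position bias concrete, here is a minimal sketch of DCG in Python. It assumes the common formulation in which the gain at rank i is divided by log2(i + 1). The article does not say which variant it has in mind, and the relevance grades below are made up for illustration.

```python
import math

def dcg(gains, k=None):
    """Discounted cumulative gain over a ranked list of graded relevance gains.

    Assumes the common log2 discount; other variants exist.
    """
    if k is not None:
        gains = gains[:k]
    # The result at 1-based position i is discounted by log2(i + 1),
    # so relevant results near the top contribute the most.
    return sum(g / math.log2(i + 1) for i, g in enumerate(gains, start=1))

# Hypothetical relevance grades for the same four results in two orderings.
ranked_a = [3, 2, 0, 1]  # best results first
ranked_b = [0, 1, 2, 3]  # best results last

print(dcg(ranked_a) > dcg(ranked_b))  # True: same results, better ordering
```

Both orderings contain exactly the same results, so a position-blind measure would score them identically. DCG rewards the ordering that puts the best results first.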

But precision isn’t the only measure of search quality. There’s also recall, which measures the fraction of relevant results that are actually retrieved. Recall is about “the whole truth”, while precision is about “nothing but the truth”.
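
The two definitions can be sketched in a few lines of Python. The document IDs and the relevance judgments are hypothetical. In practice the full set of relevant documents is rarely known, which is part of why recall is harder to measure than precision.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    # Precision: what fraction of what we returned is relevant?
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: what fraction of the relevant documents did we return?
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical example: 4 results returned, 5 relevant documents exist.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d3", "d5", "d6", "d7"]

print(precision_recall(retrieved, relevant))  # (0.5, 0.4)
```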

There’s a tradeoff between precision and recall: efforts to improve one almost always come at the expense of the other. But search developers tend to invest less in recall than in precision, and their investments in recall are often crude. That’s a shame, since recall dramatically affects the search experience.

When Recall Really Matters

There are three search scenarios where recall is especially important:

  • Searches that return no results or only a few results. For these searches, even a small increase in recall can have a critical impact. When the number of results is low, the expected benefit of increasing recall tends to outweigh the expected cost of decreasing precision.

Recall affects more than just the search results. It also affects aggregates, like the total number of results and counts for facet values. These aggregates, which are especially useful for broad queries, can be sensitive to the precision-recall tradeoff for the entire result set.
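
As a rough illustration of that sensitivity, the sketch below computes facet counts over two hypothetical result sets for the same query: one from strict matching and one from looser matching that retrieves more. The field names and documents are invented for the example.

```python
from collections import Counter

def facet_counts(results, field):
    """Count how many results fall under each value of a facet field."""
    return Counter(r[field] for r in results)

# Hypothetical result sets for one query under two retrieval settings.
strict = [
    {"id": 1, "category": "glassware"},
    {"id": 2, "category": "glassware"},
]
loose = strict + [
    {"id": 3, "category": "glassware"},  # relevant result the strict match missed
    {"id": 4, "category": "eyewear"},    # irrelevant result let in by looser matching
]

print(len(strict), dict(facet_counts(strict, "category")))
print(len(loose), dict(facet_counts(loose, "category")))
```

The looser setting recovers a missed relevant result, but it also adds a facet value that does not belong. Unlike ranking, the counts weigh every retrieved result equally, no matter how far down the list it sits.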

Improving Recall

There are three main ways to improve recall.

  • Query expansion: a reductionist approach that expands words or phrases using a dictionary. This approach is simple, but it does not respect context (e.g., wine glasses -> wine eyeglasses). It works best in conjunction with an approach that ensures precision, such as query categorization. It’s best to keep the dictionary size manageable, as well as to avoid conflating spelling and stemming word variations with semantic synonyms.
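
A minimal sketch of dictionary-based query expansion, assuming a small hand-built synonym dictionary and a simple boolean query syntax (both hypothetical, not tied to any particular search engine):

```python
# Hypothetical synonym dictionary, kept deliberately small.
SYNONYMS = {
    "glasses": ["eyeglasses"],
    "sofa": ["couch"],
}

def expand_query(query, synonyms=SYNONYMS):
    """Turn each token into an OR-group of the token plus its dictionary entries."""
    return [[tok] + synonyms.get(tok, []) for tok in query.lower().split()]

def to_boolean(groups):
    """Render OR-groups as a boolean query string, ANDing the groups together."""
    return " AND ".join(
        "(" + " OR ".join(g) + ")" if len(g) > 1 else g[0] for g in groups
    )

print(to_boolean(expand_query("red sofa")))      # red AND (sofa OR couch)
print(to_boolean(expand_query("wine glasses")))  # wine AND (glasses OR eyeglasses)
```

The second query shows the context problem described above: the dictionary expands "glasses" without regard for the word "wine" next to it, so the expanded query can also match eyewear. That is why pairing expansion with a precision safeguard such as query categorization helps.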

Summary

Precision is important for the happy path where searchers find great results on the first page using the default ranking. But recall matters too, and not just for queries that would otherwise return few results. Recall is critical for non-default sorts, as well as for computing useful aggregates like facet counts. There are at least three ways to improve recall: dictionary-based query expansion, query relaxation, and whole-query expansion. Search developers should invest in all three of these approaches, and not just in better ranking.
