Thoughts on Search Result Diversity

Regular readers know that search is about much more than ranking. If you’re new here, you might want to read this post about ranking vs. relevance.

But a topic I’ve neglected is search result diversity. I’ll remedy that in this post.

Search result diversity is not for ambiguous queries.

Search result diversity is sometimes — but not always! — useful for broad queries, but never for ambiguous ones. The best way to address an ambiguous query is through a clarification dialogue that establishes the searcher’s intent. A good query — that is, a query that the search engine can robustly map to an unambiguous intent — is a prerequisite for good results.

So search result diversity is a way to address some, but not all, broad queries. We’ll explore which broad queries benefit from result diversity, which aspects to diversify, and how to trade off result diversity against result desirability.

Sometimes query refinements are better than result diversity.

Consider the query “shoes”. It’s unambiguous, at least for practical purposes. But most people searching for shoes are looking specifically for men’s shoes, women’s shoes, or children’s shoes. They aren’t interested in seeing a diverse result set that jumbles all these kinds of shoes; rather, they want to see shoes that they (or the intended recipient) can wear.

For this and similar queries, search result diversity is not the right approach. Instead, searchers would benefit from query refinements that allow them to narrow their results to men’s shoes, women’s shoes, etc.

Result diversity works best for aspects that aren’t constraints.

There’s nothing wrong with offering faceted search refinements. But many searchers don’t see the choice of style, color, or material as a hard constraint. They may have preferences, but they want to see all the available options.

This is where search result diversity shines. Searchers would like to see a diverse selection of relevant results that showcase the variety across a few aspects (though not too many!). Some searchers may want to use faceted navigation to treat one or more aspects as constraints. But search result diversity doesn’t force searchers to treat aspects as constraints.

There’s a trade-off between result desirability and result diversity.

At the same time, the point of search result diversity is to serve a variety of preferences — to reflect not only the differences among searchers, but also the desire of an individual searcher to have diverse options. Even if brown loafers are the most desirable color-style combination, we wouldn’t want to show a whole page of them to everyone who searches for “men’s shoes”.

Instead we want to balance result desirability against result diversity. But how do we achieve an optimal balance between these two competing objectives?

Use a framework to parameterize the balance between objectives.

At one extreme, all that matters is result desirability. If diversity doesn’t matter at all, then the search engine can simply return relevant results in descending order of their desirability scores.

At the other extreme, diversity completely dominates desirability — that is, the most important concern is achieving a target distribution of aspect values, regardless of the desirability of any particular result. In that case, search should return an appropriately stratified sample of results.

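Note that the target isn’t necessarily a uniform distribution: some colors or styles may receive larger allocations than others. The target distribution could be based on historical searcher behavior, such as purchases. Most practical settings fall between these two extremes, so it helps to express the balance as a single tunable weight on diversity relative to desirability.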
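To make the diversity-dominated extreme concrete, here is a minimal sketch of deriving a target distribution from historical purchases and turning it into a stratified allocation of result slots. The function names, the sample purchase data, and the use of the largest-remainder method for rounding are all assumptions made for illustration, not something this post prescribes.

```python
from collections import Counter


def target_distribution(purchases):
    """Normalize historical counts per aspect value (e.g., color) into shares."""
    counts = Counter(purchases)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def allocate_slots(target, k):
    """Split k result slots across aspect values (largest-remainder rounding)."""
    quotas = {value: share * k for value, share in target.items()}
    slots = {value: int(quota) for value, quota in quotas.items()}
    leftover = k - sum(slots.values())
    # Hand the remaining slots to the values with the largest fractional parts.
    by_remainder = sorted(quotas, key=lambda v: quotas[v] - slots[v], reverse=True)
    for value in by_remainder[:leftover]:
        slots[value] += 1
    return slots


# Hypothetical purchase history for the query "men's shoes", by color.
purchases = ["brown"] * 5 + ["black"] * 3 + ["white"] * 2
target = target_distribution(purchases)
slots = allocate_slots(target, 10)
print(slots)
```

Filling each allocation with the most desirable results for that aspect value would then give a stratified first page. A non-uniform target like this one keeps brown shoes prominent without letting them take over the page.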

Use a greedy algorithm to manage the trade-off.

Achieving an optimal trade-off is computationally expensive, since it involves exploring a combinatorial explosion of possible orderings. A practical alternative is a greedy approach. Start with the results ranked by desirability, then rerank them one position at a time, choosing each result so that the top-ranked results diverge less from the target distribution.
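One way to sketch such a greedy reranker is shown below. It is an illustration under stated assumptions rather than a definitive implementation: the weight `lam`, the L1 distance used as the divergence measure, the tuple layout of each result, and the sample data are all hypothetical choices. At each position, it picks the remaining result with the best blend of desirability and closeness of the resulting prefix to the target distribution.

```python
def l1_divergence(counts, n, target):
    """L1 distance between the aspect shares of the top n results and the target."""
    values = set(target) | set(counts)
    return sum(abs(counts.get(v, 0) / n - target.get(v, 0.0)) for v in values)


def greedy_rerank(results, target, lam=0.5):
    """Rerank (item_id, aspect_value, desirability) tuples one position at a time.

    lam=0 keeps the pure desirability order; larger values of lam weight
    closeness to the target distribution more heavily (assumed weighting scheme).
    """
    remaining = sorted(results, key=lambda r: r[2], reverse=True)
    ranked, counts = [], {}
    while remaining:
        n = len(ranked) + 1

        def score(result):
            # Divergence of the prefix if this result were placed next.
            trial = dict(counts)
            trial[result[1]] = trial.get(result[1], 0) + 1
            return (1 - lam) * result[2] - lam * l1_divergence(trial, n, target)

        # Ties go to the more desirable result, since remaining is sorted.
        best = max(remaining, key=score)
        remaining.remove(best)
        ranked.append(best)
        counts[best[1]] = counts.get(best[1], 0) + 1
    return ranked


# Hypothetical results for "men's shoes": (id, color, desirability in [0, 1]).
results = [
    ("a", "brown", 0.9), ("b", "brown", 0.8), ("c", "brown", 0.7),
    ("d", "black", 0.6), ("e", "black", 0.5), ("f", "white", 0.4),
]
target = {"brown": 0.5, "black": 0.3, "white": 0.2}
reranked = greedy_rerank(results, target, lam=0.5)
print([item_id for item_id, _, _ in reranked])
```

With this data, the desirability order would open with three brown shoes, while the reranked list promotes a black shoe to the second position. The sketch rescans every remaining result at every position, so it is quadratic in the number of results; in practice it would typically be applied only to the top few dozen.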

To sum it up, be thoughtful about search result diversity.

Search result diversity is a deceptively subtle topic. It doesn’t address ambiguous search queries. Nor is it helpful for aspects that represent searcher constraints, which are better served by query refinements. It’s useful for aspects that don’t represent constraints, but it still requires managing a trade-off with result desirability, as well as managing computational resources.

By all means, implement search result diversity when it’s appropriate. It can vastly improve the search experience. Just be thoughtful about it.
