Thoughts on Search Result Diversity

Search result diversity is not for ambiguous queries.

Let’s start by clearing up a common misconception: search result diversity is not the right way to address query ambiguity. Ambiguous queries are different from broad queries: ambiguous queries have unclear intent (e.g., does “mixer” refer to a kitchen appliance or an audio component?), while broad queries are unambiguous but underspecified (e.g., “shoes”).

Sometimes query refinements are better than result diversity.

When is search result diversity appropriate? We’ve already explained that it’s not good for ambiguous queries, since a result set should not try to hedge between unclear, mutually exclusive intents. But some broad queries, even though they are unambiguous, create a similar challenge.

Result diversity works best for aspects that aren’t constraints.

Let’s turn to the query “men’s shoes”. Men’s shoes come in many styles and colors. Should we present these as query refinements?

There’s a trade-off between result desirability and result diversity.

We’ve established that, for men’s shoes, result diversity is a great way to showcase the variety of styles and colors. But should we try to show all of these styles and colors on the first page of search results? Perhaps a few searchers want to buy lime green moccasins, but it’s likely that far more of them want to buy brown loafers. A completely random distribution of styles and colors would be unlikely to serve the majority of searchers well.

Use a framework to parameterize the balance between objectives.

Unfortunately, there’s no single answer on how to achieve an optimal balance. But there is a framework that you can use to parameterize it.

Use a greedy algorithm to manage the trade-off.

Between the extremes of ranking purely for desirability and ranking purely for diversity, we have to trade one off against the other. We can model this trade-off as a convex combination of a desirability score (e.g., a position-weighted average of the desirability of the ranked results) and a diversity score. We can compute the diversity score by penalizing the divergence (e.g., the Kullback-Leibler divergence) of the distribution of aspect values on the first page of results from the target distribution.
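To make the trade-off concrete, here is a minimal sketch of a greedy re-ranker in Python. It is an illustration under stated assumptions, not a definitive implementation. The function names, the mixing parameter `lam`, the log-based position discount, the sample shoes, and the target distribution are all hypothetical choices made for this example. The greedy loop builds the first page one slot at a time and is not guaranteed to find the best possible page.

```python
import math
from collections import Counter

def desirability_score(ranked):
    """Position-weighted average of desirability.
    The 1 / log2(position + 2) discount is an assumed weighting."""
    weights = [1.0 / math.log2(pos + 2) for pos in range(len(ranked))]
    total = sum(w * item["desirability"] for w, item in zip(weights, ranked))
    return total / sum(weights)

def kl_from_target(ranked, target, eps=1e-9):
    """KL divergence of the observed aspect distribution from the target.
    eps guards against aspects the target assigns zero probability."""
    counts = Counter(item["aspect"] for item in ranked)
    n = len(ranked)
    return sum(
        (c / n) * math.log((c / n) / max(target.get(aspect, 0.0), eps))
        for aspect, c in counts.items()
    )

def objective(ranked, target, lam):
    """Convex combination: lam weights desirability, (1 - lam) weights
    diversity, where diversity is the negated KL divergence."""
    diversity = -kl_from_target(ranked, target)
    return lam * desirability_score(ranked) + (1.0 - lam) * diversity

def greedy_rerank(candidates, target, lam, k):
    """Fill k slots, each time adding the candidate that maximizes
    the objective of the list built so far."""
    ranked, remaining = [], list(candidates)
    while remaining and len(ranked) < k:
        best = max(remaining, key=lambda c: objective(ranked + [c], target, lam))
        ranked.append(best)
        remaining.remove(best)
    return ranked

# Hypothetical candidates for "men's shoes", with color as the aspect.
candidates = [
    {"id": "a", "desirability": 0.9, "aspect": "brown"},
    {"id": "b", "desirability": 0.8, "aspect": "brown"},
    {"id": "c", "desirability": 0.7, "aspect": "brown"},
    {"id": "d", "desirability": 0.6, "aspect": "black"},
    {"id": "e", "desirability": 0.5, "aspect": "green"},
]
# Hypothetical target distribution of colors on the first page.
target = {"brown": 0.5, "black": 0.25, "green": 0.25}

for lam in (1.0, 0.5, 0.0):
    page = greedy_rerank(candidates, target, lam=lam, k=4)
    print(lam, [item["id"] for item in page])
```

Setting `lam` to 1.0 reduces the loop to sorting by desirability, while setting it to 0.0 ignores desirability and only matches the target distribution. Values in between are what you would tune, for example through experiments.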
