Shopping is Hard, Let’s go Searching!

Daniel Tunkelang
4 min read · Jul 23, 2018

In 1992, Mattel released the infamous Teen Talk Barbie, best known (and rightfully mocked) for saying “Math is hard, let’s go shopping!” Actually, that’s a paraphrase, but you get the idea. Fortunately, Mattel has evolved a bit: now Barbie is teaching kids how to code.

Regardless, I feel she got it all wrong, at least when it comes to shopping online. Shopping is way too hard, at least if you want to find what you’re looking for. Fortunately, improving search for an ecommerce site isn’t so hard. It just requires some computer science and software engineering.

If you’re struggling with search for ecommerce, here are three tips:

1. Invest in associating structured data with your products.

The search box for an ecommerce site may look like Google’s search box for the web, but the similarities end there. Web pages are loosely structured text documents, while products have a rich structure that should be deeply connected to the search experience. The more structured data you can associate with products, the better you’ll be able to deliver relevant results to shoppers.

How do you associate structured data with products?

Sometimes you already have it lying around, and it’s just a matter of making it available through database joins. You may need to do a bit of data cleansing with regular expressions or other simple rules.
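As a rough illustration of rule-based cleansing, here is a minimal sketch that pulls a screen size out of a free-text product title. The attribute, the pattern, and the function name `extract_screen_size` are all assumptions made up for this example; real catalogs need rules tuned to their own data, and this pattern will misfire on titles like “2 in 1 laptop”.

```python
import re
from typing import Optional

# Hypothetical pattern: a number followed by an inch marker,
# e.g. '15.6"', "15.6 inch", "14-inch", "14in".
SIZE_PATTERN = re.compile(
    r'(\d+(?:\.\d+)?)[\s-]*(?:inches|inch|in\b|")',
    re.IGNORECASE,
)

def extract_screen_size(title: str) -> Optional[float]:
    """Return the screen size in inches if the title mentions one, else None."""
    match = SIZE_PATTERN.search(title)
    return float(match.group(1)) if match else None
```

Once extracted, the value can be stored as a structured attribute next to the product, where the search engine can filter and rank on it.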

Other times, you can mine structured data from your unstructured data. You can train a machine learning model by representing the product descriptions — and even the product images — as feature vectors and obtaining labels for a sample of your catalog from human judges. And you may not need to reinvent the wheel if someone has already developed a suitable taxonomy or ontology.

Finally, you can infer structured data from behavioral data. For example, you can associate products with the search keywords that shoppers use to find and buy them. By analyzing these keywords, you can infer structured data for popular products — and then extrapolate the analysis to similar, less popular products. But be aware that this process suffers from presentation bias: shoppers can only find the products that your search engine presents to them.
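To make the behavioral approach concrete, here is a small sketch that infers a color attribute from the queries that led to purchases. The log format (pairs of query and product ID), the `COLORS` vocabulary, and the `min_support` threshold are assumptions for illustration, not a prescribed design.

```python
from collections import Counter, defaultdict

# Hypothetical vocabulary of attribute values to look for in queries.
COLORS = {"red", "blue", "black", "white"}

def infer_colors(purchase_log, min_support=2):
    """Map product_id -> the color term most often used in queries
    that ended in a purchase of that product.

    purchase_log: iterable of (query, product_id) pairs.
    min_support: minimum number of supporting queries before we
                 trust the inferred value (an arbitrary default).
    """
    counts = defaultdict(Counter)
    for query, product_id in purchase_log:
        for token in query.lower().split():
            if token in COLORS:
                counts[product_id][token] += 1

    inferred = {}
    for product_id, color_counts in counts.items():
        color, support = color_counts.most_common(1)[0]
        if support >= min_support:
            inferred[product_id] = color
    return inferred
```

Products with little traffic never reach the support threshold, which is where the extrapolation to similar products comes in. And because the log only contains products the search engine already surfaced, the presentation bias mentioned above applies to this sketch too.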

2. Invest in query understanding.

Search engine developers often equate search relevance to result ranking and thus invest most of their efforts into improving ranking. Query understanding is about what happens before the search engine scores and ranks results: it’s the process of establishing the searcher’s intent. Investments in query understanding, especially if it has been neglected, often provide larger and faster returns than efforts to improve result ranking.

The simplest form of query understanding for ecommerce is classifying (or scoping) queries into product categories. Using explicit human judgments or implicit judgments from behavioral data (i.e., clicks and conversions), you can train a machine learning model that classifies search queries into product categories. When the model classifies a query into a product category with high confidence, you can automatically restrict the search to that category.
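Here is a minimal sketch of the confidence-gated scoping idea. Instead of a trained machine learning model, it uses plain click frequencies as a stand-in, so treat it as an assumption-laden baseline. The log format, the function names, and the thresholds `min_confidence` and `min_clicks` are hypothetical.

```python
from collections import Counter, defaultdict

def build_category_counts(click_log):
    """Count, per normalized query, how often each category was clicked.

    click_log: iterable of (query, category) pairs.
    """
    counts = defaultdict(Counter)
    for query, category in click_log:
        counts[query.strip().lower()][category] += 1
    return counts

def scope_query(query, counts, min_confidence=0.8, min_clicks=5):
    """Return the category to restrict the search to, or None to search everything."""
    category_counts = counts.get(query.strip().lower())
    if not category_counts:
        return None  # unseen query: do not restrict
    total = sum(category_counts.values())
    category, clicks = category_counts.most_common(1)[0]
    if total >= min_clicks and clicks / total >= min_confidence:
        return category
    return None  # ambiguous or too little evidence
```

The important design choice is the fallback: when confidence is low, the search stays unrestricted, so a wrong classification cannot hide relevant products.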

More sophisticated query understanding involves segmenting the query into entities. You can then improve relevance by matching each part of the query to the appropriate element in your structured data. Understanding the query as a collection of related entities also allows you to implement smarter query expansion and relaxation, increasing recall and avoiding an empty results page when nothing in your catalog matches the search query exactly. Query understanding models rely on machine learning, and again you can build the models yourself or leverage those built by others.
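As a simplified illustration of segmentation and relaxation, the sketch below tags a query with a greedy longest-match lookup against a small lexicon, then drops entity types one at a time. A dictionary lookup is a stand-in for the learned models described above; the lexicon contents, the entity types, and the drop order are assumptions made up for this example.

```python
# Hypothetical lexicon mapping known phrases to entity types.
ENTITY_LEXICON = {
    "nike": "brand",
    "running shoes": "category",
    "red": "color",
}

def segment_query(query, lexicon=ENTITY_LEXICON, max_len=3):
    """Split a query into (phrase, entity_type) segments, preferring longer matches."""
    tokens = query.lower().split()
    segments = []
    i = 0
    while i < len(tokens):
        # Try the longest phrase starting at token i first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in lexicon:
                segments.append((phrase, lexicon[phrase]))
                i += n
                break
        else:
            # No known phrase starts here: keep the token as unknown.
            segments.append((tokens[i], "unknown"))
            i += 1
    return segments

def relaxations(segments, drop_order=("unknown", "color", "brand")):
    """Yield progressively relaxed queries, dropping one entity type at a time."""
    current = list(segments)
    for entity_type in drop_order:
        remaining = [s for s in current if s[1] != entity_type]
        # Only yield if something was dropped and something is left to search on.
        if remaining and len(remaining) < len(current):
            current = remaining
            yield current
```

For the query "red nike running shoes", the segments are a color, a brand, and a category. If the exact combination returns nothing, the search can retry without the color, then without the brand, rather than showing an empty results page.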

3. Pursue quick wins.

It’s easy to dismiss incremental improvements as boring or unimaginative. But moonshot visions don’t materialize overnight. Most innovation happens one step at a time, through a series of controlled experiments. It’s important to dream big but execute incrementally. For search in particular, it’s best to pursue improvements through targeted A/B tests.

What do quick wins look like? That depends on what you’re already good at. But here are a few ideas:

  • Analyze your 1,000 most frequent queries and see which ones are underperforming in terms of clicks or conversions. Figure out why, and see if there are patterns. To the extent that you have resources, develop curated, optimized landing pages for your most frequent queries.
  • Look at spelling correction. Estimates of the fraction of misspelled search queries vary, but a variety of studies place it between 10% and 15%. You’ll never have perfect spelling correction, but it’s an area where improvement can mean the difference between a shopper finding something vs. nothing.
  • Invest in autocomplete. Autocomplete isn’t just a way to help searchers type less. It also guides searchers, helping them express their intent through better queries. Make sure that autocomplete is suggesting queries that are not only popular, but also have good performance.
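To illustrate the last idea, here is a minimal sketch of autocomplete that ranks suggestions by popularity but filters out queries that perform poorly. The shape of `query_stats`, the prefix matching, and the 2% conversion threshold are assumptions for the example, not recommended values.

```python
def suggest(prefix, query_stats, min_conversion_rate=0.02, limit=5):
    """Suggest popular queries that start with the prefix and convert well enough.

    query_stats: dict mapping query -> (searches, conversions).
    """
    prefix = prefix.strip().lower()
    candidates = []
    for query, (searches, conversions) in query_stats.items():
        if searches == 0 or not query.startswith(prefix):
            continue
        if conversions / searches < min_conversion_rate:
            continue  # popular but underperforming: do not steer shoppers here
        candidates.append((searches, query))
    # Most searched first; break ties alphabetically for stable output.
    candidates.sort(key=lambda item: (-item[0], item[1]))
    return [query for _, query in candidates[:limit]]
```

With this filter, a heavily searched query that rarely converts is left out of the suggestions, even though a purely popularity-based ranking would put it first.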

Summary

Shopping online shouldn’t be so hard. But we all know that there’s room for improvement in ecommerce search. If you’re running an ecommerce site, I encourage you to invest in structured data and query understanding, and to seek out quick wins. Your shoppers and shareholders will thank you.
