Boosting product discovery with semantic search
Mar 13, 2020 • 14 min read
Search engines have been with us for several decades already, and we are using them daily in our digital lives, so why are we still talking about search? Isn’t search a solved problem already?
Search would probably have been solved long ago if human language were a well-defined, thought-through, and strict construct. If it were, our daily communication would be fast, clear, and precise, but also extremely colorless and dull. So, unfortunately for search engineers and fortunately for everybody else, human language is an extremely complicated system riddled with redundancies and ambiguities. For example, consider this:
One is a dress shirt, and the other is a shirt dress.
Same words, completely different meanings just by changing the order of words!
People quickly navigate this complexity, relying on the context of interaction and vast background knowledge about the world. They expect modern search engines to be as smart in understanding their requests, to base their understanding not just on the accidental matching of words, but on their meaning, also known as semantics. In other words, people want a semantic search.
In this blog post, we discuss what language phenomena semantic search has to understand and how to approach semantic search with some of the powerful information retrieval techniques.
What semantic search brings to e-commerce
Semantic search extends the classical boolean retrieval model, which served as the baseline for full text search engine implementations. It addresses many limitations of that term-centric approach with better support for the peculiarities of natural language.
In general, semantic search aims to match documents that correspond to the meaning and intent of the query, not just its words like the full text search approach used to do. Because of this, semantic search departs from traditional ways of matching and counting words and focuses on matching concepts. Its advanced techniques allow it to improve precision compared to full text search without worsening its recall.
Originally, full text search was designed to work with text documents, and its user interface allowed only one way of navigating search results – scrolling from top to bottom. Under these circumstances, a search engine could sacrifice precision in favor of recall. Suboptimal precision can be mitigated by advanced ranking, which boosts the most relevant documents to the top. This approach was generally acceptable because the majority of users rarely went through enough pages to see irrelevant results. This is what we still see in web search engine interfaces (even though their internal implementation has moved far away from classical full text search) – millions of matched results, but we rarely go beyond the first few pages.
As for e-commerce search, it contains additional elements of customer interaction, which can easily reveal irrelevant products and spoil the experience, even if the relevance-based ranking is doing a very good job:
- Alternative sortings. E-commerce sites allow customers to re-sort products by price, newness, sales or ratings, in other words – not by search relevance. If the original result set contained irrelevant products buried by the ranking formula or business rules, there is always a risk that these products will pop up when an alternative sorting is used. A typical example is searching for a product type, sorting by price, and getting other product types (accessories for the desired product type) in the top positions.
- Facets. When irrelevant products are buried by well-tuned relevance ranking, they still contribute their values to facet filters. Because of those products, the result can contain irrelevant facet values or even whole irrelevant facets, which will confuse customers.
There can be two kinds of issues, with different degrees of visibility and customer frustration:
- The screenshots below contain examples of obviously irrelevant values: pillows don’t have bra sizes and shirts can’t have waist or shoe sizes. Such issues are easy to notice, and arguably they contribute to some frustration on the customer side.
- However, more subtle examples can result in much higher frustration. Sometimes irrelevant products in the result set produce relevant-looking facet values. In this case, filtering by those values leads to irrelevant results. Imagine filtering pillows by color but getting bras back, or filtering shirts by brand but getting pants back.
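Both symptoms are easy to reproduce on a toy result set. The sketch below is only an illustration: the products, scores, and field names are made up, and a real engine would compute sorting and facets on the server side.

```python
from collections import Counter

# Hypothetical hits for the query "pillow". "score" is the relevance score
# assigned by the ranking formula; the pillow case is an irrelevant accessory.
results = [
    {"name": "Down pillow", "type": "pillow", "price": 40.0, "color": "white", "score": 9.1},
    {"name": "Memory foam pillow", "type": "pillow", "price": 55.0, "color": "grey", "score": 8.7},
    {"name": "Pillow case", "type": "pillow case", "price": 8.0, "color": "blue", "score": 2.3},
]

def facet_counts(hits, field):
    """Count how many hits carry each value of the given attribute."""
    return Counter(hit[field] for hit in hits)

# Relevance ranking buries the accessory at the bottom of the list...
by_relevance = sorted(results, key=lambda hit: hit["score"], reverse=True)
# ...but sorting by price brings it to the first position.
by_price = sorted(results, key=lambda hit: hit["price"])

# The buried accessory still contributes a "blue" value to the color facet.
color_facet = facet_counts(results, "color")

# Filtering irrelevant products out, instead of burying them, fixes both symptoms.
relevant = [hit for hit in results if hit["type"] == "pillow"]
clean_color_facet = facet_counts(relevant, "color")
```

Here `by_price[0]` is the pillow case, and `color_facet` offers a "blue" filter that leads to no pillows at all.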
Even if your customers prefer just scrolling down without filtering or re-sorting (which is very likely on mobile or for products where the visual factor matters a lot, like fashion or home decor), poor precision remains a frustrating issue. In the e-commerce domain, it is often very easy for a customer to detect irrelevant products just by looking at their images. If a customer sees a lot of irrelevant products, she will soon become disappointed by the time wasted scrolling through low-precision results.
Product data as a fuel for semantic search
Full text search algorithms were designed to handle loosely structured texts (articles, books, etc.), which contain a few headers and many paragraphs of text.
In contrast, e-commerce search has quite a different data source: products are often well-attributed, and each searchable value has a key which explains its meaning (brand, type, color, style, occasion, material, etc.).
This kind of input data becomes a great resource in semantic search implementation. To be honest, as a customer, I feel stunned when I see e-commerce sites with rich attribution (exposed to me via facets), but with poor search implementation.
Unfortunately, we cannot expect all aspects of customer intent to be always fully discoverable in key-value product attributes (even with the help of synonyms). In reality, there will always be scenarios where customers search for concepts located only in unstructured data (product names, descriptions or reviews). Because of this, the eventual solution is always hybrid, but if attributes are good enough to serve the majority of search requests, implementing basic semantic search techniques can bring significant improvement. If unstructured data dominates, or the quality of attribution varies across the catalog, it’s worth investing in attribute extraction: not only search, but also other features (like facet filters and product recommendations) will benefit from it.
Semantic query parsing
Product search
The main insight behind semantic search is that one cannot simply break a search query into tokens and use them to match products in the inverted index.
Instead, semantic search must adhere to the following principles:
- A single word can be part of an unbreakable multi-word phrase, and such a phrase should be handled as a whole (this affects both parsing a user query and indexing the values of unstructured text fields). For example, this approach helps to avoid false positive matches on parts of multi-word brands (the keyword “black” doesn’t match the brand “black halo”).
- In case of ambiguous matching, conflicts between matching options should be resolved with respect to the saliency of the matched attributes. As a result, less relevant options can be not only buried by the ranking formula, but also filtered out.
- A business domain knowledge base is used both to enhance and to restrict the query options.
Those principles can be applied by introducing the query understanding phase in the search pipeline.
This phase examines the query from different perspectives – and its findings help to build an optimal boolean query as well as ranking criteria for product search.
Depending on the available data, building a boolean query can be implemented in different ways.
- In case of good attribution, the whole phrase can be mapped to attribute values, in which case only key-value filters are used to obtain products.
To achieve this, sites with good attribution employ the semantic query parsing technique: they create a separate index with a corpus of concepts and run the search query against it before searching for products. Those concepts include product attribute values (including popular non-carried ones), items of the business domain knowledge base (synonyms, unbreakable phrases, etc.), keywords of business rule conditions, and common syntactic patterns. In the absence of a direct match, semantic query parsing should employ normalization techniques such as stemming and spell correction.
As an example, when the phrase is “blue long sleeve dress”, this approach guarantees that only dresses with long sleeves are returned, while long dresses with other sleeve lengths are excluded.
- When attribution is incomplete and inconsistent, tokenized search is still often used on unstructured text fields. In this case, query understanding can at least detect a category for the search phrase, so that products of other categories can be filtered out. This helps the phrase “red dress” return only red dresses, while other red products where “dress” means “elegant” are excluded (i.e. no red dress shoes and no red dress shirts, in contrast with the example below).
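The semantic query parsing step can be sketched in a few lines. This is a minimal illustration, not a production parser: the concept index is a small in-memory dictionary with made-up attribute names, and greedy longest-match stands in for a real search against a concept index with stemming and spell correction.

```python
# Hypothetical concept index: phrase -> (attribute, normalized value).
CONCEPTS = {
    "blue": ("color", "blue"),
    "long": ("length", "long"),
    "long sleeve": ("sleeve_length", "long"),
    "dress": ("product_type", "dress"),
    "shirt dress": ("product_type", "shirt dress"),
    "dress shirt": ("product_type", "dress shirt"),
}
MAX_PHRASE_LEN = max(len(phrase.split()) for phrase in CONCEPTS)

def parse_query(query):
    """Map a query to attribute filters, preferring the longest known phrase.

    Returns (filters, unmatched_tokens).
    """
    tokens = query.lower().split()
    filters, unmatched = [], []
    i = 0
    while i < len(tokens):
        # Try the longest phrase starting at position i first, so that
        # "long sleeve" wins over the single word "long".
        for n in range(min(MAX_PHRASE_LEN, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in CONCEPTS:
                filters.append(CONCEPTS[phrase])
                i += n
                break
        else:
            # No concept starts here: keep the token for a text-field fallback.
            unmatched.append(tokens[i])
            i += 1
    return filters, unmatched
```

With this index, `parse_query("blue long sleeve dress")` yields filters on color, sleeve length and product type, and never a filter on dress length. Word order is preserved, so "shirt dress" and "dress shirt" resolve to different product types.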
Customer intent classification based on machine learning algorithms can be employed to convert customer search history data into predictions of search phrase category or product type and improve precision.
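As a rough illustration of the idea, the sketch below predicts a category from click history by simple counting. It is a frequency baseline standing in for a trained ML model; the click log, category names, and the `min_share` threshold are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical click log: (search phrase, category of the clicked product).
CLICKS = [
    ("red dress", "dresses"),
    ("red dress", "dresses"),
    ("red dress", "dresses"),
    ("red dress", "shoes"),
    ("dress shoes", "shoes"),
]

def train(clicks):
    """Count clicked categories per search phrase."""
    counts = defaultdict(Counter)
    for phrase, category in clicks:
        counts[phrase][category] += 1
    return counts

def predict_category(model, phrase, min_share=0.6):
    """Return the dominant category, or None if history is missing or too mixed."""
    counts = model.get(phrase)
    if not counts:
        return None
    category, hits = counts.most_common(1)[0]
    return category if hits / sum(counts.values()) >= min_share else None

model = train(CLICKS)
```

A predicted category can then be applied as a filter, while a `None` prediction leaves the result set unrestricted.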
Business domain knowledge base
Semantic search relies on a business domain knowledge base, which contains different types of relations between terms relevant within a particular domain:
- Synonyms – different ways of representing the same concept:
  - sofas and couches.
  - ck and calvin klein.
- Hypernym and its hyponyms – words with a broader or narrower meaning, where the broader word includes the narrower ones:
  - fruit includes banana, apple, orange.
  - shoes include oxfords, booties, sandals.
- Polyseme – a single word with multiple meanings:
  - silver as a color or as a material.
  - mouse as an animal or a computer peripheral.
- Alternative spelling – a word with multiple valid spellings:
  - barbecue and barbeque.
  - disc and disk.
- Unbreakable phrase – a multi-word phrase whose parts have a different meaning on their own in the same business domain:
  - board game.
  - shirt dress.
- Syntactic structure – relations between words in a phrase controlled by special terms that alter the meaning:
  - dress with sleeves and dress without sleeves.
  - lid for bucket and bucket with lid.
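One possible in-memory shape for such a knowledge base is sketched below. The class and field names are hypothetical, and syntactic structures are left out because they need pattern rules rather than simple lookups.

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeBase:
    synonyms: dict = field(default_factory=dict)     # term -> canonical term
    hyponyms: dict = field(default_factory=dict)     # broader term -> narrower terms
    spellings: dict = field(default_factory=dict)    # alternative -> preferred spelling
    polysemes: dict = field(default_factory=dict)    # term -> candidate attributes
    unbreakable: set = field(default_factory=set)    # phrases that are never split

    def canonical(self, term):
        """Normalize the spelling first, then map synonyms to one canonical term."""
        term = self.spellings.get(term, term)
        return self.synonyms.get(term, term)

    def expand(self, term):
        """Return the canonical term followed by all of its narrower terms."""
        base = self.canonical(term)
        return [base] + self.hyponyms.get(base, [])

kb = KnowledgeBase(
    synonyms={"couches": "sofas", "ck": "calvin klein"},
    hyponyms={"shoes": ["oxfords", "booties", "sandals"]},
    spellings={"barbeque": "barbecue", "disk": "disc"},
    polysemes={"silver": ["color", "material"]},
    unbreakable={"board game", "shirt dress"},
)
```

For example, `kb.expand("shoes")` returns the hypernym together with its hyponyms, while `kb.canonical("ck")` resolves the abbreviation to the full brand name.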
Merchandising rules
If you have a merchandising rule associated with a certain keyword (like a category redirect for couches), then this rule should fire not only for different spellings of this keyword (the singular couch and its misspellings), but also for other keywords with the same semantics (sofas). In practice, this is often achieved by explicitly listing all keywords, their forms and even popular misspellings in the rule itself, but such an approach doesn’t scale well.
Once you have semantic search for products, it makes sense to extend it to the evaluation of rule keywords, so that all implemented techniques can be leveraged there. In hindsight this approach looks obvious, because in both use cases your final goal is to express customer intent using the domain vocabulary.
However, there is one common pitfall worth knowing about in advance.
While hypernyms should be considered in product search, they are better ignored in merchandising rule evaluation if the rule is intended to override the whole result.
- For example, if a customer searches for shoes and there are no rules yet, you should apply hypernyms and thereby return all kinds of shoes (oxfords, booties, sandals, etc.).
- But if a redirect rule exists for a certain type of shoes (like sandals), you don’t want this redirect to fire, as it will bring the customer to a category of this shoe type but hide all others.
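The difference between the two code paths can be shown with a small sketch. The dictionaries and redirect URLs below are invented for illustration; the point is only that rule lookup shares synonym normalization with product search but skips hypernym expansion.

```python
# Hypothetical domain data.
SYNONYMS = {"couch": "sofas", "couches": "sofas", "sofa": "sofas"}
HYPONYMS = {"shoes": ["oxfords", "booties", "sandals"]}
REDIRECT_RULES = {"sofas": "/category/sofas", "sandals": "/category/sandals"}

def canonical(term):
    """Map keyword variants and synonyms to one canonical term."""
    return SYNONYMS.get(term, term)

def product_search_terms(query):
    """Product search: apply synonyms AND expand a hypernym to its hyponyms."""
    term = canonical(query)
    return [term] + HYPONYMS.get(term, [])

def find_redirect(query):
    """Rule evaluation: apply synonyms only, never hypernym expansion."""
    return REDIRECT_RULES.get(canonical(query))
```

A search for "couch" triggers the sofa redirect, a search for "shoes" returns all shoe types without triggering the sandals redirect, and the rule still fires when the customer asks for sandals directly.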
The impact goes beyond search
Once you implement efficient query parsing, which can convert a text query into filters by attributes, you can use it in areas other than search. For example, it can be used to improve the autocomplete phrase corpus: you can not only remove poorly matched or misspelled phrases, but also avoid showing semantically similar phrases.
Semantic search for physical properties
If an attribute refers directly to some physical properties of a product, it can pose additional challenges for the semantic search engine, as people can use multiple ways to measure or explain the same physical characteristic.
Color search
Color search is a valuable use case in the domain of fashion, where the actual look of a product weighs heavily on customer decisions. However, the product-to-color relation is not simple. There are different shades of the same color, and a product can use multiple different colors (as a striped dress does). The example below shows how differently “blue dress” products can actually look.
Here one can rely not only on the attribute values, but also on computer vision techniques to resolve these issues. The model can extract the dominant and participating colors, and we can consider a color distance between the color mentioned in the query and the multiple colors associated with the product, giving the dominant color the maximum weight.
It can also include grouping products by their relation to this color: shade, coverage, patterns, etc.
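A weighted color distance of this kind can be sketched as follows. This is a simplified illustration: the named colors and coverage values are assumptions, coverage is used as the weight so that the dominant color counts most, and plain Euclidean distance in RGB is a crude stand-in for a perceptual color space.

```python
import math

# Hypothetical mapping from color names used in queries to RGB values.
NAMED_COLORS = {
    "blue": (0, 0, 255),
    "navy": (0, 0, 128),
    "white": (255, 255, 255),
    "red": (255, 0, 0),
}

def rgb_distance(a, b):
    """Euclidean distance between two RGB triples (requires Python 3.8+)."""
    return math.dist(a, b)

def color_score(query_color, product_colors):
    """Coverage-weighted distance between the query color and a product.

    product_colors is a list of (rgb, coverage) pairs, where coverage is the
    share of the product image taken by that color (shares sum to 1).
    Lower scores mean a closer match.
    """
    target = NAMED_COLORS[query_color]
    return sum(coverage * rgb_distance(target, rgb) for rgb, coverage in product_colors)

# A solid navy dress versus a mostly white dress with blue stripes.
solid_navy = [(NAMED_COLORS["navy"], 1.0)]
white_with_blue_stripes = [(NAMED_COLORS["white"], 0.7), (NAMED_COLORS["blue"], 0.3)]
```

For the query color "blue", the solid navy dress scores closer than the striped one, because the dominant color of the striped dress is white.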
Retailers have other use cases for color search. One of the more interesting examples is the home improvement requirement to use color search for paint matching. A customer who wants to repaint an object such as an interior wall may be looking for the paint used initially, in order to match the existing color. Due to the variety of paint colors, the customer may not always use primary color names such as beige. They may use official name variations such as Tuscan beige or desert sand, or unofficial terminology like light beige or off white. The search system should recognize those colors and return the closest paint products. In this case, color is a critical search property, so it may be advisable to build distinctive UI elements specifically for paint selection on such a site.
Size search
Size search has additional challenges in domains where real physical sizes are used. Semantic search should be able to handle the following aspects of “dimensional” size search:
- one or multiple dimensions (height, width, and depth), which can be combined via different separators (2 by 4, 2 x 4, 2W 4L).
- both measurement systems (imperial vs. metric).
- different units of the same measurement system (1/2 ft, 6 inches).
- different symbols for the same unit (3 ft, 3 feet, 3’).
- different formats of fractions (.2, 1/5, 0.2).
- the degree of required match accuracy (precise for pvc pipe diameter, but approximate for floor rug length/width).
In addition, semantic size search cannot expect every customer to know and obey all of the above rules. When a customer makes a mistake (like using a single apostrophe for something usually measured in inches), semantic size search needs to handle it gracefully.
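Several of these aspects come down to normalizing every size to one unit before matching. The sketch below converts sizes to inches using exact fractions; the list of unit spellings, the default unit, and the function names are assumptions, and labeled dimensions such as 2W 4L are not covered.

```python
import re
from fractions import Fraction

# Conversion factors to inches; the unit spellings listed here are illustrative.
INCHES_PER_UNIT = {
    "in": Fraction(1), "inch": Fraction(1), "inches": Fraction(1),
    "ft": Fraction(12), "foot": Fraction(12), "feet": Fraction(12),
    "cm": Fraction(100, 254), "mm": Fraction(10, 254),
}
NUMBER_AND_UNIT = re.compile(r"(\d+/\d+|\d*\.\d+|\d+)\s*([a-z]+)?")

def to_inches(text, default_unit="in"):
    """Convert one dimension such as "3 ft", "3’" or "1/2 ft" to inches."""
    text = text.lower().strip()
    # Rewrite symbol units as words before matching.
    text = text.replace("’", "'").replace("'", " ft").replace('"', " in")
    match = NUMBER_AND_UNIT.fullmatch(text.strip())
    if match is None:
        return None
    value, unit = match.group(1), match.group(2) or default_unit
    if unit not in INCHES_PER_UNIT:
        return None
    if value.startswith("."):
        value = "0" + value  # ".2" -> "0.2"
    return Fraction(value) * INCHES_PER_UNIT[unit]

def parse_dimensions(text):
    """Split "2 x 4" or "2 by 4" into a list of dimensions in inches."""
    parts = re.split(r"\s*(?:x|by)\s*", text.lower().strip())
    return [to_inches(part) for part in parts]

def sizes_match(query, product, tolerance=Fraction(0)):
    """Exact match by default; Fraction(1, 10) allows a 10% deviation."""
    return abs(query - product) <= tolerance * query
```

After normalization, "1/2 ft" and "6 inches" compare as equal, and the `tolerance` argument lets a pipe diameter require an exact match while a rug length accepts an approximate one.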
Age search
Age search becomes important for child-related products when using age as a measurement of different characteristics of a product, such as size for clothing or complexity for toys. Semantic search should have the ability to recognize ages formulated in different ways:
- labels vs. numbers (newborn, 0-3 months)
- months vs. years (12 months, 18 months, 1 year)
- exact values vs. ranges (5 years, 4-6 years, over 4 years)
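One way to make these formulations comparable is to normalize each of them to a range in months. The sketch below is an assumption-laden illustration: the label table, the supported patterns, and the upper bound used for open ranges are all hypothetical.

```python
import re

# Hypothetical label table: label -> inclusive (min_months, max_months).
LABELS = {"newborn": (0, 3)}
MONTHS_PER_UNIT = {"month": 1, "months": 1, "year": 12, "years": 12}
OPEN_END = 18 * 12  # assumed upper bound for open ranges such as "over 4 years"

def parse_age(text):
    """Return an inclusive (min_months, max_months) range, or None if unparsed."""
    text = text.lower().strip()
    if text in LABELS:
        return LABELS[text]
    m = re.fullmatch(r"(\d+)\s*-\s*(\d+)\s*(months?|years?)", text)
    if m:
        factor = MONTHS_PER_UNIT[m.group(3)]
        return int(m.group(1)) * factor, int(m.group(2)) * factor
    m = re.fullmatch(r"over\s+(\d+)\s*(months?|years?)", text)
    if m:
        return int(m.group(1)) * MONTHS_PER_UNIT[m.group(2)], OPEN_END
    m = re.fullmatch(r"(\d+)\s*(months?|years?)", text)
    if m:
        months = int(m.group(1)) * MONTHS_PER_UNIT[m.group(2)]
        return months, months
    return None

def overlaps(a, b):
    """True if two inclusive ranges share at least one month."""
    return a[0] <= b[1] and b[0] <= a[1]
```

With this normalization, a query for "5 years" overlaps a product labeled "4-6 years", and "12 months" is the same value as "1 year".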
Price search
A customer may search not for a particular price but for a price range. A typical example is a search for products below some price value using patterns like under $20 or below $50. Semantic search needs to recognize these patterns and convert them to the appropriate price range filters in a boolean query.
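A minimal sketch of such pattern recognition is shown below. It covers only the two patterns mentioned above, and the shape of the returned filter (a dictionary with `field` and `lt` keys) is an assumption rather than the syntax of any particular search engine.

```python
import re

# Matches "under $20", "below 50" or "under $19.99", with an optional dollar sign.
PRICE_PATTERN = re.compile(r"\s*\b(under|below)\s+\$?(\d+(?:\.\d+)?)\b", re.IGNORECASE)

def extract_price_filter(query):
    """Split a query into (remaining text, price filter or None)."""
    match = PRICE_PATTERN.search(query)
    if match is None:
        return query.strip(), None
    # Cut the price expression out of the query so it is not matched as text.
    remaining = (query[:match.start()] + query[match.end():]).strip()
    return remaining, {"field": "price", "lt": float(match.group(2))}
```

For example, `extract_price_filter("dresses under $20")` returns the text "dresses" plus a filter on price below 20, so the price words never take part in text matching.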
Semantic partial match
Sub-queries conflict resolution
A semantic partial match is another example of semantic search benefits.
When the query doesn’t match any product exactly, we have to relax the matching requirements, which can lead to multiple alternative sub-queries being evaluated.
In the example above, the query “black torn jeans” couldn’t match any products with all its words, so different types of black products were eventually returned (jeans, skirts and even rugs). As a customer, I can easily notice that rugs are clearly irrelevant for this query, so I would expect them to be not only buried but filtered out.
One needs to decide which sub-queries are too irrelevant to be used. A full text search approach could base this decision on word-related factors, such as the number of preserved words in a query or their index frequency. Both factors are far from optimal. For example, if your site doesn’t carry “blue calvin klein shirts” products, but carries “blue shirts”, “calvin klein shirts” and “blue calvin klein” products, then “blue calvin klein” (3 preserved words) shouldn’t be considered more relevant than “blue shirts” (2 preserved words), should it?
When deciding which words to omit from the query, semantic search has to consider the saliency of the attributes of the omitted words in each sub-query.
In our example, dropping the product type “shirt” is not a good idea (customers are very unlikely to accept a different product type just because it has the same color or brand), so the “blue calvin klein” interpretation should be filtered out. The other two options (“blue shirts” vs. “calvin klein shirts”) have similar relevance.
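The saliency-based choice can be sketched as follows. The saliency weights, the threshold, and the tiny catalog are assumptions made for the example; in practice the weights would come from domain knowledge or customer behavior data.

```python
from itertools import combinations

# Hypothetical saliency weights: how much it costs to drop each attribute.
SALIENCY = {"product_type": 1.0, "brand": 0.5, "color": 0.4}
MAX_DROPPED_SALIENCY = 0.6  # assumed limit on the total saliency we may drop

# Tiny stand-in catalog: no product matches all three concepts at once.
PRODUCTS = [
    {"color": "blue", "brand": "other brand", "product_type": "shirt"},
    {"color": "white", "brand": "calvin klein", "product_type": "shirt"},
    {"color": "blue", "brand": "calvin klein", "product_type": "jeans"},
]

def has_results(filters):
    """True if at least one product satisfies every filter."""
    return any(all(p.get(a) == v for a, v in filters.items()) for p in PRODUCTS)

def relaxed_subqueries(concepts):
    """Return acceptable sub-queries, cheapest dropped saliency first."""
    attrs = list(concepts)
    candidates = []
    for keep in range(len(attrs) - 1, 0, -1):
        for kept in combinations(attrs, keep):
            cost = sum(SALIENCY[a] for a in attrs if a not in kept)
            sub = {a: concepts[a] for a in kept}
            # Reject sub-queries that drop too much, e.g. the product type.
            if cost <= MAX_DROPPED_SALIENCY and has_results(sub):
                candidates.append((cost, sub))
    return [sub for cost, sub in sorted(candidates, key=lambda c: c[0])]

query = {"color": "blue", "brand": "calvin klein", "product_type": "shirt"}
subqueries = relaxed_subqueries(query)
```

For this query the result contains “calvin klein shirts” and “blue shirts”, while “blue calvin klein” is rejected even though it preserves more words, because it drops the most salient attribute.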
As part of a customer-oriented search experience, it is recommended to explain which words were omitted (so the customer can try to rephrase them) and to group products by sub-query (if more than one was eventually used for the returned products).
“Do not carry” brands
When a customer is searching for brands not carried by your site, simply omitting the brand name from the query is not effective. Instead, you should still be able to understand the intent of the search request and suggest a reasonable alternative. For example:
- The customer is searching for shoes of a brand known for athletic goods (Brooks shoes), but you don’t carry them – do not just show all the shoes you carry, but recognize the brand and suggest shoes of other athletic brands (Nike or Adidas shoes).
- The customer is searching for a movie title (Interstellar), but you don’t have it – do not attempt a text match on the movie description, but recognize the movie and suggest popular movies of the same genre (sci-fi).
It is good practice to explain the substitution explicitly to the customer (we don’t have …, but we suggest …), so that the customer does not continue fruitlessly searching for other phrases or categories, but explores the provided suggestions.
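A brand substitution of this kind can be sketched with a small lookup. All data here is hypothetical, including the brand segments, the carried brands, and the placeholder brand "casualco"; the message simply follows the wording pattern suggested above.

```python
# Hypothetical knowledge about brands, including ones the site does not carry.
BRAND_SEGMENTS = {
    "brooks": "athletic",
    "nike": "athletic",
    "adidas": "athletic",
    "casualco": "casual",
}
CARRIED_BRANDS = {"nike", "adidas", "casualco"}

def resolve_brand(brand):
    """Return the brands to filter by and an optional message for the customer."""
    brand = brand.lower()
    if brand in CARRIED_BRANDS:
        return {"brands": [brand], "message": None}
    segment = BRAND_SEGMENTS.get(brand)
    if segment is None:
        # Not a recognized brand: leave the word to regular query parsing.
        return {"brands": [], "message": None}
    alternatives = sorted(b for b in CARRIED_BRANDS if BRAND_SEGMENTS.get(b) == segment)
    names = ", ".join(b.title() for b in alternatives)
    message = f"We don't carry {brand.title()}, but we suggest {names}."
    return {"brands": alternatives, "message": message}
```

A search for Brooks shoes is then narrowed to the carried athletic brands instead of showing every shoe in the catalog, and the message tells the customer why.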
Incorporating semantics into results ranking
When a customer uses a short query like “dress”, “watch” or “nike”, many products seem to match the query perfectly. We may, after all, have hundreds of dresses, watches, or Nike products in the catalog. Which products should be displayed on the first page? A short query is a common example of “head queries”, which are very frequent, quite short, and match many products equally well, so they get an equal relevance score from the search engine.
A popular approach is to break this tie by employing site-wide product-level business metrics such as sales, margins, inventory, rating, and combinations thereof. Those metrics can produce a ranking score used as a secondary ranking criterion to break ties between equally relevant products.
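This tiered ranking can be sketched as a two-level sort. The metric names, the weights, and the assumption that all metrics are normalized to the 0..1 range are illustrative choices, not a recommended formula.

```python
def business_score(product, w_sales=0.5, w_margin=0.3, w_rating=0.2):
    """Combine normalized (0..1) business metrics into one secondary score."""
    return (w_sales * product["sales"]
            + w_margin * product["margin"]
            + w_rating * product["rating"])

def rank(products):
    """Sort by relevance tier first; break ties with the business score."""
    return sorted(products, key=lambda p: (-p["relevance_tier"], -business_score(p)))

# Hypothetical products: "a" and "b" match the query equally well, "c" matches worse.
products = [
    {"id": "a", "relevance_tier": 2, "sales": 0.2, "margin": 0.5, "rating": 0.9},
    {"id": "b", "relevance_tier": 2, "sales": 0.9, "margin": 0.4, "rating": 0.8},
    {"id": "c", "relevance_tier": 1, "sales": 1.0, "margin": 1.0, "rating": 1.0},
]
ranked_ids = [p["id"] for p in rank(products)]
```

Product "c" has the best business metrics but stays last, because the business score only reorders products within the same relevance tier.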
However, while providing a solid baseline for the tiered relevancy ranking, it is not always the best thing we can do. Consider the following:
- When customers search for dresses, there are multiple types of dresses. When comparing a newer fit & flare dress with an older maxi dress, the particular maxi dress may have more sales just because it has been on the market longer, but the fit & flare dress should be boosted higher if you have learned that customers prefer this type of dress.
- When searching for dresses, there are dresses of different sizes (regular, plus, petite) and ages (adult, kids, toddlers). By default, when no particular size or age is specified in a query, a customer usually assumes regular and adult. Those assumptions should be applied in product sorting to avoid showing petite or toddler dresses on top, even if they are better sellers overall.
To resolve such ambiguities, it is essential to employ an understanding of customer intent obtained from the clickstream. Using clickstream data, we can train both catalog-wide and personalized ML models to predict product types and resolve the query ambiguity.
Spell correction should respect semantics too
For spell correction, it is not enough to merely identify an unknown word and correct it to the known word with the closest edit distance. The word must be corrected to one that fits the context of the adjacent words in the query. Moreover, such correction is required even if the site catalog knows the original word, as long as it doesn’t fit the phrase context.
- The phrase “michael coors bags” should be corrected to “Michael Kors bags”, even if the website sells “Coors” beer as well.
- The phrase “rigid vac” should be corrected to “ridgid vac”, even if the word “rigid” exists in product data, because the customer’s intention was in fact the brand “Ridgid”. In the example below, “ridid vac” is corrected to “ridgid vac” by the addition of the single letter “g”, but “rigid vac” is not corrected (even though it requires the same type of correction – the addition of a single consonant in the middle of the word).
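One simple way to make correction context-aware is to score whole candidate phrases instead of single words. The sketch below is an illustration under stated assumptions: the vocabulary and the phrase frequencies are invented, and the score (context support first, number of edits second) is just one possible choice.

```python
from itertools import product

# Hypothetical vocabulary and adjacent-word frequencies from catalog and query logs.
VOCABULARY = sorted({"michael", "kors", "coors", "bags", "beer",
                     "rigid", "ridgid", "vac", "foam"})
BIGRAMS = {
    ("michael", "kors"): 500, ("kors", "bags"): 300, ("coors", "beer"): 200,
    ("ridgid", "vac"): 150, ("rigid", "foam"): 80,
}

def edit_distance(a, b):
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def candidates(word, max_distance=2):
    """Known words close to the input, including the word itself if known."""
    close = [v for v in VOCABULARY if edit_distance(word, v) <= max_distance]
    return close or [word]

def correct(query):
    """Pick the candidate phrase with the best context support, then fewest edits."""
    tokens = query.lower().split()

    def score(option):
        context = sum(BIGRAMS.get(pair, 0) for pair in zip(option, option[1:]))
        edits = sum(edit_distance(a, b) for a, b in zip(tokens, option))
        return (context, -edits)

    options = product(*(candidates(t) for t in tokens))
    return " ".join(max(options, key=score))
```

Because known words are also treated as correction candidates, “rigid vac” becomes “ridgid vac”, while “rigid foam” and “coors beer” are left alone since their context supports the original spelling.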
Conclusion and next steps
Proper implementation of semantic search helps you establish a more efficient balance between precision and recall, improve query understanding, and make the customer search journey seamless and delightful.
However, the power of semantic search largely depends on the richness and quality of the domain data – product attribution as well as synonyms.
If your customers often perform out-of-dictionary searches, then semantic search quality will suffer. Such searches include:
- searches by subjective features like the occasion for clothing (church dress) or the age group for a hi-tech device (laptops for kids)
- searches for brands which aren’t carried by your site, where similar products could be suggested instead of just dropping the brand value from the query
Even though such use cases can be handled with the help of business rules and special synonyms, this is generally hard to scale. To deal with such tricky queries, we recommend looking into semantic vector search, which can be used to augment the capabilities of a semantic search implementation.
Happy semantic searching!