Searching grandchildren and siblings with Solr block join
Aug 26, 2016
We’ve talked about searching nested parents and children with Solr Block Join. But we can go far beyond that, to searching siblings, grandchildren, and other descendants. This gives your customers more search options without forcing them to do a new search every time they add a search parameter. In fact, with Solr Block Join, nesting depth and search option possibilities are almost unlimited, so your customers can see more of what you have to offer with fewer clicks than it would take without Block Join. Here’s how it works:
Let’s take Solr 4.5 or above and index the data. The data structure here is a bit more complex than some examples we’ve seen, so it’s worth a diagram:
As you can see, this hierarchical structure is similar to entity-relationship models from the RDBMS world. We name those nested entities “scopes.”
Once data is indexed, use q=*:*&wt=csv&rows=100 to see how the documents are aligned in blocks.
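As a convenience, the inspection request above can be assembled programmatically. This is only a sketch: the host, port, and core name in SOLR_SELECT are assumptions for illustration and should be replaced with your own.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host, port and core name are assumptions.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"

def inspect_blocks_url(rows=100):
    """Build the match-all CSV request used to eyeball how blocks are laid out."""
    params = {"q": "*:*", "wt": "csv", "rows": rows}
    return SOLR_SELECT + "?" + urlencode(params)

print(inspect_blocks_url())
```

The CSV response lists documents in index order, which makes it easy to see which children sit next to which parent.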
Grandchildren search
Let’s consider a search for a t-shirt product (parent) which has a particular SKU (child) that has sufficient inventory available in a particular storage location. We model a storage location as a child of the SKU scope, and a grandchild of the product scope.
Here is a search for t-shirts that have sufficient inventory (at least 10 units) of Blue XL SKUs in CA:
q={!parent which=type_s:product}+COLOR_s:Blue +SIZE_s:XL +{!parent which=type_s:sku v='+QTY_i:[10 TO *] +STATE_s:CA'}
Here’s a trick recommended by David Smiley: when a child query contains a space, you need to either wrap it in a {!… v='..'} local parameter, or extract it into a separate request parameter and refer to it with {!… v=$ref}…&ref=…
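The second variant of the trick might look like the sketch below, which moves the inventory sub-query into its own request parameter. The parameter name invq and the helper function are arbitrary choices made for this example, not anything Solr prescribes.

```python
from urllib.parse import urlencode

def grandchild_params():
    """Return request parameters for the grandchildren search."""
    # The inner child query contains spaces, so it lives in its own
    # parameter; "invq" is a made-up name for this sketch.
    invq = "+QTY_i:[10 TO *] +STATE_s:CA"
    q = ("{!parent which=type_s:product}"
         "+COLOR_s:Blue +SIZE_s:XL "
         "+{!parent which=type_s:sku v=$invq}")
    return {"q": q, "invq": invq}

# urlencode escapes "+" as %2B, so it is not decoded as a space on the server.
query_string = urlencode(grandchild_params())
print(query_string)
```

Encoding the parameters this way also matters for correctness: in a URL query string a raw "+" stands for a space, so the required-clause markers would be lost if the query were pasted in unescaped.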
Note that cross-matches are excluded: a product qualifies only when a single SKU satisfies all the criteria at once, so this query returns products 20 and 30. Removing either the QTY_i filter or the COLOR_s filter brings product 10 into the results.
Needless to say, possible nesting depth is unlimited. One more interesting observation about Block Join is that it provides blazing-fast transitive closure on parent-child relationships: you can search for grandchildren and deeper descendants directly, omitting queries for intermediate scopes.
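To see why intermediate scopes can be skipped, here is a toy in-memory model of the join. It is a simplified illustration, not Solr's implementation: it assumes that each block stores descendants before their ancestors with the top-level parent last, and the six documents are invented for the example.

```python
def to_parent(docs, is_parent, child_matches):
    """Return ids of parents that follow at least one matching child in their block."""
    hits, pending = [], False
    for doc in docs:
        if is_parent(doc):
            if pending:
                hits.append(doc["id"])
            pending = False          # a parent closes the current block
        elif child_matches(doc):
            pending = True
    return hits

# Hypothetical data: two blocks, each laid out as inventory, sku, product.
docs = [
    {"id": 1, "type_s": "inventory", "STATE_s": "CA", "QTY_i": 12},
    {"id": 2, "type_s": "sku"},
    {"id": 3, "type_s": "product"},
    {"id": 4, "type_s": "inventory", "STATE_s": "NY", "QTY_i": 50},
    {"id": 5, "type_s": "sku"},
    {"id": 6, "type_s": "product"},
]

def in_stock_ca(d):
    return d.get("STATE_s") == "CA" and d.get("QTY_i", 0) >= 10

def is_sku(d):
    return d["type_s"] == "sku"

def is_product(d):
    return d["type_s"] == "product"

# Direct join: inventory (grandchild) straight to product.
direct = to_parent(docs, is_product, in_stock_ca)

# Two-step join: inventory to sku, then sku to product.
sku_ids = set(to_parent(docs, is_sku, in_stock_ca))
two_step = to_parent(docs, is_product, lambda d: d["id"] in sku_ids)

print(direct, two_step)
```

In this model both routes select the same product, because a matching grandchild is attributed to the next document that the parent filter accepts, regardless of how many intermediate levels lie in between.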
Sibling scopes
Vendor and SKU scopes share the same parent product and are not nested in each other.
Let’s search for t-shirts that meet the SKU criteria from the previous example, are made by vendor Bob, and cost between $20 and $25. Here, local parameter references are necessary:
q=+{!parent which=type_s:product v=$skuq} +{!parent which=type_s:product v=$vendorq}&skuq=+COLOR_s:Blue +SIZE_s:XL +{!parent which=type_s:sku v='+QTY_i:[10 TO *] +STATE_s:CA'}&vendorq=+NAME_s:Bob +PRICE_i:[20 TO 25]
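The same request is easier to read when each sibling scope's child query is kept in its own parameter. The sketch below only builds the parameters and encodes them; the helper name is made up for this example, and sending the request is left to whatever HTTP client you use.

```python
from urllib.parse import urlencode

def sibling_params():
    """Return request parameters for the sibling-scope search."""
    # One child query per sibling scope, each in a separate parameter.
    skuq = ("+COLOR_s:Blue +SIZE_s:XL "
            "+{!parent which=type_s:sku v='+QTY_i:[10 TO *] +STATE_s:CA'}")
    vendorq = "+NAME_s:Bob +PRICE_i:[20 TO 25]"
    # Both parent clauses are required, so a product must satisfy both scopes.
    q = ("+{!parent which=type_s:product v=$skuq} "
         "+{!parent which=type_s:product v=$vendorq}")
    return {"q": q, "skuq": skuq, "vendorq": vendorq}

print(urlencode(sibling_params()))
```

Swapping the value of vendorq (for example, a different vendor name or a wider price range) changes the vendor condition without touching the SKU side of the query.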
As you can see, it returns only product 20, and if you relax the query, e.g. choose vendor Alice or accept more expensive t-shirts, product 30 appears. It works like relational calculus!
If you have a question about Block Join in Solr, please post a comment below or contact us via email.