Search performance is critical to keeping costs low and quality high


Business problem

When improving digital search engines, firms need to invest equally in product discovery features and search performance so that searches stay fast and costs stay under control. Each new product discovery feature brings new types of data, new business signals and more sophisticated queries for the search engine to handle. Over time, these changes can degrade response times, lengthen indexing and leave search data increasingly stale. The accumulated problems slow search, frustrate customers and increase search infrastructure costs. Companies therefore need to manage the rollout of new features and ensure that each one is properly supported by search performance engineering.

Experts in search engine performance

Grid Dynamics started in 2006 with the goal of making mission-critical systems scalable. Since then, performance and scalability engineering have been at the core of our company’s DNA. As Solr and Lucene contributors, we have a deep understanding of search engine internals, and possess years of experience tuning the performance of search systems. For over 5 years, the search solutions that we have built for our retail customers have survived Black Friday traffic storms with flying colors, without a single outage or breach of SLAs. We have helped numerous customers solve their search performance issues. We improved response times, online conversion rates from search, and indexing speeds.

Taking a top-down approach to search performance

When it comes to performance engineering, we prefer a top-down methodology: we start our analysis with high-level system metrics, then drill down to understand the true root cause of the issues.

We ask ourselves three guiding questions: 

  • Which computational resources are the bottleneck that prevents the application from going faster? We use deep system-level monitoring to find the true culprit.
  • Where exactly does the bottleneck happen? We use sampling and profiling to identify hotspots and resource congestion down to the individual subsystems of the search engine.
  • How can the bottleneck be removed? This is where our deep understanding of the internal workings of the search engine helps. It is often possible to slightly change the index structure, rewrite a query or tune caches to improve performance dramatically. 

Search performance engineering methodology overview

Areas of performance optimization

Search index structure

We define an effective index structure for entities and entity relationships. This includes document nesting, embedding, roll-ups and roll-downs, as well as other specialized indexing techniques. We also optimize field mappings and indexing options for performance.
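As an illustration of one such technique, the sketch below shows a roll-up in Python: values from child SKU records are aggregated onto the parent product document, so a product-level query can filter on them without a join at query time. The field names (`skus`, `color`, `size`, `price`) and the `roll_up` helper are hypothetical and are not tied to any particular search engine API.

```python
from typing import Any


def roll_up(product: dict[str, Any], fields: list[str]) -> dict[str, Any]:
    """Return a flat product document with SKU-level values rolled up.

    For each name in `fields`, the distinct values found on the child SKUs
    are stored on the parent as a sorted multi-valued field `sku_<name>`.
    The price range across SKUs is rolled up as `min_price` / `max_price`.
    """
    # Copy the parent fields, leaving the nested SKU list out of the result.
    doc = {key: value for key, value in product.items() if key != "skus"}
    skus = product.get("skus", [])

    for field in fields:
        values = {sku[field] for sku in skus if field in sku}
        doc[f"sku_{field}"] = sorted(values)

    prices = [sku["price"] for sku in skus if "price" in sku]
    if prices:
        doc["min_price"] = min(prices)
        doc["max_price"] = max(prices)
    return doc
```

A roll-down is the mirror image under the same assumptions: parent attributes such as brand or category are copied onto each child document.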

Indexing performance

We implement high-throughput bulk indexing with document streaming. The indexing workload is optimized with change buffering, deduplication and coalescing, as well as with fast partial document updates. We also tune the index refresh rate and the index segment merge policy.
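To make change buffering and coalescing concrete, here is a minimal Python sketch. It assumes updates arrive as partial field maps keyed by document id. The `ChangeBuffer` class is hypothetical, and a real pipeline would send the flushed batch to the search engine's bulk indexing endpoint rather than return it.

```python
from typing import Any


class ChangeBuffer:
    """Buffers partial document updates and coalesces them per document id."""

    def __init__(self) -> None:
        # A dict keeps insertion order, so documents flush in first-seen order.
        self._pending: dict[str, dict[str, Any]] = {}
        self.received = 0  # number of raw updates seen, for comparison

    def add(self, doc_id: str, fields: dict[str, Any]) -> None:
        """Merge an update into the pending change for the same document.

        Later values overwrite earlier ones, so repeated or superseded
        updates collapse into a single partial update per document.
        """
        self.received += 1
        self._pending.setdefault(doc_id, {}).update(fields)

    def flush(self) -> list[dict[str, Any]]:
        """Return one coalesced update per document and empty the buffer."""
        batch = [{**fields, "id": doc_id} for doc_id, fields in self._pending.items()]
        self._pending.clear()
        return batch
```

With this sketch, four incoming updates that touch two documents flush as a batch of two partial updates, which is the reduction in indexing work that coalescing aims for.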

Query performance

We analyze overall query patterns and perform deep profiling on slow ones. This helps us identify bottlenecks and hotspots, and shows where queries should be rewritten and where scoring and boosting can be optimized. Additionally, we optimize faceting and aggregations, and fine-tune all search engine caches.
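A rough Python sketch of the first step, grouping logged queries into patterns, is shown below. It assumes a query log of `(query string, latency in ms)` pairs in a Lucene-like syntax. The `normalize` and `slowest_patterns` helpers are hypothetical, and the normalization rules are deliberately simplistic.

```python
import re
from collections import defaultdict


def normalize(query: str) -> str:
    """Replace literal values with placeholders so similar queries group together."""
    query = re.sub(r'"[^"]*"', '"?"', query)          # quoted phrases
    query = re.sub(r"\b\d+(\.\d+)?\b", "?", query)    # standalone numbers
    return query


def slowest_patterns(log, top=3):
    """Group (query, latency_ms) pairs by pattern, ranked by total time spent.

    Returns a list of (pattern, request_count, total_latency_ms) tuples.
    """
    stats = defaultdict(list)
    for query, latency_ms in log:
        stats[normalize(query)].append(latency_ms)
    ranked = sorted(stats.items(), key=lambda item: sum(item[1]), reverse=True)
    return [(pattern, len(times), sum(times)) for pattern, times in ranked[:top]]
```

Ranking by total time rather than by worst single request is a design choice in this sketch: it surfaces patterns that are moderately slow but very frequent, which are often the better candidates for profiling and rewriting.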

Production environment tuning

We handle capacity planning and cluster topology optimization, which covers sharding, shard allocation, replicas, node roles, cluster state, discovery and fault detection. We also tune thread pools, garbage collection and other JVM settings, as well as OS-level settings for RAM buffers, index memory mapping and swappiness.

