Searching three times faster: multi-queries and faceted search. Automatically sort search results by degree of match with your query. Show the last level of hierarchical taxonomy in multifacets

Facet API Bonus for Drupal 7 is a collection of additional Facet API plugins and functionality, primarily filter and dependency plugins, and a home for most additional Facet API extensions.

Currently Facet API Bonus includes:

Facet Dependency: a dependency plugin that makes one facet (e.g. "product category") appear depending on other facets or on specific active facet items (e.g. "content type" is "product" or "service"). It is very flexible: it supports multiple facets as dependencies, a regular expression to decide which facet items count as a dependency, and an option for how to behave if the dependency is lost.

"Exclude Items": a filter plugin that excludes certain facet items by their markup/name or internal value (for example, excluding "pages" from "content types"). Regular expressions are also possible.

(To be continued...)
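To make the "Exclude Items" idea concrete, here is a minimal Python sketch of filtering facet items by a regular expression. It is not the plugin's code: the item structure (`label`, `value`, `count`) and the function name `exclude_items` are assumptions made up for illustration.

```python
import re

def exclude_items(items, pattern):
    """Drop facet items whose label or internal value matches the regex."""
    rx = re.compile(pattern)
    return [
        item for item in items
        if not (rx.search(item["label"]) or rx.search(item["value"]))
    ]

# Hypothetical facet items for a "content type" facet.
content_types = [
    {"label": "Pages", "value": "page", "count": 12},
    {"label": "Products", "value": "product", "count": 340},
    {"label": "Services", "value": "service", "count": 27},
]

# Exclude the item whose internal value is exactly "page".
visible = exclude_items(content_types, r"^page$")
print([item["label"] for item in visible])
```

Matching against both the label and the internal value mirrors the "markup/name or internal value" wording above; a real plugin would probably let you choose one or the other.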

How to show faceted search on non-faceted pages

There are times during the development of a website when you need to display search facets on pages other than the search page. For example, you want the most popular categories to be displayed on the home page, along with their counts. When a site visitor clicks on a category, they should be redirected to the site search, with the clicked category selected in the facets.

To do all this, there is a Views "Facets block" display provided by the search_api_facetapi submodule.

With it, you can easily use faceted links for any search on any page you want. For this purpose there are two options:

Show facets directly in the view block

You need to either create a new view based on the search index you want to use, or edit an existing view. Add a display of type "Facets block" to the view and configure the "Facet field" and "Search page path" in the block settings.

Facets are not shown in the preview, so to test your view, enable the new facet block by going to Administration > Structure > Blocks, as you normally do for all other blocks. Configure the block so that it appears on the pages where you need it.

Set up any (contextual) filters that you want to apply to the result used for the facet links (for example, only consider categories of published nodes of a certain content type). Exposed filters are ignored. The format and fields settings are ignored too, but you can still change the settings under "Other".

A list of faceted links will be displayed on the pages that have been specified.

However, there are some limitations to this option:

Facet links will always be formatted as a list; it is not possible to override or theme the output.
In particular, it is also not possible to use regular Facet API widgets for these facets, because reusing Facet API components outside Facet API is quite difficult.
If you want to display facets for more than one field, you must add an additional facet block display to the view for each field, and each one will perform a separate search query when displayed.

To resolve these shortcomings, there is a second option:

You can also just use the facet block display to do the search, and then use the normal Facet API for facet blocks to display the facets in them. To do this, create a facet block display as before, but set the "Hide Block" option in the block settings. This way, on pages where the block appears, the search query will be executed, but the block will be hidden.

You can now create facet blocks for this search as usual, on the index page under the Facets tab. You just have to make sure that the facet is active for the facet block display (usually the ID will be search_api_views: -facets_block) under "Display for searches" / "Search IDs". After that, all the regular Facet API options can be used to define the appearance and behavior of the rendered facet blocks. Facets in these blocks will automatically use the path given in the "Search page path" setting of the view's facet block display.

However, this approach also has some problems. First, it is currently not possible to display the same facet in two different places (#1411160: Allow a facet to appear multiple times in the same realm). Thus, the facet list on the main page has to be in the same realm and position as on the search page, and be styled the same. Working around this in either way (by cloning the facet block, or by altering the display to move the block) currently still requires coding, unless you use Panels or something similar to display the blocks.

The second issue is the same as the one described in the Search API FAQ: for facet blocks to work, you must make sure that the view is executed before the blocks are rendered. However, since you can put the view block in any region (it won't be rendered anyway), finding a position that satisfies this requirement shouldn't be too big a problem.

Adding a facet block to a panel

Why might everything described above about facets, facet search and facet blocks not work?

It turns out that when facet blocks are hidden and placed in the same region, they must also be located higher than the facet filters displayed in the same region. If the blocks are lower, then nothing works. As soon as you move facet blocks above the filters and hide them, facet filters begin to appear on all pages.

Show the last level of hierarchical taxonomy in multifacets

A useful feature would automatically split a hierarchical vocabulary so that each level gets its own facet block and custom label. It seems this module already has an option to rewrite a single facet, so we are already halfway there. How difficult would it be to generate a facet block for each level?

This option would be especially useful for product categories where terms can have multiple parents, for example, “size” or “color”. If they are displayed in separate blocks, it is much clearer to the user how to navigate the facets than with a huge facet tree in which terms are displayed multiple times...
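The per-level split discussed above can be sketched in a few lines of Python. This is only an illustration of the idea, not module code: the `terms` mapping (term name to list of parent names), the sample vocabulary, and the rule "a term sits one level below its deepest parent" are all assumptions.

```python
from collections import defaultdict

def split_by_level(terms):
    """Group taxonomy terms by depth; assumes the hierarchy has no cycles.

    `terms` maps a term name to its list of parent names (empty for roots).
    A term with several parents is placed one level below its deepest parent.
    """
    depth_cache = {}

    def depth(name):
        if name not in depth_cache:
            parents = terms[name]
            depth_cache[name] = 0 if not parents else 1 + max(depth(p) for p in parents)
        return depth_cache[name]

    levels = defaultdict(list)
    for name in terms:
        levels[depth(name)].append(name)
    return dict(levels)

# Hypothetical vocabulary in which "Red" has two parents.
terms = {
    "Clothing": [],
    "Shoes": [],
    "T-shirts": ["Clothing"],
    "Sneakers": ["Shoes"],
    "Red": ["T-shirts", "Sneakers"],
}
levels = split_by_level(terms)
for level, names in sorted(levels.items()):
    print(level, names)
```

Each key of `levels` would correspond to one facet block, and a term with multiple parents appears once in its level instead of once per parent in a tree.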

Faceted navigation is a problem for all e-commerce sites. An excessive number of pages created for different variations of the same item poses a threat to search efficiency. This can negatively impact SEO and user experience. Experts from the SEO Hacker blog explained what faceted navigation is and how to improve it.

Faceted Navigation: Definition

This type of navigation is usually found in the sidebars of e-commerce websites. It contains filters and facets, which are parameters that the user configures as desired. It allows online store customers to search for the product they want using a combination of attributes that narrow down the products until users find what they need.

Facets and filters are different from each other. Here's the difference:

  • Facets are indexed categories. They help refine product listings and act as extensions of core categories. Facets add unique meaning to each choice the user makes. Since facets are indexed, they must send relevant signals to the search engine, ensuring that the page contains all the important attributes.

  • Filters are used to sort and refine items within lists. They are necessary for users, but not for search engines. Filters are not indexed because they do not change the content of the page, but only sort it in a different order. This results in multiple URLs having duplicate content.

Potential problems

Each possible facet combination has its own unique URL. This can cause some problems from an SEO perspective. Here are the main ones:

  • Duplicate content.
  • Wasted crawl budget.
  • Diluted link equity.

As your site grows, so does the number of duplicate pages. Incoming links may go to various duplicate pages. This reduces the value of links and limits the ability of pages to rank.

The likelihood of keyword cannibalization also increases. Multiple pages try to rank for the same keywords, resulting in less consistent and lower rankings. This problem could have been avoided if every keyword was intended for a single page only.

Faceted Navigation Solutions

When choosing a solution for faceted navigation, consider your end goal: increasing the number of pages you index or reducing the number of pages you don't want indexed. Here are some solutions that may be useful for you:

AJAX

If you use AJAX, a new URL is not created when the user clicks on a facet or filter. Since there will be no unique URL for every possible facet combination, the problems of duplicate content, keyword cannibalization, and wasted crawl budget are potentially eliminated.

AJAX can only be effective before the e-commerce site is launched. It is not used to solve problems of existing resources. This method also requires certain expenses on your part.

noindex tag

The noindex tag is used to tell bots to exclude a specific page from the index. This way it won't show up in Google search results. This helps reduce the amount of duplicate content that appears in the index and search results.

This won't solve the crawl budget problem because bots will still visit your page. It also doesn't help distribute the value of the links.

The rel=canonical attribute

With this attribute, you tell Google that you have one main preferred page to index and rank, and all other versions of content from that page are just duplicates that don't need to be indexed.

Sofia Ibragimova

Content Marketer

If the same page on your site can be reached from multiple URLs, search robots will treat each URL as a separate page. Bots will decide that the content on your site is not unique, and this will negatively affect rankings and lower your position in search results. To avoid this, indicate the main (canonical) page by inserting a link element with the rel="canonical" attribute, pointing at that page, into the HEAD block.
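As a rough illustration of the canonical approach, the Python sketch below strips facet and sort parameters from a URL and builds the `<link rel="canonical">` element for the HEAD block. The parameter names in `FACET_PARAMS` and the example.com URLs are hypothetical; which parameters count as "duplicate-producing" depends entirely on your site.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical names of parameters that only filter or sort a listing.
FACET_PARAMS = {"color", "size", "sort"}

def canonical_url(url):
    """Return the URL with facet/sort parameters and the fragment removed."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in FACET_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

def canonical_tag(url):
    """Build the <link rel="canonical"> element for the page's HEAD block."""
    return '<link rel="canonical" href="%s">' % escape(canonical_url(url), quote=True)

tag = canonical_tag("https://example.com/shoes?color=red&sort=price")
print(tag)
```

Every faceted variant of the listing then emits the same tag, so all of them point at one preferred page.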

You can use canonical pages to solve the problem of duplicate content, and the link equity will be consolidated on your main page. But there is a chance that bots will still crawl duplicate pages, which is a waste of crawl budget.

Robots.txt

Blocking some pages from crawling can give good results. It is a simple, fast and reliable method. The easiest way is to introduce a custom parameter that marks all the combinations of facets and filters you want to block. Append it to the end of each URL you want to hide and disallow it in robots.txt (http://full page address/robots.txt), or use the robots meta tag in the HEAD section of the page code.
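A quick way to check a blocking rule before deploying it is Python's standard `urllib.robotparser`. The sketch below is an assumption-laden example: it supposes that all filtered listings live under a hypothetical `/catalog/filter/` path prefix (rather than being marked by a trailing parameter), because the standard-library parser matches rules by path prefix.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set: every filtered listing lives under /catalog/filter/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /catalog/filter/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def is_crawlable(url, user_agent="*"):
    """True if the rules above let `user_agent` fetch `url`."""
    return parser.can_fetch(user_agent, url)

print(is_crawlable("https://example.com/catalog/shoes/"))
print(is_crawlable("https://example.com/catalog/filter/color-red/"))
```

The first URL stays crawlable while the faceted one is blocked. Real crawlers may interpret additional syntax that this parser does not, so treat the check as a sanity test only.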

When making changes to the URL, keep in mind that it takes 3-4 weeks for robots to notice and respond to these changes.

There are certain problems here too. The value of links will be limited, and a blocked URL may be indexed due to the presence of external links.

Google Search Console

This is a great way to temporarily fix your problems while you work on building a better and more convenient navigation system. You can use Google Search Console to tell the search engine how to crawl your site.

  • Sign in to your console account and select the “Crawl” section.

  • Click the “URL Parameters” button.

  • Indicate the effect each of your parameters has on the page and how Google should treat those pages.

Remember that this method only hides duplicate content from Google's robots. Pages will still appear in Bing and Yahoo.

How to Improve Faceted Navigation

Let's briefly review all the methods that help you build correct faceted navigation:

  • Use AJAX.
  • Remove or hide links to category or filter pages that have no content.
  • Allow indexing of certain facet combinations that have a high volume of search traffic.
  • Set up a site hierarchy via breadcrumbs in categories and subcategories.
  • Create canonical (main) pages for duplicate content.
  • Consolidate indexing properties from component pages across the entire series using page markup with rel="next" and rel="prev".

Conclusion

Each of the solutions mentioned has its own advantages and disadvantages. There is no universal solution; it all depends on the specifics of your business and the particular case. Optimized faceted navigation will allow your site to target a wider range of keywords. To avoid risk, make sure your navigation not only meets the requirements of search robots, but is also convenient in terms of user experience.

In today’s article I’ll tell you about the Sphinx feature called multi-queries: its built-in optimizations, implementing so-called facet search, and in general how you can sometimes use it to make a search three times faster.

But first, 15 seconds of self-promotion (if you don’t praise yourself, no one will). This year, Sphinx qualified in the SysAdmins and Enterprise categories (they say it only just missed out in the Developers category). Voting will continue for another week (until the 20th). Nothing is needed except a work email address. Thanks in advance to everyone who won't let us go to waste!

And back to development. What are multi-queries anyway, and where does the promised threefold speedup come from?

Multi-queries is a mechanism that allows you to send multiple search requests in one packet.

The API methods that implement the multi-query mechanism are called AddQuery() and RunQueries(). (By the way, the “regular” Query() method uses them internally: it calls AddQuery() once and then immediately calls RunQueries().) The AddQuery() method saves the current state of all query settings set by previous API calls and remembers the query. The settings of an already remembered query no longer change, and subsequent API calls will not touch them, so later queries can use any other settings (a different sorting mode, different filters, etc.). The RunQueries() method actually sends all the stored queries in one batch and returns multiple results. No restrictions are imposed on the queries in a batch. The number of queries is limited, just in case, by the max_batch_queries directive (added in 0.9.10; previously the limit was fixed at 32), but this is generally only a check against broken packets.
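The "snapshot on AddQuery()" behaviour described above can be modelled with a toy Python class. This is not the Sphinx client and does not talk to searchd; the class `ToyBatchClient` and its method names are invented purely to show why settings changed after a query is added do not affect it.

```python
import copy

class ToyBatchClient:
    """Toy stand-in for a search client; it only records queries, it does not search."""

    def __init__(self):
        self.settings = {"sort": "relevance", "filters": {}}
        self.pending = []

    def set_sort(self, mode):
        self.settings["sort"] = mode

    def add_query(self, text, index):
        # Snapshot the settings so later set_* calls cannot alter this query.
        self.pending.append((text, index, copy.deepcopy(self.settings)))
        return len(self.pending) - 1

    def run_queries(self):
        # A real client would send everything in one network round trip here.
        batch, self.pending = self.pending, []
        return batch

client = ToyBatchClient()
client.add_query("the", "lj")
client.set_sort("published desc")
client.add_query("the", "lj")
batch = client.run_queries()
print([settings["sort"] for _, _, settings in batch])
```

The first stored query keeps the relevance sort even though the sort mode was changed before the second query was added.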

Why use multi-queries? Generally speaking, it all comes down to performance. Firstly, by sending queries to searchd in one batch, we always save a little time and resources by sending fewer network packets back and forth. Secondly, and much more importantly, searchd gets the opportunity to perform some optimizations on the entire batch of queries. New optimizations are gradually added over time, so it makes sense to send queries in batches whenever possible: then, when you upgrade Sphinx, new batch optimizations turn on completely automatically. When no batch optimizations can be applied, the queries are simply processed one at a time, without any visible difference for the application.

Why (more precisely, when) NOT to use multi-queries? All queries in a batch should be independent, but sometimes this is not the case, and query B may depend on the results of query A. For example, we may want to show search results from an additional index only when nothing was found in the main index. Or we may simply want to choose a different offset in the 2nd result set depending on the number of matches in the 1st. In such cases, you will have to use separate queries (or separate batches).

There are two important batch optimizations worth knowing about: common query optimization (available since version 0.9.8) and common subtree optimization (available since version 0.9.10, which is in development).

Common query optimization works like this. searchd picks out of the batch all queries that differ only in their sorting and grouping settings, while the full-text part, filters, etc. are the same, and runs the search only once. For example, if a batch contains 3 queries whose text part is “ipod nano”, but the 1st query selects the 10 cheapest results, the 2nd groups the results by store ID and sorts the stores by rating, and the 3rd simply selects the maximum price, then the search for “ipod nano” runs only once, and 3 differently sorted and grouped result sets are built from its results.

So-called facet search is a special case to which this optimization applies. In fact, it can be implemented by running several search queries with different settings: one for the main search results, and several more with the same search query but different grouping settings (top-3 authors, top-5 stores, etc.). When everything except sorting and grouping is the same, the optimization kicks in and the speed increases quite noticeably (example below).
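The principle of "search once, then sort and group several ways" can be shown with a small self-contained Python model. It is only a sketch of the idea, not how searchd is implemented: the in-memory `DOCS` list, the `run_batch` helper and the three post-processing functions are all made up for the example.

```python
from collections import defaultdict

# Tiny in-memory "index": (doc id, text, price, store id); all data is made up.
DOCS = [
    (1, "ipod nano 8gb", 149, "s1"),
    (2, "ipod nano 16gb", 199, "s2"),
    (3, "ipod touch", 299, "s1"),
    (4, "ipod nano case", 19, "s2"),
]
search_calls = 0

def full_text_search(text):
    """Expensive step: return every document containing all query words."""
    global search_calls
    search_calls += 1
    words = text.split()
    return [d for d in DOCS if all(w in d[1].split() for w in words)]

def run_batch(queries):
    """Run (text, post_process) pairs, searching once per distinct text."""
    matches_by_text = {}
    results = []
    for text, post_process in queries:
        if text not in matches_by_text:
            matches_by_text[text] = full_text_search(text)
        results.append(post_process(matches_by_text[text]))
    return results

def cheapest(matches):
    return [d[0] for d in sorted(matches, key=lambda d: d[2])][:10]

def count_by_store(matches):
    counts = defaultdict(int)
    for d in matches:
        counts[d[3]] += 1
    return dict(counts)

def max_price(matches):
    return max(d[2] for d in matches)

results = run_batch([
    ("ipod nano", cheapest),
    ("ipod nano", count_by_store),
    ("ipod nano", max_price),
])
print(results, search_calls)
```

Three result sets come back, but `search_calls` shows that the full-text step ran only once, which is where the time saving comes from.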

Common subtree optimization is an even more interesting thing. It allows searchd to exploit similarities between different queries within a batch. Across all the separate (and different!) full-text queries that arrive in a batch, common parts are identified, and if there are any, the intermediate results of their evaluation are cached and shared between the queries. For example, in this batch of 3 queries

    barack obama president
    barack obama john mccain
    barack obama speech

there is a common part of 2 words (“barack obama”), which can be computed exactly once for all three queries and cached. This is exactly what common subtree optimization does. The maximum cache size for each batch is strictly limited by the subtree_docs_cache and subtree_hits_cache directives, so that if the common part “i am” is found in one hundred million documents, the server will not suddenly run out of memory.

Let's go back to common query optimization. Here's a code example that runs the same query with three different sorting modes:

    <?php
    require("sphinxapi.php");

    $cl = new SphinxClient();
    $cl->SetMatchMode(SPH_MATCH_EXTENDED2);

    // Same query text and index each time; only the sorting mode changes.
    $cl->SetSortMode(SPH_SORT_RELEVANCE);
    $cl->AddQuery("the", "lj");
    $cl->SetSortMode(SPH_SORT_EXTENDED, "published desc");
    $cl->AddQuery("the", "lj");
    $cl->SetSortMode(SPH_SORT_EXTENDED, "published asc");
    $cl->AddQuery("the", "lj");

    // Send all three queries to searchd in a single batch.
    $res = $cl->RunQueries();

How do you know if the optimization worked? If it did, the corresponding log lines will contain a field with a “multiplier” that shows how many queries were processed together:

    0.040 sec x3 the
    0.040 sec x3 the
    0.040 sec x3 the

Pay attention to “x3”: it means that the query was optimized and processed as part of a batch of 3 queries (including this one). For comparison, this is what the log looks like when the same queries are sent one at a time:

    0.059 sec the
    0.091 sec the
    0.092 sec the

It can be seen that the per-query search time in the multi-query case improved by a factor of 1.5 to 2.3, depending on the sorting mode. In fact, this is not the limit. For both optimizations, there are known cases where the speed improved 3 times or more, and not in synthetic tests but in real production. Common query optimization fits vertical product search and online stores quite well, while common subtree caching suits data-mining queries; but, of course, applicability is not strictly limited to these areas. For example, you can run a search with no full-text part at all and build several different reports (with different sorting, grouping, etc.) from the same data in a single batch.

What other optimizations can we expect in the future? That depends on you. So far, the long-term plan includes an obvious optimization for identical queries with different sets of filters. Do you know another common pattern that can be cleverly optimized? Send it in!

NNGroup are world-renowned experts in the field of usability and UX. Every few years, they study search performance on e-commerce sites and share the results on their blog. The last study was conducted in 2017. Especially for you, we read the article describing it, translated it and formulated practical conclusions that will help you improve search on your own website.

Search algorithms

Support advanced search operator "quotes"

NNGroup write that most visitors to online stores do not know how to use advanced search operators. If they want to find a cat toy, they won’t search for “cat AND toy” to see all the products that have both keywords in the description. Therefore, it is not necessary to support such complex search queries.

Quotation marks are the only exception. If you enclose a phrase in quotation marks, the search will be based on a complete match with the phrase. This operator is used in Google search, and is widely known among advanced Internet users.

Automatically sort search results by degree of match with your query

In search results, the products that match all or most of the query keywords should appear first.

Example. In previous studies, users of The Container Store website complained about inaccurate search results on the site. One user wanted to purchase a set of stainless steel storage containers with a clear lid. When he searched for “steel glass container,” he got toilet brushes and glass jars. The user had to reformulate the search query several times, but without success.

The problem with the site's search engine was that the results included all products that matched at least one search word (“steel”, “glass” or “container”) and were not sorted by how well they matched the original query. A product that matched all three keywords could be anywhere in the list, not necessarily at the beginning. The site's search algorithm was subsequently updated: now products that match all or most of the query keywords appear at the beginning of the search results.

Improved search results on containerstore.com: the first result for steel and glass canister matches the user's needs
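The fix described in this example (show products that match the most query words first) can be sketched in Python as follows. The function name `rank_by_match`, the whitespace tokenization and the sample catalog are assumptions for illustration; a production search engine would use a proper relevance model.

```python
def rank_by_match(query, products):
    """Order products by how many distinct query words their title contains."""
    words = set(query.lower().split())

    def matched(title):
        return len(words & set(title.lower().split()))

    hits = [p for p in products if matched(p) > 0]
    # sorted() is stable, so ties keep their original catalog order.
    return sorted(hits, key=matched, reverse=True)

# Hypothetical catalog titles.
catalog = [
    "Toilet brush with steel handle",
    "Glass jar",
    "Steel container with glass lid",
    "Plastic container",
]
ranked = rank_by_match("steel glass container", catalog)
print(ranked[0])
```

Products that match only one word are still returned, but the one matching all three words moves to the top of the list.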

When sorting results by product rating, consider its weighted value, not the average.

When users sort products by average customer rating, they don't want to see products with only one rating, even if it's 5 stars. People don’t want to stumble upon a paid review, and an average rating based on a couple of reviews makes them suspicious. When sorted by weighted rating, a product with an average rating of 4.9 out of 5 and 342 reviews will be ranked higher than a product with an average rating of 5 out of 5 and 3 reviews. This way the user gets an objective idea of the popularity and quality of the product.
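The article does not say which weighting formula to use. One common choice is a Bayesian average, sketched below in Python; the prior mean of 4.0 and prior weight of 10 reviews are arbitrary assumptions that you would tune for your own catalog.

```python
def weighted_rating(avg, n_reviews, prior_mean=4.0, prior_weight=10):
    """Bayesian average: pull the rating toward prior_mean when reviews are few."""
    return (prior_weight * prior_mean + n_reviews * avg) / (prior_weight + n_reviews)

# The two products from the example above.
products = [
    {"name": "A", "avg": 5.0, "reviews": 3},
    {"name": "B", "avg": 4.9, "reviews": 342},
]
by_weighted = sorted(
    products,
    key=lambda p: weighted_rating(p["avg"], p["reviews"]),
    reverse=True,
)
print([p["name"] for p in by_weighted])
```

With these assumed parameters, product B (4.9 from 342 reviews) scores about 4.87 and ranks above product A (5.0 from 3 reviews), which scores about 4.23.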

Design and position of the search bar

Display the search bar in one block with the navigation menu in the site header

This arrangement of the search bar is found on many sites, so users already know where to look for it. In addition, displaying the search bar in one block with the navigation menu solves many problems, for example, its absence on some pages of the site and the need to repeat it on the search results page.

On the Wildberries website, a large and clearly visible search bar is located right in the site header

Display a search bar and a magnifying glass icon on the screen

When visitors to online shopping sites want to use search, they look for a wide empty field or a magnifying glass icon. At the same time, it is no longer necessary to explicitly label the search bar “Search,” although it won’t hurt.

Many sites in versions for smartphones successfully use the magnifying glass icon and do not show the line itself, which allows them to save space on the screen. But if your site's sales depend on search, it's better to display the search bar right away, even on small screens. This is especially true for PC versions of websites, where there is more than enough screen space. Use an empty field with a “Find” button or a magnifying glass icon. The field should be visible on every page.

Narrow your search results

Don't use advanced search or category search unless you're Amazon

In the past, many online stores used advanced search and category search features to help users narrow down the number of items in their search results. However, people don't really use advanced search and often get confused when searching by category, so these features have gradually fallen out of fashion.

Such advanced search methods are now only available on sites where they are truly useful. These are either sites with special search scenarios, like eBay, or online stores with a huge number of products, such as Amazon and Wal-Mart.

In other cases, it is better to use faceted search. Its key difference from category search is that users narrow the selection of products AFTER they have received the results for their search query, not BEFORE.

Use faceted search

Faceted search allows users to narrow down their search results using filters based on the attributes of the products users are viewing. If previously faceted search was a nice addition to an online store, now users are so accustomed to it that they look for it on the site and express dissatisfaction if it is not there. Nowadays, e-commerce sites without faceted search are the exception rather than the rule.

Faceted search on the website of the Utkonos online store: filters on the left allow you to narrow down the results

Autocomplete in the search bar

Support autocomplete feature

The autocomplete feature is that as the user types a word in the search bar, he sees recommended queries in a drop-down list. If a request from the list suits the user, then this saves him time and also helps to avoid typos and other errors.

The autocomplete feature was present on most of the sites NNGroup studied. At the same time, the study showed that users chose options from the list of proposed ones not so often - in only 23% of cases. Typically, they would just continue typing their query.

However, autocompletion is useful. Even if users don't select an option from the list, they can see and understand what products are available on the site and what other shoppers are looking for.

Support advanced autocompletion

Autocomplete that shows recommended products, photos and other content in addition to the list of queries is a trend that is gaining popularity on some e-commerce sites. It appeared about five years ago but quickly disappeared, and now it has returned in a new form reminiscent of a megamenu: a drop-down field with recommended query options that takes up quite a significant amount of space on the screen.

Search with advanced autocompletion in the Labyrinth online store

NNGroup's research has shown that this feature works best on sites with a variety of product categories or products that are visually very different from each other.

Basic search problems

Key problems that make searching on the site difficult:

  • an inconspicuous search function: for example, hidden behind a small magnifying glass icon on a big screen, or in the hamburger menu of the mobile version of the website;
  • an insufficiently “smart” search string that is unable to process typos, errors or synonyms for query keywords;
  • non-standard display of results (switching between pages, sorting, filtering);
  • poorly thought out filters (irrelevant attributes, poor functionality, empty results).

How USABILITYLAB can help improve search on your online store website

This concludes our analysis of the NNGroup article. We hope it was useful to you.

If you would like to evaluate or improve search on your site, please contact us. We will carry out usability testing and involve representatives of your target audience. They will work on your website under the supervision of our expert. Our laboratory is equipped with a one-way mirror, so you can also be present during testing and see everything that respondents do. Based on the testing results, we will draw conclusions about how well the search on your site meets the needs of users and formulate recommendations for its improvement, which you can pass on to your developers.

To learn more about our services, leave a request on our website or write to Dmitry Silaev:

This module provides so-called faceted search (faceted navigation) on the site. The idea is that search results can be refined using various characteristics of the content: author, type, term, date of creation, etc.

For example, suppose you have an online store selling electronics, and the user enters the phrase "audio player" into the search. On the results page, in addition to the results themselves, there will be facets:

- Section: audio equipment (54), computer technology (85)
- Brand: Apple (25), Samsung (68), iRiver (78)
- Availability in stock: yes (456), no (12)
- Price: 100-1000$ (45), 1000-10000$ (12)

etc. The number of products (nodes) that meet these characteristics will be indicated in brackets. By clicking on the links, the user will narrow the search results.
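The counts in brackets and the narrowing behaviour can be illustrated with a small Python sketch. The result records and the helper names `facet_counts` and `refine` are made up; a real module computes these counts inside the search backend rather than in application code.

```python
from collections import Counter

# Hypothetical search results for one query.
results = [
    {"title": "Player A", "brand": "Apple", "in_stock": True},
    {"title": "Player B", "brand": "Samsung", "in_stock": True},
    {"title": "Player C", "brand": "Apple", "in_stock": False},
]

def facet_counts(items, field):
    """Count how many results fall under each value of the given field."""
    return Counter(item[field] for item in items)

def refine(items, field, value):
    """Narrow the results to those with the clicked facet value."""
    return [item for item in items if item[field] == value]

brands = facet_counts(results, "brand")
apple_only = refine(results, "brand", "Apple")
print(dict(brands), [item["title"] for item in apple_only])
```

Clicking a facet link corresponds to calling `refine`, after which the counts for the remaining facets would be recomputed on the narrowed result set.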


On the one hand, this is an alternative to exposed filters in Views; on the other, an alternative to the standard advanced search.

Installation

Facets section

In this section, you can specify which facets to use when searching. For example, you can allow content to be selected by taxonomy, date added, or author. The number of available facets depends on the enabled modules.

Results page section

Display style: the search results display style. Extracts displays results as in a normal search (highlighted text, author, date); Teasers displays teasers of the content using the appropriate node.tpl.php.

Use the Extracts display style selectively: if this option is checked, the Extracts style will always be applied when a keyword is entered. If you do not check this option, you can use the module as a replacement for navigating taxonomy terms.

Current search section

Allows you to turn on the Current search block, which displays the search terms: