By default, AddSearch excludes certain elements from the index to make search results more relevant. For instance, it excludes sidebars, headers, and footers, which rarely contain relevant content. The content of these elements is also often identical on every page, so indexing it would make search results less relevant.
Text extraction rules let you include elements in the index that AddSearch excludes by default. This is useful if, for instance, a sidebar or other excluded element contains crucial information related to the rest of the page content.
You can also use text extraction rules to exclude elements that you don’t want in the index. For instance, your pages may have a cookie consent view or a newsletter popup that appears after the page has loaded. Besides adding unwanted content to the index, such elements may also appear in the screen captures displayed in the Widget view.
Text extraction rules are configured in the AddSearch dashboard and require no changes to your pages’ source code. Alternatively, you can include or exclude specific elements by adding special attributes to your HTML template.
You can include or exclude elements in three steps: identify the CSS selectors of the elements, set the text extraction rules, and recrawl the affected pages.
For detailed instructions on identifying CSS selectors from web pages, see our documentation.
Once you’ve identified the relevant CSS selectors, set text extraction rules using the following instructions:
After making changes, a recrawl is required to update the index. You can either recrawl a single page or all the pages from your website.
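To make the effect of an exclusion rule more concrete, here is a minimal Python sketch of what excluding elements by CSS selector means for the extracted text. It is an illustration only, not AddSearch's crawler: the `extract_text` function, the `TextExtractor` class, the sample page, and the selectors `#cookie-consent` and `.sidebar` are all hypothetical, and the matcher only understands the simplest selector forms (`tag`, `#id`, `.class`).

```python
from html.parser import HTMLParser

# Void elements have no closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


def matches(selector, tag, attrs):
    """Match a simple selector: 'tag', '#id', or '.class' (nothing more complex)."""
    if selector.startswith("#"):
        return attrs.get("id") == selector[1:]
    if selector.startswith("."):
        return selector[1:] in (attrs.get("class") or "").split()
    return tag == selector


class TextExtractor(HTMLParser):
    """Collects page text, skipping any element matched by an exclude selector."""

    def __init__(self, exclude_selectors):
        super().__init__()
        self.exclude_selectors = exclude_selectors
        self.skip_depth = 0  # > 0 while inside an excluded element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        if self.skip_depth:
            # Already inside an excluded element: track nesting only.
            self.skip_depth += 1
        elif any(matches(s, tag, attrs) for s in self.exclude_selectors):
            self.skip_depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.chunks.append(data.strip())


def extract_text(html, exclude_selectors):
    """Return the visible text of `html` without the excluded elements."""
    parser = TextExtractor(exclude_selectors)
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page with a cookie consent view and a sidebar.
page = """
<main><h1>Pricing</h1><p>Plans start at 10 users.</p></main>
<div id="cookie-consent"><p>We use cookies.</p><button>Accept</button></div>
<aside class="sidebar promo">Subscribe to our newsletter</aside>
"""

print(extract_text(page, ["#cookie-consent", ".sidebar"]))
# prints: Pricing Plans start at 10 users.
```

With an empty selector list, the cookie and newsletter text would end up in the extracted output as well, which is the kind of noise an exclusion rule is meant to keep out of the index.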