Configuring spelling
For website collections, spelling is configured automatically and requires no additional setup. Learn more about how spelling correction works in the user guide.
For e-commerce and app collections, an initial spelling configuration is created during the console onboarding process. You can further customize your spelling model through your record pipeline.
Training the spelling system
Every time a record is created in the index, your spelling model can be trained from specified fields in that record.
Start by adding the train-spelling step to your record pipeline. The example below is the default configuration for a website collection.
Next, assign to this step the names of the fields you want to train the spelling model on. Each field must be given a label, in the format field:label.
In the example above, each field name is assigned a label of the same name, i.e. description:description, title:title.
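To make the field:label format concrete, here is a minimal Python sketch that parses such a value into a mapping. It is illustrative only and is not part of the pipeline itself: the function name parse_field_labels is hypothetical, and the comma-separated list format is an assumption based on the example values above.

```python
def parse_field_labels(value: str) -> dict[str, str]:
    """Parse 'field:label, field:label' into a {field: label} mapping.

    Assumption: pairs are comma-separated, as the example values suggest.
    """
    mapping = {}
    for pair in value.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing comma
        field, sep, label = pair.partition(":")
        field, label = field.strip(), label.strip()
        if not sep or not field or not label:
            raise ValueError(f"expected field:label, got {pair!r}")
        mapping[field] = label
    return mapping


# Each field is given a label of the same name, as in the text above.
fields = parse_field_labels("description:description, title:title")
print(fields)
```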
Use the fields:constant parameter to define which fields are used to train the spelling model on individual words. In the example above, the description field is ideal because it contains many individual words that reflect what a user may type in a search query.
Use the phraseFields:constant parameter to define which fields in your data consist of phrases to train the spelling model on. The keywords field, commonly used by websites and e-commerce stores, can be ideal because it consists of short phrases, e.g. 'king duvet' and 'winter bedding'. Category fields are also good candidates, with phrases like 'outdoor furniture' and 'children's bedding'.
A field like brand could be a good candidate for both fields:constant and phraseFields:constant, as some brands are single words whilst others are multiple words. In this scenario, a different label must be used, as in the example below:
See our Phrase training guide for more information.
Within your model, every word and phrase has a count representing the number of times it was found during training. For a word or phrase to be considered 'spelt correctly' and used as a candidate for spelling correction, it must have a count of 5 or greater.
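The count threshold can be pictured with a small Python sketch. This is not how the spelling system is implemented; it only illustrates the rule that a term needs a count of 5 or more to be treated as correctly spelt. The names MIN_COUNT and known_terms are hypothetical, and the lowercasing and whitespace tokenization are simplifying assumptions (real processing depends on the training language).

```python
from collections import Counter

MIN_COUNT = 5  # threshold stated in the text above


def known_terms(training_values, min_count=MIN_COUNT):
    """Return the words seen at least min_count times across all values."""
    counts = Counter()
    for value in training_values:
        # Simplifying assumption: lowercase and split on whitespace.
        counts.update(value.lower().split())
    return {term for term, n in counts.items() if n >= min_count}


# 'winter' is seen 6 times, 'tights' 5 times, and 'titghts' only once.
values = ["winter tights"] * 5 + ["winter titghts"]
terms = known_terms(values)
print(sorted(terms))
```

In this sketch the one-off misspelling never reaches the threshold, so it cannot become a correction candidate, while words that recur across records do.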
Language processing
The default language used to train spelling is English. The training language determines how text is processed as your spelling model is built. For example, Japanese characters require different processing logic from Roman-based languages like English, French, and Italian. You can change the language by including the input:lang parameter.
You may not want language processing to be performed on some fields. Fields such as model numbers and part numbers are usually best left unprocessed. The following example shows how you can use the set-param-values step to set the processing language to 'zxx' (no language processing) for specified fields.
Using multiple languages and spelling models
You can train spelling for multiple languages and have a separate spelling model for each language. The following example shows how you can train spelling for English and, for model numbers, 'zxx' together.
Enabling spell correction
Now that your spelling model has been trained, you can enable it by adding the index-spelling step to your query pipeline. The following example shows how to pass the user's query q for spelling correction using a default configuration:
It is highly recommended to set the phraseLabelWeights:constant parameter to tell the spelling system which labels correspond to phrases. This increases the accuracy of spelling correction for multi-word queries.
Weights influence the score assigned to a phrase suggestion, and different labels can have different weights. The example below configures the weight of user query training to be twice that of the brand and category record fields.
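As a rough intuition for label weights, the Python sketch below scales the evidence from each label by its weight. The scoring formula, the label name query, and the specific weight values are all assumptions made for illustration; the actual scoring used by the spelling system is not described here.

```python
# Hypothetical weights: query training counts twice as much as
# the brand and category record fields.
LABEL_WEIGHTS = {"query": 1.0, "brand": 0.5, "category": 0.5}


def weighted_score(label_counts, weights=LABEL_WEIGHTS):
    """Sum each label's count scaled by its weight (assumed formula)."""
    return sum(
        weights.get(label, 0.0) * count
        for label, count in label_counts.items()
    )


# The same phrase seen 10 times under different labels.
from_queries = weighted_score({"query": 10})
from_records = weighted_score({"brand": 10})
print(from_queries, from_records)
```

Under these assumptions, a phrase learned from user queries scores twice as high as the same phrase learned equally often from the brand field.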
A weight can also be applied to the overall spelling model. The default and maximum weight is 1. The example below halves that by assigning 0.5. Assigning a lower weight can be beneficial if you find that the suggestions Spelling submits have too much impact on the search results.
You can use the Relevance screen to run test queries and find the ideal weight for your spelling model.
See spelling suggestions
Now that you've configured your spelling model, you'll want to see what suggestions it produces!
In the Relevance screen's Advanced mode, add export-phrases as a post step.
Underneath the search bar, click the Raw tab and then submit a query in the search bar. In the example below, we simulate a user misspelling 'winter tights' as 'winter titghts'.
We know that Spelling has made a phrase suggestion because phraseSuggestions is populated. In the phrases parameter, we can see that the user's original query and Spelling's phrase suggestion are both submitted as the final search query.
The absence of the phraseSuggestions parameter tells us that Spelling has made no phrase suggestions. The phrases parameter will contain individual word suggestions made by Spelling. Below is the output for a user query of 'long winter titghts', where no phrase suggestions are made.