Configuring spelling

For website collections, spelling is configured automatically and requires no additional setup. Learn more about how spelling correction works in the user guide.

For e-commerce and app collections, an initial spelling configuration is created during the console onboarding process. You can further customize your spelling model through your record pipeline.

Training the spelling system

Every time a record is created in the index, your spelling model can be trained from specified fields in that record.

Start by adding the train-spelling step to your record pipeline. The example below is the default configuration for a website collection.

- id: train-spelling
  title: Train spelling using words from fields in the record
  params:
    fields:
      constant: description:description,_body:_body
    phraseFields:
      constant: title:title,keywords:keywords,dir1:dir1,dir2:dir2

Next, tell this step which fields to train the spelling model on. Each field must be given a label, in the format field:label. In the example above, each field is assigned a label with the same name as the field, e.g. description:description and title:title.

Use the fields:constant parameter to define which fields are used to train the spelling model on individual words. In the example above, the description field is ideal because it contains many individual words that reflect what a user may type in a search query.

Use the phraseFields:constant parameter to define which fields in your data consist of phrases to train the spelling model on. The keywords field, commonly used by websites and e-commerce stores, can be ideal because it consists of short phrases such as 'king duvet' and 'winter bedding'. Category fields are also good candidates, with phrases like 'outdoor furniture' and 'children's bedding'.

A field like brand could be a good candidate for both fields:constant and phraseFields:constant, as some brands are single words while others are multiple words. In this scenario, a different label must be used for each parameter, as in the example below:

- id: train-spelling
  title: Train spelling using words from fields in the record
  params:
    fields:
      constant: brand:brand
    phraseFields:
      constant: brand:brandPhrase
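
The label rule above can be checked before a pipeline is saved. The following is a minimal sketch in Python and is not part of the product: parse_field_labels and shared_labels are hypothetical helper names, and the only assumption about the format is the comma-separated field:label syntax shown in the examples. It flags any label that is used by both parameters.

```python
def parse_field_labels(constant: str) -> dict[str, str]:
    """Parse a 'field:label,field:label' string into {field: label}."""
    mapping = {}
    for pair in constant.split(","):
        pair = pair.strip()
        if not pair:
            continue
        field, sep, label = pair.partition(":")
        if not sep or not field or not label:
            raise ValueError(f"expected field:label, got {pair!r}")
        mapping[field] = label
    return mapping


def shared_labels(fields: str, phrase_fields: str) -> set[str]:
    """Return labels used by both the word fields and the phrase fields."""
    word_labels = set(parse_field_labels(fields).values())
    phrase_labels = set(parse_field_labels(phrase_fields).values())
    return word_labels & phrase_labels


# The brand example above uses distinct labels, so nothing is shared.
print(shared_labels("brand:brand", "brand:brandPhrase"))  # set()
# Reusing the same label for both parameters is flagged.
print(shared_labels("brand:brand", "brand:brand"))  # {'brand'}
```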

See our Phrase training guide for more information.

Within your model, every word and phrase has a count representing the number of times it was found during the training process. For a word or phrase to be considered 'spelt correctly' and used as a candidate for spelling correction, it must have a count of 5 or greater.
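
To make the threshold concrete, the sketch below counts words across a few made-up record descriptions and keeps only those seen at least 5 times. It illustrates the counting rule only: the tokenization (lowercase, split on whitespace) and the names MIN_COUNT and trusted_words are assumptions, and the real training process may differ.

```python
from collections import Counter

MIN_COUNT = 5  # threshold stated above: a count of 5 or greater


def trusted_words(texts, min_count=MIN_COUNT):
    """Count words across texts and keep those meeting the threshold."""
    counts = Counter()
    for text in texts:
        # Assumed tokenization: lowercase and split on whitespace.
        counts.update(text.lower().split())
    return {word: n for word, n in counts.items() if n >= min_count}


# Made-up sample: 'duvet' occurs 5 times, 'king' only 3 times.
records = ["king duvet", "duvet cover", "king duvet set", "winter duvet", "king duvet"]
print(trusted_words(records))  # {'duvet': 5}
```

In this sample, 'king' would not yet be a candidate for correction, which is why fields with plenty of repeated vocabulary make better training sources.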

Language processing

The default language used to train spelling is English. The training language determines how text is processed as your spelling model is built. For example, Japanese characters require different processing logic from Latin-script languages like English, French, and Italian. You can change the language by including the input:lang parameter.

You may not want any language processing to be performed on some fields. Fields such as model numbers and part numbers are usually best left unprocessed. The following example sets the processing language to 'zxx' (no language processing) for the specified fields by giving the lang parameter a default value of zxx.

- id: train-spelling
  title: set lang to zxx (no language) to train spelling for model number etc. 
  params:
    lang:
      bind: nolang
      defaultValue: zxx
    fields:
      constant: modelNumber:modelNumber

Using multiple languages and spelling models

You can train spelling for multiple languages and have a separate spelling model for each language. The following example trains an English model alongside a 'zxx' model for model numbers.

- id: train-spelling
  title: train spelling with english (default)
  params:  
    fields:
      constant: name:name,brand:brand,categories:categories,description:description
- id: train-spelling
  title: set lang to zxx (no language) to train spelling for model number etc. 
  params:
    lang:
      bind: nolang
      defaultValue: zxx
    fields:
      constant: modelNumber:modelNumber

Enabling spell correction

Now that your spelling model has been trained, you can enable it by adding the index-spelling step to your query pipeline. The following example shows how to pass the user's query q for spelling correction using a default configuration:

- id: index-spelling
  params:
    text:
      bind: q

It is highly recommended to set phraseLabelWeights:constant to tell the spelling system which labels correspond to phrases. This increases the accuracy of spelling correction for multi-word queries.

Weights influence the score assigned to a phrase suggestion, and different labels can have different weights. The example below gives the user query training twice the weight of the brand and category record fields.

- id: index-spelling
  params:
    text:
      bind: q
    phraseLabelWeights:
      constant: query:1.0,brand:0.5,category:0.5

A weight can also be applied to the overall spelling model. The default and maximum weight is 1. The example below halves that by assigning 0.5. A lower weight can be beneficial if you find that the suggestions Spelling submits have too much impact on the search results.

Use the Relevance screen to run test queries and find the ideal weight for your spelling model.

- id: index-spelling
  params:
    text:
      bind: q
    phraseLabelWeights:
      constant: query:1.0,brand:0.5,category:0.5
    weight:
      constant: "0.5"

See spelling suggestions

Now that you've configured your spelling model, you can inspect the suggestions it produces.

In the Relevance screen's Advanced mode, add export-phrases as a post step:

postSteps:
- id: export-phrases

Underneath the search bar, click the Raw tab and then submit a query in the search bar. The example below simulates a user typing the misspelling 'winter titghts'.

We know that Spelling has made a phrase suggestion because phraseSuggestions is populated. In the phrases parameter, we can see that the user's original query and Spelling's phrase suggestion are both submitted as the final search query.

"phraseSuggestions": "winter tights",
"phrases": "winter titghts:1,winter tights:0.55",

The absence of the phraseSuggestions parameter tells us that Spelling has made no phrase suggestions. In that case, the phrases parameter contains the individual word suggestions made by Spelling. Below is the output for the user query 'long winter titghts', where no phrase suggestions are made.

 "phrases": "long winter titghts:1,long winter tights:0.55,long winter tight:0.46508819767420756,long water titghts:0.41181921888021766,long winner titghts:0.40503806100980266,long winter lights:0.4046347594911588"
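
When comparing suggestions across several test queries, it can help to turn the phrases string into structured data. The following is a minimal sketch in Python, assuming the value is always a comma-separated list of phrase:weight pairs as in the outputs above and that phrases contain no commas. parse_phrases is a hypothetical helper, not part of the product.

```python
def parse_phrases(value: str) -> list[tuple[str, float]]:
    """Split a 'phrase:weight,phrase:weight' string into (phrase, weight) pairs."""
    pairs = []
    for item in value.split(","):
        # Split on the last colon so the phrase itself is kept whole.
        phrase, _, weight = item.rpartition(":")
        pairs.append((phrase.strip(), float(weight)))
    # Highest-weighted phrase first.
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


for phrase, weight in parse_phrases("winter titghts:1,winter tights:0.55"):
    print(f"{weight:.2f}  {phrase}")
```

An item without a colon raises ValueError here, which is a deliberate choice so that unexpected output is noticed instead of being silently skipped.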
