Filter expressions

Filter expressions are a core concept in Sajari. They are used in search queries to narrow the results, and they can also boost specific results that match a filter expression or apply certain boosts only when a filter condition matches.

Structure of a filter expression

The most basic filter expression consists of 3 parts.

  • Field/Parameter

  • Operator

  • Value

Example

The expression q ~ 'star' matches if:

  • the search query (parameter = q)

  • contains (operator = ~)

  • the word star (value = star)
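To make the three parts concrete, the following Python sketch splits a basic expression into its field or parameter, operator, and value. The function name split_expression is hypothetical and is not part of any Sajari SDK. The sketch assumes a single expression whose value is enclosed in single quotation marks, and it uses the operator symbols listed on this page.

```python
import re
from typing import Tuple

# Operator symbols from this page, longest first so ">=" is tried before ">".
OPERATORS = ["!=", "!~", ">=", "<=", "=", ">", "<", "^", "$", "~"]
_OP_PATTERN = "|".join(re.escape(op) for op in OPERATORS)

# field, optional spaces, operator, optional spaces, 'value'
_EXPRESSION = re.compile(
    r"^\s*(?P<field>\w+)\s*(?P<op>" + _OP_PATTERN + r")\s*'(?P<value>[^']*)'\s*$"
)


def split_expression(expression: str) -> Tuple[str, str, str]:
    """Split a basic filter expression into (field, operator, value).

    Hypothetical helper for illustration only: it does not handle AND/OR,
    brackets, array values, or filter functions.
    """
    match = _EXPRESSION.match(expression)
    if match is None:
        raise ValueError(f"not a basic filter expression: {expression!r}")
    return match.group("field"), match.group("op"), match.group("value")


parts = split_expression("q ~ 'star'")
```

Here parts holds the parameter q, the operator ~, and the value star.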

Fields and parameters

Fields refer to the schema fields available on the record, for example title, description, brand, category, or price.

Parameters refer to the values passed with the search query. This could be the query itself (q), applied filters (like brand or price), or personalisation information such as location, gender, or membership.

Whether fields, parameters, or both are available in a filter expression depends on the context. Conditions, which specify whether a certain boost is applied, can only act on query parameters. Boost filters, on the other hand, can refer to both fields on the record and parameters passed into the query.

Operators

Filter expressions can use a range of operators. The operators available depend on the type of the field or parameter. For example, Contains (~) can operate on text fields, whereas Greater Than (>) can only operate on numeric fields.

When using advanced editing, all values must be enclosed in single quotation marks. For example, "field boost must be greater than 10" is written as boost>'10'.

Below is a full list of available operators.

Operator                        Description                                             Example
Equal To (=)                    Field is equal to a value (numeric or string)           dir1='blog'
Not Equal To (!=)               Field is not equal to a value (numeric or string)       dir1!='blog'
Greater Than (>)                Field is greater than a numeric value                   boost>'10'
Greater Than Or Equal To (>=)   Field is greater than or equal to a numeric value       boost>='10'
Less Than (<)                   Field is less than a given numeric value                boost<'50'
Less Than Or Equal To (<=)      Field is less than or equal to a given numeric value    boost<='50'
Begins With (^)                 Field begins with a string                              dir1^'bl'
Ends With ($)                   Field ends with a string                                dir1$'og'
Contains (~)                    Field contains a string                                 dir1~'blog'
Does Not Contain (!~)           Field does not contain a string                         dir1!~'blog'
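As a rough illustration of what each operator checks, the Python sketch below evaluates one operator against one field value locally. It is not the search engine's implementation. The function name matches is hypothetical, and the choices to compare equality as strings and to treat string matching as case-sensitive are assumptions made for this example.

```python
from typing import Callable, Dict, Union

Scalar = Union[str, int, float]

# Each entry maps an operator symbol to a check of (field value, quoted value).
# Numeric operators convert both sides to float; string operators use str().
OPERATORS: Dict[str, Callable[[Scalar, str], bool]] = {
    "=": lambda field, value: str(field) == value,      # assumption: string equality
    "!=": lambda field, value: str(field) != value,
    ">": lambda field, value: float(field) > float(value),
    ">=": lambda field, value: float(field) >= float(value),
    "<": lambda field, value: float(field) < float(value),
    "<=": lambda field, value: float(field) <= float(value),
    "^": lambda field, value: str(field).startswith(value),
    "$": lambda field, value: str(field).endswith(value),
    "~": lambda field, value: value in str(field),
    "!~": lambda field, value: value not in str(field),
}


def matches(field_value: Scalar, operator: str, value: str) -> bool:
    """Return True if field_value satisfies `operator value` (illustrative only)."""
    return OPERATORS[operator](field_value, value)


# boost<'50' excludes a boost of exactly 50, while boost<='50' includes it.
strictly_less = matches(50, "<", "50")
less_or_equal = matches(50, "<=", "50")
```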

Filtering arrays

The Contains (~) and Does Not Contain (!~) operators can be used to filter values in an array. The following example shows a filter that returns all records with the value red stored in a color array field.

Field   Array   Values             Example
color   Yes     red, blue, white   color ~ ['red']
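The effect of color ~ ['red'] can be pictured with a small Python sketch that keeps only the records whose color array holds the value red. The sample records and the helper array_contains are made up for illustration. Treating Contains on an array as an exact element match is an assumption of this sketch.

```python
from typing import Dict, List

# Hypothetical records; only the color array field matters here.
records: List[Dict] = [
    {"id": 1, "color": ["red", "blue", "white"]},
    {"id": 2, "color": ["green", "black"]},
    {"id": 3, "color": ["red"]},
]


def array_contains(values: List[str], wanted: str) -> bool:
    """Assumed semantics: True if any array element equals `wanted`."""
    return wanted in values


# Equivalent in spirit to: color ~ ['red']
red_records = [r for r in records if array_contains(r["color"], "red")]

# Equivalent in spirit to: color !~ ['red']
non_red_records = [r for r in records if not array_contains(r["color"], "red")]
```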

Combining expressions

It's also possible to build more complex filters by combining filter expressions with the AND and OR operators and brackets.

Operator   Description                   Example
AND        Both expressions must match   dir1='blog' AND domain='www.search.io'
OR         One expression must match     dir1='blog' OR domain='blog.search.io'

For example, to match pages with language set to en on www.search.io or any page within the en.search.io domain:

(domain='www.search.io' AND lang='en') OR domain='en.search.io'
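When filters are assembled in application code, small string helpers can keep the brackets and quoting consistent. The Python sketch below rebuilds the example above. The helper names (condition, all_of, any_of, group) are hypothetical, and because this page does not describe how to escape a single quotation mark inside a value, the sketch rejects such values instead of guessing.

```python
def condition(field: str, operator: str, value: str) -> str:
    """Build a basic expression such as domain='www.search.io'."""
    if "'" in value:
        # Escaping rules are not covered on this page, so refuse rather than guess.
        raise ValueError("values containing a single quotation mark are not supported")
    return f"{field}{operator}'{value}'"


def all_of(*expressions: str) -> str:
    """Join expressions with AND (every expression must match)."""
    return " AND ".join(expressions)


def any_of(*expressions: str) -> str:
    """Join expressions with OR."""
    return " OR ".join(expressions)


def group(expression: str) -> str:
    """Wrap an expression in brackets so it is evaluated as one unit."""
    return f"({expression})"


combined = any_of(
    group(all_of(condition("domain", "=", "www.search.io"),
                 condition("lang", "=", "en"))),
    condition("domain", "=", "en.search.io"),
)
```

The resulting string in combined is identical to the expression shown above.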

Filter functions

Some filters are difficult to express in boolean logic. For these cases, filter functions are available. They can also be used as part of larger boolean expressions.

Checking for existence or non-existence

IS_NULL(field)

Returns TRUE if field is NULL

IS_NOT_NULL(field)

Returns TRUE if field is NOT NULL

Filtering with multiple values

The IN function is shorthand for multiple OR conditions, and NOT IN is its negation.

field IN ('value1', 'value2', 'value3')

Returns TRUE if the field value on a record is equal to value1, value2, value3, or any additional value defined within the list.

field NOT IN ('value1', 'value2', 'value3')

Returns TRUE if the field value on a record is not equal to value1, value2, value3, or any additional value defined within the list.

The IN and NOT IN functions work only on single-value fields.
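To show what the shorthand stands for, this Python sketch expands an IN list into the equivalent chain of basic expressions. IN becomes Equal To conditions joined with OR and, by De Morgan's law, NOT IN becomes Not Equal To conditions joined with AND. The function expand_in is a hypothetical helper for illustration, not something the product provides.

```python
from typing import Sequence


def expand_in(field: str, values: Sequence[str], negate: bool = False) -> str:
    """Expand `field IN (...)` or `field NOT IN (...)` into basic expressions."""
    if not values:
        raise ValueError("at least one value is required")
    if negate:
        # NOT IN: the field must differ from every listed value.
        operator, joiner = "!=", " AND "
    else:
        # IN: the field must equal at least one listed value.
        operator, joiner = "=", " OR "
    return joiner.join(f"{field}{operator}'{value}'" for value in values)


in_expanded = expand_in("brand", ["value1", "value2", "value3"])
not_in_expanded = expand_in("brand", ["value1", "value2", "value3"], negate=True)
```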

Filtering based on geo distance

GEO_INSIDE(latitude, longitude, lat_var, lng_var, radius)

Returns TRUE if the input geopoint (lat_var, lng_var) is within radius kilometres, measured as haversine distance, of the (latitude, longitude) geopoint on the record.

Note: there is also a geo_boost step that can be used to boost results based on their geo distance as opposed to filtering as per above.
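For readers unfamiliar with the haversine distance used by GEO_INSIDE, here is a minimal Python sketch of the calculation. It is not the engine's code. The Earth radius constant and the choice to treat a point exactly on the radius as inside are assumptions, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; the value the engine uses is an assumption


def haversine_km(lat1: float, lng1: float, lat2: float, lng2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    delta_phi = math.radians(lat2 - lat1)
    delta_lambda = math.radians(lng2 - lng1)
    a = (math.sin(delta_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(delta_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def geo_inside(record_lat: float, record_lng: float,
               lat_var: float, lng_var: float, radius_km: float) -> bool:
    """Illustrative stand-in for GEO_INSIDE (boundary treated as inside)."""
    return haversine_km(record_lat, record_lng, lat_var, lng_var) <= radius_km


# A record located in Sydney, queried from Melbourne.
sydney = (-33.8688, 151.2093)
melbourne = (-37.8136, 144.9631)
within_800_km = geo_inside(*sydney, *melbourne, 800)
within_500_km = geo_inside(*sydney, *melbourne, 500)
```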

Time based filtering

SINCE_NOW(field, duration)

Returns TRUE if the timestamp field is equal to or greater than the current time (UTC) minus the defined duration.

A duration string is a signed sequence of decimal numbers, each with an optional fraction and a unit suffix, such as "300ms", "-1.5h" or "2h45m". Valid time units are "ns" (nanoseconds), "us" or "µs" (microseconds), "ms" (milliseconds), "s" (seconds), "m" (minutes), "h" (hours).

For example, to keep only records whose timestamp field falls within the last 24 hours, use SINCE_NOW(field, '24h'). This is useful for filtering records by recency, such as by published date.
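The duration format and the SINCE_NOW comparison can be sketched in Python as follows. This is only an approximation for illustration. The function names parse_duration and since_now are hypothetical, and Python's timedelta has microsecond resolution, so nanosecond values are rounded.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

# Seconds per unit, using the unit suffixes listed above.
_UNIT_SECONDS = {"ns": 1e-9, "us": 1e-6, "µs": 1e-6, "ms": 1e-3,
                 "s": 1.0, "m": 60.0, "h": 3600.0}
# One "number + unit" token, for example "2h", "45m" or "1.5h".
_TOKEN = re.compile(r"(\d+(?:\.\d*)?|\.\d+)(ns|us|µs|ms|s|m|h)")


def parse_duration(text: str) -> timedelta:
    """Parse strings such as "300ms", "-1.5h" or "2h45m" into a timedelta."""
    sign, body = 1, text
    if body and body[0] in "+-":
        sign = -1 if body[0] == "-" else 1
        body = body[1:]
    if not body:
        raise ValueError(f"invalid duration: {text!r}")
    position, total_seconds = 0, 0.0
    while position < len(body):
        token = _TOKEN.match(body, position)
        if token is None:
            raise ValueError(f"invalid duration: {text!r}")
        total_seconds += float(token.group(1)) * _UNIT_SECONDS[token.group(2)]
        position = token.end()
    return timedelta(seconds=sign * total_seconds)


def since_now(timestamp: datetime, duration: str,
              now: Optional[datetime] = None) -> bool:
    """True if timestamp >= current time (UTC) minus the duration."""
    current = now if now is not None else datetime.now(timezone.utc)
    return timestamp >= current - parse_duration(duration)


# A fixed "now" keeps the example deterministic.
fixed_now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
published_recently = since_now(fixed_now - timedelta(hours=1), "24h", now=fixed_now)
published_long_ago = since_now(fixed_now - timedelta(hours=25), "24h", now=fixed_now)
```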

Handling sub variants

ARRAY_MATCH(expression)

Returns TRUE for records where repeated fields have an offset matching the expression.

For example, suppose a record holds 3 variants of a product with varying price, color, and size. These variants could be stored in three array fields, and the ARRAY_MATCH() filter function could be used to evaluate each offset in these fields collectively, as if the values at that offset were single values. To illustrate, consider the following indexed product:

Field   Array   Values
title   No      Air jordan shoes
color   Yes     red, red, white
size    Yes     13, 14, 13
price   Yes     122.00, 122.00, 130.00

If we search using the filter function ARRAY_MATCH(size = '14' AND color = 'white'), each offset in the size and color arrays is evaluated as if its values were single values. In this case no offset matches the filter (size 14 is paired with red, and white is paired with size 13), so the function returns FALSE for this record.
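The offset-by-offset evaluation can be sketched in Python by zipping the array fields together and testing each resulting variant. This is an illustration of the idea, not the engine's implementation. The function array_match and its signature are hypothetical, and sizes are stored as strings here purely for the example.

```python
from typing import Callable, Dict, List

# The product from the table above.
record = {
    "title": "Air jordan shoes",
    "color": ["red", "red", "white"],
    "size": ["13", "14", "13"],
    "price": [122.00, 122.00, 130.00],
}


def array_match(rec: Dict, fields: List[str],
                predicate: Callable[[Dict], bool]) -> bool:
    """True if the values at any single offset of the array fields satisfy predicate."""
    columns = [rec[name] for name in fields]
    for values in zip(*columns):            # one tuple per offset (variant)
        variant = dict(zip(fields, values))
        if predicate(variant):
            return True
    return False


# ARRAY_MATCH(size = '14' AND color = 'white'): no single offset has both values.
size_14_white = array_match(
    record, ["size", "color"],
    lambda v: v["size"] == "14" and v["color"] == "white",
)

# ARRAY_MATCH(size = '13' AND color = 'white'): the third offset matches.
size_13_white = array_match(
    record, ["size", "color"],
    lambda v: v["size"] == "13" and v["color"] == "white",
)
```

Checking the arrays independently would wrongly accept the first filter, since the record has a size 14 and a white variant, just not at the same offset.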

This filter function is a powerful way to handle variants. Some common examples are price variations (e.g. volume-based price breaks) and automotive parts (parts can match many makes and models).
