# Filter expressions

Filter expressions are a crucial concept in Sajari. They are used as part of the search query to narrow the results. They are also used to boost specific results that match a filter expression, or to apply certain boosts only if a filter condition matches.

## Structure of a filter expression

The most basic filter expression consists of three parts:

* Field/Parameter
* Operator
* Value

#### Example

The following expression matches if:

* the search query (parameter = `q`)
* contains (operator = `~`)
* the word (value = `star`)

![](https://3882858970-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FVeIbtsTcQaqaNeKzLbbU%2Fuploads%2FLodCAkotXBTVYokZrqBZ%2FFilter%20Structure.png?alt=media\&token=a2bea9f8-e129-4c5f-ab95-fd97ada8e794)

### Fields and parameters

**Fields** refer to the schema fields available on the record, for example `title`, `description`, `brand`, `category`, or `price`.

**Parameters** refer to the parameters passed via the search query. This could be the query itself (`q`), applied filters (like `brand` or `price`), or personalisation information like `location`, `gender`, or `membership`.

Whether a field, a parameter, or both are available in a filter expression depends on the context. **Conditions**, which specify whether a certain boost is applied, can only act on query parameters. **Boost** filters, on the other hand, can refer to both fields on the record and parameters passed into the query.

### Operators

Filter expressions can utilise a powerful set of operators. The operators available depend on the type of field or parameter. For example, Contains (`~`) operates on text fields, whereas Greater Than (`>`) can only operate on numeric fields.

{% hint style="warning" %}
When using advanced editing, all values must be enclosed in single quotation marks. For example, "field *boost* must be greater than 10" is written as `boost>'10'`.
{% endhint %}
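
The quoting rule and the field/operator/value structure can be illustrated with a small string-building helper. This is only a sketch: `build_filter` and `OPERATORS` are hypothetical names and not part of any Sajari client, and because this page does not describe how to escape a single quote inside a value, the helper rejects such values rather than guessing.

```python
# Operators taken from the operator table in this section.
OPERATORS = {"=", "!=", ">", ">=", "<", "<=", "^", "$", "~", "!~"}


def build_filter(field, operator, value):
    """Return a basic filter expression such as boost>'10'."""
    if operator not in OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")
    text = str(value)
    # Assumption: escaping rules for quotes inside values are not documented
    # here, so values containing a single quote are refused.
    if "'" in text:
        raise ValueError("values containing single quotes are not handled here")
    # Every value, numeric or string, is wrapped in single quotation marks.
    return f"{field}{operator}'{text}'"


print(build_filter("boost", ">", 10))      # boost>'10'
print(build_filter("dir1", "!~", "blog"))  # dir1!~'blog'
```

Whether an operator is valid for a given field type is still checked by the search engine, not by this helper.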

Below is a full list of available operators.

| Operator                        | Description                                            | Example        |
| ------------------------------- | ------------------------------------------------------ | -------------- |
| Equal To (`=`)                  | Field is equal to a value (*numeric* or *string*)      | `dir1='blog'`  |
| Not Equal To (`!=`)             | Field is not equal to a value (*numeric* or *string*)  | `dir1!='blog'` |
| Greater Than (`>`)              | Field is greater than a *numeric* value                | `boost>'10'`   |
| Greater Than Or Equal To (`>=`) | Field is greater than or equal to a *numeric* value    | `boost>='10'`  |
| Less Than (`<`)                 | Field is less than a given *numeric* value             | `boost<'50'`   |
| Less Than Or Equal To (`<=`)    | Field is less than or equal to a given *numeric* value | `boost<='50'`  |
| Begins With (`^`)               | Field begins with a *string*                           | `dir1^'bl'`    |
| Ends With (`$`)                 | Field ends with a *string*                             | `dir1$'og'`    |
| Contains (`~`)                  | Field contains a *string*                              | `dir1~'blog'`  |
| Does Not Contain (`!~`)         | Field does not contain a *string*                      | `dir1!~'blog'` |

#### **Filtering arrays**

The Contains (`~`) and Does Not Contain (`!~`) operators can be used to filter values in an array. The following example shows a filter that returns all records with the colour `red` stored in a `color` array field.

| Field | Array | Values           | Example           |
| ----- | ----- | ---------------- | ----------------- |
| color | Yes   | red, blue, white | `color ~ ['red']` |

### **Combining expressions**

It's also possible to build more complex filters by combining field filter expressions with `AND`/`OR` operators, and brackets.

| Operator | Description                        | Example                                  |
| -------- | ---------------------------------- | ---------------------------------------- |
| `AND`    | Both expressions must match        | `dir1='blog' AND domain='www.search.io'` |
| `OR`     | At least one expression must match | `dir1='blog' OR domain='blog.search.io'` |

For example, to match pages with language set to `en` on `www.search.io` or any page within the `en.search.io` domain:

```
(domain='www.search.io' AND lang='en') OR domain='en.search.io'
```
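
When filters are assembled in application code, the same expression can be composed from smaller pieces. The sketch below uses hypothetical helper names (`all_of`, `any_of`, `group`) that only join strings; it assumes the individual expressions are already valid and quoted.

```python
def all_of(*expressions):
    """Join expressions so that every one of them must match."""
    return " AND ".join(expressions)


def any_of(*expressions):
    """Join expressions so that at least one of them must match."""
    return " OR ".join(expressions)


def group(expression):
    """Wrap an expression in brackets so it is evaluated as one unit."""
    return f"({expression})"


combined = any_of(
    group(all_of("domain='www.search.io'", "lang='en'")),
    "domain='en.search.io'",
)
print(combined)
# (domain='www.search.io' AND lang='en') OR domain='en.search.io'
```

Adding brackets explicitly with `group` keeps the intended grouping visible instead of relying on operator precedence, which this page does not specify.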

## Filter functions

Some filters are difficult to express in boolean logic. For these cases, filter functions create the filter for you. They can also be used as part of larger boolean expressions.

### **Checking for existence or non-existence**

```yaml
IS_NULL(field)
```

Returns `TRUE` if `field` is NULL.

```yaml
IS_NOT_NULL(field)
```

Returns `TRUE` if `field` is NOT NULL.

### **Filtering with multiple values**

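The `IN` and `NOT IN` functions are shorthand for chaining several conditions on the same field: `IN` replaces multiple equality checks joined with `OR`, and `NOT IN` is its negation.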

```
field IN ('value1', 'value2', 'value3')
```

Returns `TRUE` if the `field` value on a record is equal to `value1`, `value2`, `value3`, or any other value defined in the list.

```
field NOT IN ('value1', 'value2', 'value3')
```

Returns `TRUE` if the `field` value on a record *is not* equal to `value1`, `value2`, `value3`, or any other value defined in the list.

{% hint style="info" %}
The `IN` and `NOT IN` functions work only on single-value fields.
{% endhint %}
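
To make the shorthand concrete, the sketch below spells out what `IN` and `NOT IN` stand for in terms of the basic operators. The helper names are hypothetical, and the equivalence shown is the logical one; how the engine treats edge cases such as NULL fields is not covered on this page.

```python
def expand_in(field, values):
    """Spell out field IN (...) as equality checks joined with OR."""
    return " OR ".join(f"{field}='{value}'" for value in values)


def expand_not_in(field, values):
    """Spell out field NOT IN (...) as inequality checks joined with AND."""
    return " AND ".join(f"{field}!='{value}'" for value in values)


brands = ["nike", "adidas", "puma"]
print(expand_in("brand", brands))
# brand='nike' OR brand='adidas' OR brand='puma'
print(expand_not_in("brand", brands))
# brand!='nike' AND brand!='adidas' AND brand!='puma'
```

The `NOT IN` form switches from `OR` to `AND` because a record is excluded as soon as it equals any one of the listed values.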

### **Filtering based on geo distance**

```yaml
GEO_INSIDE(latitude, longitude, lat_var, lng_var, radius)
```

Returns `TRUE` if the input geopoint (`lat_var`, `lng_var`) lies within `radius` kilometres, measured as haversine distance, of the geopoint stored in the record's `latitude` and `longitude` fields.

Note: there is also a `geo_boost` step that can be used to boost results based on their geo distance as opposed to filtering as per above.
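
For readers unfamiliar with haversine distance, the check behind `GEO_INSIDE` can be approximated as follows. This is a sketch of the standard formula, not the engine's implementation: the Earth radius constant and the inclusive comparison at the boundary are assumptions, and `haversine_km` and `geo_inside` are illustrative names.

```python
import math

# Assumption: mean Earth radius in kilometres; the engine's constant is not documented here.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lng2 - lng1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def geo_inside(record_lat, record_lng, lat_var, lng_var, radius_km):
    """True if the input point is within radius_km of the record's point."""
    # Assumption: the boundary itself counts as inside.
    return haversine_km(record_lat, record_lng, lat_var, lng_var) <= radius_km


# One degree of latitude is roughly 111 km.
print(geo_inside(0.0, 0.0, 1.0, 0.0, 112))  # True
print(geo_inside(0.0, 0.0, 1.0, 0.0, 111))  # False
```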

### **Time based filtering**

```yaml
SINCE_NOW(field, duration)
```

Returns `TRUE` if the timestamp `field` is equal to or greater than the current time (UTC) minus the defined `duration`.

A duration string is a signed sequence of decimal numbers, each with an optional fraction and a unit suffix, such as "300ms", "-1.5h" or "2h45m". Valid time units are "ns" (nanoseconds), "us" or "µs" (microseconds), "ms" (milliseconds), "s" (seconds), "m" (minutes), and "h" (hours).

For example, to match records whose timestamp `field` falls within the last 24 hours, use `SINCE_NOW(field, '24h')`. This is useful for filtering records by recency, such as by published date.
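
The duration format and the comparison described above can be modelled in a few lines. The sketch below is an illustration of the documented semantics only: `parse_duration` and `since_now` are hypothetical helpers, and the parser is a simplified reading of the format rather than the engine's own.

```python
import re
from datetime import datetime, timedelta, timezone

# Seconds per unit, for the units listed in the duration description.
_UNIT_SECONDS = {"ns": 1e-9, "us": 1e-6, "µs": 1e-6, "ms": 1e-3,
                 "s": 1.0, "m": 60.0, "h": 3600.0}
_PART = re.compile(r"(\d+(?:\.\d*)?|\.\d+)(ns|us|µs|ms|s|m|h)")


def parse_duration(text):
    """Parse a duration string such as '300ms', '-1.5h' or '2h45m'."""
    sign = -1.0 if text.startswith("-") else 1.0
    body = text[1:] if text[:1] in ("+", "-") else text
    if not body:
        raise ValueError(f"invalid duration: {text!r}")
    total, pos = 0.0, 0
    while pos < len(body):
        match = _PART.match(body, pos)
        if match is None:
            raise ValueError(f"invalid duration: {text!r}")
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    # timedelta has microsecond resolution, so nanosecond parts are rounded.
    return timedelta(seconds=sign * total)


def since_now(timestamp, duration, now=None):
    """True if timestamp >= current UTC time minus the duration."""
    now = now or datetime.now(timezone.utc)
    return timestamp >= now - parse_duration(duration)


fixed_now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
print(since_now(fixed_now - timedelta(hours=23), "24h", now=fixed_now))  # True
print(since_now(fixed_now - timedelta(hours=25), "24h", now=fixed_now))  # False
```

The optional `now` argument exists only to make the example deterministic.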

### **Handling sub variants**

```yaml
ARRAY_MATCH(expression)
```

Returns `TRUE` for records where at least one offset (array position) across the repeated fields matches the `expression`.

For example, suppose a record has three variants of a product with varying price, colour and size. These variants could be stored in three array-based fields, and the `ARRAY_MATCH()` filter function could be used to evaluate each offset in these fields collectively, as if the values at that offset were single values. To illustrate this, consider the following indexed product:

| Field | Array | Values                 |
| ----- | ----- | ---------------------- |
| title | No    | Air jordan shoes       |
| color | Yes   | red, red, white        |
| size  | Yes   | 13, 14, 13             |
| price | Yes   | 122.00, 122.00, 130.00 |

If we were to search using the filter function `ARRAY_MATCH(size = '14' AND color = 'white')`, the engine would look at each offset in the `size` and `color` arrays in turn and evaluate the pair of values at that offset as if they were single values. Here the offsets hold (13, red), (14, red) and (13, white). None of them matches the filter, so the function returns `FALSE` for this record.
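
The offset-by-offset evaluation can be mimicked in Python by zipping the arrays of the record above. This is a sketch of the semantics as described on this page, with a hypothetical `array_match` helper that takes a predicate instead of a filter string.

```python
record = {
    "title": "Air jordan shoes",
    "color": ["red", "red", "white"],
    "size": [13, 14, 13],
    "price": [122.00, 122.00, 130.00],
}


def array_match(record, fields, predicate):
    """Return True if the predicate holds at any single offset of the arrays."""
    columns = [record[name] for name in fields]
    # zip(*columns) yields one tuple per offset, e.g. (13, "red").
    for values in zip(*columns):
        variant = dict(zip(fields, values))
        if predicate(variant):
            return True
    return False


# size = '14' AND color = 'white': no single variant has both values.
print(array_match(record, ["size", "color"],
                  lambda v: v["size"] == 14 and v["color"] == "white"))  # False

# size = '13' AND color = 'white': the third variant matches.
print(array_match(record, ["size", "color"],
                  lambda v: v["size"] == 13 and v["color"] == "white"))  # True
```

Note the contrast with checking each array independently: the record does contain a size 14 and a white variant, but never at the same offset.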

This filter function is a powerful way to handle variants. Some common examples are price variations (e.g. volume-based price breaks) and automotive parts (a part can match many makes and models).
