Bucket Aggregates

Bucket aggregates scan a result set to group and count the results that meet specific conditions. The buckets and their counts are then returned with the query response. Each bucket is written as name:condition, and buckets are separated by commas. For example, "buckets": "high:price >= 200,mid:price >=50 AND price < 200,low:price < 50" defines the price conditions for the high, mid and low price buckets that the counts are aggregated into.
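
As a rough illustration of the name:condition format described above, the following Python sketch splits a bucket definition string into its named conditions. The helper parse_buckets is hypothetical and not part of any API; it assumes that no condition contains a comma, which would not hold for a condition listing several array values.

```python
def parse_buckets(spec):
    """Split a bucket definition string into {bucket name: condition}.

    Assumption: conditions contain no commas, so a plain split is enough.
    """
    buckets = {}
    for part in spec.split(","):
        # The bucket name is everything before the first colon.
        name, _, condition = part.partition(":")
        buckets[name.strip()] = condition.strip()
    return buckets


spec = "high:price >= 200,mid:price >=50 AND price < 200,low:price < 50"
print(parse_buckets(spec))
# {'high': 'price >= 200', 'mid': 'price >=50 AND price < 200', 'low': 'price < 50'}
```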

Before you start:

This example requires the following bucket-aggregate steps to be defined in the query pipeline.

- id: bucket-aggregate
  params:
    buckets:
      bind: priceRangeBuckets
    name:
      constant: PriceRange
- id: bucket-aggregate
  params:
    buckets:
      bind: shippingOptionsBuckets
    name:
      constant: shippingOptions

Search query: "q":"t-shirt"

The following bucket aggregates will be used in this example:

  1. Bucket PriceRange (assuming price is a Float field)

  2. Bucket shippingOptions (assuming shippingOptions is a String field)

Description:

Bucket aggregates are evaluated against the results of the global search query, so in this example every count covers only records that satisfy the global text query t-shirt. Note: the global text query can be empty, in which case all records are included.
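
The counting behaviour described above can be modelled with a few lines of Python. This is only a sketch of the semantics, not how the search engine works internally: the sample records, the substring match standing in for the global text query, and the function bucket_counts are all assumptions made for illustration.

```python
# Hypothetical sample records; real records live in the search index.
records = [
    {"title": "plain t-shirt", "price": 20.0},
    {"title": "designer t-shirt", "price": 250.0},
    {"title": "band t-shirt", "price": 60.0},
    {"title": "wool jumper", "price": 80.0},
]

# The price buckets from this example, written as Python predicates.
price_buckets = {
    "high": lambda r: r["price"] >= 200,
    "mid": lambda r: 50 <= r["price"] < 200,
    "low": lambda r: r["price"] < 50,
}


def bucket_counts(records, query, buckets):
    # Stand-in for the global text query: a substring match on the title.
    # An empty query matches every record.
    matched = [r for r in records if query in r["title"]]
    return {
        name: sum(1 for r in matched if condition(r))
        for name, condition in buckets.items()
    }


print(bucket_counts(records, "t-shirt", price_buckets))
# {'high': 1, 'mid': 1, 'low': 1}
print(bucket_counts(records, "", price_buckets))
# {'high': 1, 'mid': 2, 'low': 1}
```

With the empty query, the jumper is also counted in the mid bucket, because no record is excluded before the buckets are evaluated.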

To start, we'll construct a query that returns a count of the high, mid and low price options for the query t-shirt.

{
    "variables": {
        "q": "t-shirt",
        "priceRangeBuckets": "high:price >= 200,mid:price >=50 AND price < 200,low:price < 50"
    }
}

Example response

 "aggregates": {
        "PriceRange": {
            "buckets": {
                "buckets": {
                    "high": {
                        "name": "high",
                        "count": 5596
                    },
                    "low": {
                        "name": "low",
                        "count": 6233
                    },
                    "mid": {
                        "name": "mid",
                        "count": 9249
                    }
                }
            }
        }
 }
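
The counts sit two "buckets" levels deep in the response, so client code usually flattens them before use. Here is a minimal Python sketch, assuming the response has already been decoded into a dictionary with the shape shown above; the helper flatten_counts is hypothetical.

```python
# The example response above, decoded into a Python dictionary.
response = {
    "aggregates": {
        "PriceRange": {
            "buckets": {
                "buckets": {
                    "high": {"name": "high", "count": 5596},
                    "low": {"name": "low", "count": 6233},
                    "mid": {"name": "mid", "count": 9249},
                }
            }
        }
    }
}


def flatten_counts(aggregates):
    """Map each aggregate name to a {bucket name: count} dictionary."""
    return {
        aggregate_name: {
            bucket["name"]: bucket["count"]
            for bucket in body["buckets"]["buckets"].values()
        }
        for aggregate_name, body in aggregates.items()
    }


print(flatten_counts(response["aggregates"]))
# {'PriceRange': {'high': 5596, 'low': 6233, 'mid': 9249}}
```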

Adding a shipping options bucket aggregate

Now let's add the shipping bucket aggregate to the query, so that it returns a count of the high, mid and low price options as well as a count of the free shipping and fast dispatch options for the query t-shirt.

{
    "variables": {
        "q": "t-shirt",
        "priceRangeBuckets": "high:price >= 200,mid:price >=50 AND price < 200,low:price < 50",
        "shippingOptionsBuckets": "freeshipping:shippingOptions ~ ['FreeShipping'],fastdispatch:shippingOptions ~ ['FastDispatch']"
    }
}

Example response

"aggregates": {
        "PriceRange": {
            "buckets": {
                "buckets": {
                    "high": {
                        "name": "high",
                        "count": 5596
                    },
                    "low": {
                        "name": "low",
                        "count": 6233
                    },
                    "mid": {
                        "name": "mid",
                        "count": 9249
                    }
                }
            }
        },
        "shippingOptions": {
            "buckets": {
                "buckets": {
                    "freeshipping": {
                        "name": "freeshipping",
                        "count": 19460
                    },
                    "fastdispatch": {
                        "name": "fastdispatch",
                        "count": 1000
                    }
                }
            }
        }
 }

Bucket Aggregate Filters:

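A bucket aggregate filter is a pair consisting of a bucket aggregate and a filter. Bucket aggregate filters change the result set to reflect the filters that are specified with them. As the examples in this section show, the counts for each bucket aggregate reflect the filters of the other bucket aggregate filters, but not its own filter.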

Before you start:

This example requires the following bucket-aggregate-filter steps to be defined in the query pipeline.

- id: bucket-aggregate-filter
  params:
    buckets:
      bind: priceRangeBuckets
    filter:
      bind: priceRangeFilter
    name:
      constant: PriceRange
- id: bucket-aggregate-filter
  params:
    buckets:
      bind: shippingOptionsBuckets
    filter:
      bind: shippingOptionsFilter
    name:
      constant: shippingOptions

Continuing our Ecommerce search scenario

Filtering by price

If a user selects the low price filter to only show t-shirts under $50, we want the result set to include products with q = "t-shirt" AND price < 50.

We also want the response to show counts for:

  • shipping options where: q = "t-shirt" AND price < 50

  • price where: q = "t-shirt"

In the query below, we've added the filter "price < 50" to priceRangeFilter.

{
    "variables": {
        "q": "t-shirt",
        "priceRangeBuckets": "high:price >= 200,mid:price >=50 AND price < 200,low:price < 50",
        "priceRangeFilter": "price < 50",
        "shippingOptionsBuckets": "freeshipping:shippingOptions ~ ['FreeShipping'],fastdispatch:shippingOptions ~ ['FastDispatch']"
    }
}

Example response

"aggregate_filters": {
        "PriceRange": {
            "buckets": {
                "buckets": {
                    "high": {
                        "name": "high",
                        "count": 5596
                    },
                    "low": {
                        "name": "low",
                        "count": 6233
                    },
                    "mid": {
                        "name": "mid",
                        "count": 9249
                    }
                }
            }
        },
        "shippingOptions": {
            "buckets": {
                "buckets": {
                    "freeshipping": {
                        "name": "freeshipping",
                        "count": 8545
                    },
                    "fastdispatch": {
                        "name": "fastdispatch",
                        "count": 635
                    }
                }
            }
        }
    },

Filtering by shipping

If a user selects the low price filter and free shipping to only show t-shirts under $50 with free shipping, we want the result set to include products with q = "t-shirt" AND price < 50 AND shippingOptions ~ ['FreeShipping'].

We also want the response to show counts for:

  • shipping options where: q = "t-shirt" AND price < 50

  • price where: q = "t-shirt" AND shippingOptions ~ ['FreeShipping']

In the query below, we've added the filter "shippingOptions ~ ['FreeShipping']" to shippingOptionsFilter.

{
    "variables": {
        "q": "t-shirt",
        "priceRangeBuckets": "high:price >= 200,mid:price >=50 AND price < 200,low:price < 50",
        "priceRangeFilter": "price < 50",
        "shippingOptionsBuckets": "freeshipping:shippingOptions ~ ['FreeShipping'],fastdispatch:shippingOptions ~ ['FastDispatch']",
        "shippingOptionsFilter": "shippingOptions ~ ['FreeShipping']"
    }
}

Example response

"aggregate_filters": {
        "PriceRange": {
            "buckets": {
                "buckets": {
                    "high": {
                        "name": "high",
                        "count": 5565
                    },
                    "low": {
                        "name": "low",
                        "count": 3456
                    },
                    "mid": {
                        "name": "mid",
                        "count": 5650
                    }
                }
            }
        },
        "shippingOptions": {
            "buckets": {
                "buckets": {
                    "freeshipping": {
                        "name": "freeshipping",
                        "count": 8545
                    },
                    "fastdispatch": {
                        "name": "fastdispatch",
                        "count": 635
                    }
                }
            }
        }
    },
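
The filtering behaviour in the two examples above can be modelled in Python as follows. This is a sketch of the semantics only, under the assumption that the sample records have already matched the text query. The records, the predicate functions and apply_aggregate_filters are hypothetical and do not reflect the engine's implementation.

```python
# Hypothetical records that have already matched the text query.
records = [
    {"price": 20.0, "shippingOptions": ["FreeShipping"]},
    {"price": 30.0, "shippingOptions": ["FastDispatch"]},
    {"price": 120.0, "shippingOptions": ["FreeShipping", "FastDispatch"]},
    {"price": 300.0, "shippingOptions": []},
]

# Each bucket aggregate filter pairs bucket conditions with the selected filter.
aggregate_filters = {
    "PriceRange": {
        "buckets": {
            "high": lambda r: r["price"] >= 200,
            "mid": lambda r: 50 <= r["price"] < 200,
            "low": lambda r: r["price"] < 50,
        },
        "filter": lambda r: r["price"] < 50,
    },
    "shippingOptions": {
        "buckets": {
            "freeshipping": lambda r: "FreeShipping" in r["shippingOptions"],
            "fastdispatch": lambda r: "FastDispatch" in r["shippingOptions"],
        },
        "filter": lambda r: "FreeShipping" in r["shippingOptions"],
    },
}


def apply_aggregate_filters(records, aggregate_filters):
    # The result set contains only records that pass every filter.
    results = [
        r for r in records
        if all(agg["filter"](r) for agg in aggregate_filters.values())
    ]
    counts = {}
    for name, agg in aggregate_filters.items():
        # Counts for one aggregate apply every filter except its own.
        other_filters = [
            other["filter"]
            for other_name, other in aggregate_filters.items()
            if other_name != name
        ]
        pool = [r for r in records if all(f(r) for f in other_filters)]
        counts[name] = {
            bucket: sum(1 for r in pool if condition(r))
            for bucket, condition in agg["buckets"].items()
        }
    return results, counts


results, counts = apply_aggregate_filters(records, aggregate_filters)
print(len(results))  # 1
print(counts["PriceRange"])  # {'high': 0, 'mid': 1, 'low': 1}
print(counts["shippingOptions"])  # {'freeshipping': 1, 'fastdispatch': 1}
```

Skipping an aggregate's own filter when counting its buckets keeps the counts for the unselected options in that group visible, so a user can see how many results switching to another price range or shipping option would give.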

Other techniques:

There may be scenarios where you want a record to be counted by more than one bucket. For example, in a customer rating bucket aggregation you may want a rating of 3.1 to be counted by the over 1, over 2 and over 3 rating buckets. The solution is to define the buckets in reverse order with overlapping conditions, as in the example below:

{
    "variables": {
        "q": "t-shirt",
        "customerRatingBucket": "5+:(rating=5),4+:(rating>=4),3+:(rating>=3),2+:(rating>=2),1+:(rating>=1)"
    }
}
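
To see which buckets a given rating falls into, the conditions from the query above can be mirrored as Python predicates. This sketch assumes each bucket condition is evaluated independently against every record; the names rating_buckets and matching_buckets are hypothetical.

```python
# The rating buckets from the query above, in the same order.
rating_buckets = {
    "5+": lambda rating: rating == 5,
    "4+": lambda rating: rating >= 4,
    "3+": lambda rating: rating >= 3,
    "2+": lambda rating: rating >= 2,
    "1+": lambda rating: rating >= 1,
}


def matching_buckets(rating):
    # A rating is counted by every bucket whose condition it satisfies.
    return [name for name, condition in rating_buckets.items() if condition(rating)]


print(matching_buckets(3.1))  # ['3+', '2+', '1+']
print(matching_buckets(5))    # ['5+', '4+', '3+', '2+', '1+']
```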
