Search code examples
redisredisearch

Having trouble getting aggregations working.


I'm in the midst of exploring RediSearch and I thought I'd give the aggregations feature a shot and have hit a roadblock.

I can't seem to get a good result.

For testing purposes I created a basic index/schema like so:

FT.CREATE test SCHEMA field TEXT

FT.ADD test 1A 1 FIELDS field hello
FT.ADD test 2A 1 FIELDS field hello
FT.ADD test 3A 1 FIELDS field hello
FT.ADD test 4A 1 FIELDS field world

Next, I issued the following query:

FT.AGGREGATE test "*" GROUPBY 1 @field REDUCE COUNT 0 AS agg

My expectation was that I get a result indicating hello occurs three times and world occurs once... but instead I get the following result:

1) (integer) 1
2) 1) "field"
   2) (nil)
   3) "agg"
   4) "4"

I thought it was pretty straight forward... but I'm obviously doing something wrong.

Also, the following is the output from the MODULE LIST command:

1) 1) "name"
   2) "ft"
   3) "ver"
   4) (integer) 10300
2) 1) "name"
   2) "ReJSON"
   3) "ver"
   4) (integer) 10001

Any help would be super.

Thanks!


Solution

  • It turns out that I should have read the documentation better.

    From the section in the aggregations documentation where they describe the FT.AGGREGATE command parameters they mention LOAD {nargs} {property}, where they say:

    Load document fields from the document HASH objects. This should be avoided as a general rule of thumb. Fields needed for aggregations should be stored as SORTABLE, where they are available to the aggregation pipeline with very low latency. LOAD hurts the performance of aggregate queries considerably since every processed record needs to execute the equivalent of HMGET against a redis key, which when executed over millions of keys, amounts to very high processing times.

    From the query example in the original question I had:

    FT.AGGREGATE test "*" GROUPBY 1 @field REDUCE COUNT 0 AS agg

    Since the schema definition didn't have field defined as SORTABLE I would have to LOAD "field" in order to perform an aggregation on it.

    FT.AGGREGATE test "*" LOAD 1 @field GROUPBY 1 @field REDUCE COUNT 0 AS agg

    However, since according to the documentation LOAD hurts performance I should have instead defined the field I want to aggregate as SORTABLE.

    FT.CREATE test SCHEMA field TEXT SORTABLE

    With the schema properly defined my original query works.