Search code examples
google-cloud-spanner

Can I index EXTRACT(WEEK from startDateTime)? Or, will the query planner use an index directly on 'startDateTime'?


I have a large number of records indexed on some startDateTime field, and want to select aggregates (SUM and COUNT) on all records grouped by WEEKOFYEAR(startDateTime) (i.e., EXTRACT(WEEK FROM startDateTime)). Can I put a secondary index on EXTRACT(WEEK FROM startDateTime)? Or, even better, will the query use an index on startDateTime appropriately to optimize a request grouped by WEEK?

See this similar question about MySQL indices. How would this be handled in the Cloud Spanner world?


Solution

  • Secondary index on generated columns (i.e., EXTRACT(WEEK FROM startDateTime)) are not supported yet. If you have a covering index that includes all the columns required for the query (i.e., startDateTime and other required columns for grouping and aggregation), the planner will use such covering index over the base table but the aggregation is likely to be based on hash aggregation. Unless you aggregate over very long period of time, it should not be a big problem (I admit that it is not ideal though).

    If you want to restrict the aggregated time range, you need to spell it out in terms of startDateTime (i.e., you need to convert the min/max datetime to the same type as startDateTime).

    Hope this helps.