Search code examples
hbaseapache-phoenix

How does Phoenix translate IN clause?


Are there performance differences in Apache Phoenix between doing a query like: SELECT * FROM Table WHERE Column IN ('A') and SELECT * FROM Table WHERE Column = 'A'?

Same question between: SELECT * FROM Table WHERE Column IN ('A','B') and SELECT * FROM Table WHERE Column = 'A' OR Column = 'B'


Solution

  • In case of:
    SELECT * FROM Table WHERE Column IN ('A','B')
    It uses java Set under the hoot. So the time complexity is O(1). See InListExpression for more details.

    In case of:
    SELECT * FROM Table WHERE Column = 'A' OR Column = 'B'
    It uses linear comparison under the hood. So the time complexity if O(n). See OrExpression for more details.

    So yes, it seems like there is a significant performance difference between these two statements if we speak about decent amount of data.