Search code examples
ignite

Apache Ignite join bug?


I have an issue with a join SQL query in Apache Ignite. Is this a bug or I missed something?

The following query unexpectedly doesn't return any result:

 SELECT * FROM PERSON AS p, ACCOUNT AS a
 WHERE 
 p.ACCOUNT_ID = a.id
 AND 
 p.ID = '4nFwLM9FRxKZJb4V9vtdlw'

When I include a condition for the account, the row is returned:

 SELECT * FROM PERSON AS p, ACCOUNT AS a
 WHERE 
 p.ACCOUNT_ID = a.id
 AND 
 p.ID = '4nFwLM9FRxKZJb4V9vtdlw'
 AND a.id = 'wZvcAnbBSpOps9oteH-Oxw'

Schema:

CREATE TABLE umm_person (
    id varchar(36) NOT NULL primary key,
    account_id varchar(36) NOT NULL
);
CREATE INDEX idx_person_account_id ON umm_person (account_id);

CREATE TABLE main_account (
    id varchar(36) NOT NULL primary key,
);

I tested with Apache Ignite 2.14.0 and 2.10.0, using a single node:

public class IgniteNode {

    public static void main(String[] args) throws Exception {
        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setClientMode(false);
        cfg.setPeerClassLoadingEnabled(true);
        cfg.setCacheConfiguration();
        Ignition.start(cfg);
    }
} 

Solution

  • This behavior is described in some detail in this issue's comments.

    In short, your data isn't colocated.

    For JOIN to work, one of the three must be true:

    • Both tables are PARTITIONED, and the JOIN condition is the affinity key of both of them.
    • At least one of the tables is REPLICATED.
    • distributedJoins=true is set.

    You might think that on a single node cluster, all data is colocated anyway but it is not actually true. And depending on the query, the SQL engine may change how to search the data - that won't affect the result when everything is configured correctly but leads to different results in your setup when you add another condition.