Queries return duplicates in JDO/Datanucleus/H2


I'm adding two objects to a database: a Person and a Student (a subclass of Person). When I query on Person, it returns each instance twice. When I query on Student, it returns both instances, even though Person is not a subclass of Student. The code is based on the jdo-test-template from DataNucleus. I'm using DataNucleus 5.0.0m1.

tx.begin();

Person p = new Person(0, "Pete");
Student s = new Student(1, "Sarah");
pm.makePersistent(p);
pm.makePersistent(s);

Query<Person> qP = pm.newQuery(Person.class);
Collection<Person> cP = (Collection<Person>) qP.execute();
for (Person p2: cP) {
    System.out.println("Person: " + p2.getName() + " " + p2.getId() + " " + System.identityHashCode(p2));
}

Query<Student> qS = pm.newQuery(Student.class);
Collection<Student> c = (Collection<Student>) qS.execute();
for (Student s2: c) {
    System.out.println("Student: " + s2.getName() + " " + s2.getId() + " " + System.identityHashCode(s2));
}
tx.commit();

The Person class is unchanged from the example template:

@PersistenceCapable(detachable="true")
public class Person {
    @PrimaryKey
    Long id;
    String name;

    public Person(long id, String name) {
        this.id = id;
        this.name = name;
    }

    public String getName() {
        return name;
    }

    public Long getId() {
        return id;
    }
}

The Student class:

@PersistenceCapable(detachable="true")
public class Student extends Person {

    public Student(long id, String name) {
        super(id, name);
    }
}

I also added Student to the persistence.xml file:

<?xml version="1.0" encoding="UTF-8" ?>
<persistence xmlns="http://java.sun.com/xml/ns/persistence"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://java.sun.com/xml/ns/persistence http://java.sun.com/xml/ns/persistence/persistence_1_0.xsd"
    version="1.0">

    <persistence-unit name="MyTest">
        <!-- Add all of your model classes here -->
        <class>mydomain.model.Person</class>
        <class>mydomain.model.Student</class>
        <exclude-unlisted-classes />
        <properties>
            <!-- Update these datastore details if different -->
            <property name="javax.jdo.PersistenceManagerFactoryClass" value="org.datanucleus.api.jdo.JDOPersistenceManagerFactory"/>
            <property name="javax.jdo.option.ConnectionURL" value="jdbc:h2:mem:nucleus"/>
            <property name="javax.jdo.option.ConnectionDriverName" value="org.h2.Driver"/>
            <property name="javax.jdo.option.ConnectionUserName" value="sa"/>
            <property name="javax.jdo.option.ConnectionPassword" value=""/>

            <property name="datanucleus.schema.autoCreateAll" value="true"/>
        </properties>
    </persistence-unit>

</persistence>

When running the program, I get the following output:

Person: Sarah 1 454305524
Person: Sarah 1 1536471117
Person: Pete 0 1961945640
Person: Pete 0 1898155970
Student: Pete 0 1898155970
Student: Sarah 1 1536471117

Judging by the System.identityHashCode(...) values, the first query returns four distinct Java instances. Am I doing something wrong, or is this output expected?

EDIT: I just confirmed that DataNucleus 4.1.8 behaves the same as 5.0.0m1.

EDIT

From the logfile:

17:36:21,660 (main) DEBUG [DataNucleus.Query] - JDOQL Query : Compiling "SELECT FROM mydomain.model.Person"
17:36:21,668 (main) DEBUG [DataNucleus.Query] - JDOQL Query : Compile Time = 8 ms
17:36:21,668 (main) DEBUG [DataNucleus.Query] - QueryCompilation:
  [symbols: this type=mydomain.model.Person]
17:36:21,669 (main) DEBUG [DataNucleus.Query] - JDOQL Query : Compiling "SELECT FROM mydomain.model.Person" for datastore
17:36:21,697 (main) DEBUG [DataNucleus.Query] - JDOQL Query : Compile Time for datastore = 28 ms
17:36:21,698 (main) DEBUG [DataNucleus.Query] - SELECT FROM mydomain.model.Person Query compiled to datastore query "SELECT 'mydomain.model.Person ' AS NUCLEUS_TYPE,A0.ID,A0."NAME" FROM PERSON A0 UNION SELECT 'mydomain.model.Student' AS NUCLEUS_TYPE,A0.ID,A0."NAME" FROM PERSON A0"
17:36:21,698 (main) DEBUG [DataNucleus.Persistence] - ExecutionContext.internalFlush() process started using ordered flush - 2 enlisted objects
17:36:21,698 (main) DEBUG [DataNucleus.Persistence] - ExecutionContext.internalFlush() process finished
17:36:21,698 (main) DEBUG [DataNucleus.Connection] - ManagedConnection found in the pool : "org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl@3c41ed1d [conn=org.datanucleus.store.rdbms.datasource.dbcp.PoolingDataSource$PoolGuardConnectionWrapper@22ff4249, commitOnRelease=false, closeOnRelease=false, closeOnTxnEnd=true]" for key="org.datanucleus.ExecutionContextImpl@40ef3420" in factory="ConnectionFactory:tx[org.datanucleus.store.rdbms.ConnectionFactoryImpl@5b12b668]"
17:36:21,698 (main) DEBUG [DataNucleus.Query] - JDOQL Query : Executing "SELECT FROM mydomain.model.Person" ...
17:36:21,698 (main) DEBUG [DataNucleus.Datastore.Native] - BATCH [INSERT INTO PERSON ("NAME",ID) VALUES (<'Pete'>,<0>); INSERT INTO PERSON ("NAME",ID) VALUES (<'Sarah'>,<1>)]
17:36:21,699 (main) DEBUG [DataNucleus.Datastore] - Execution Time = 1 ms (number of rows = [1, 1]) on PreparedStatement "org.datanucleus.store.rdbms.ParamLoggingPreparedStatement@1573f9fc"
17:36:21,700 (main) DEBUG [DataNucleus.Datastore] - Closing PreparedStatement "org.datanucleus.store.rdbms.datasource.dbcp.DelegatingPreparedStatement@5939a379"
17:36:21,701 (main) DEBUG [DataNucleus.Datastore.Native] - SELECT 'mydomain.model.Person ' AS NUCLEUS_TYPE,A0.ID,A0."NAME" FROM PERSON A0 UNION SELECT 'mydomain.model.Student' AS NUCLEUS_TYPE,A0.ID,A0."NAME" FROM PERSON A0
17:36:21,702 (main) DEBUG [DataNucleus.Datastore.Retrieve] - Execution Time = 1 ms
17:36:21,705 (main) DEBUG [DataNucleus.Query] - JDOQL Query : Execution Time = 7 ms
17:36:21,707 (main) DEBUG [DataNucleus.Cache] - Object with id "mydomain.model.Person:1" not found in Level 1 cache [cache size = 2]
17:36:21,707 (main) DEBUG [DataNucleus.Cache] - Object with id "mydomain.model.Person:1" not found in Level 2 cache
17:36:21,708 (main) DEBUG [DataNucleus.Cache] - Object "mydomain.model.Person@1f97cf0d" (id="mydomain.model.Person:1") added to Level 1 cache (loadedFlags="[YN]")
17:36:21,709 (main) DEBUG [DataNucleus.Cache] - Object "mydomain.model.Person@1f97cf0d" (id="1") added to Level 2 cache (fields="[0, 1]", version="")
17:36:21,711 (main) DEBUG [DataNucleus.Lifecycle] - Object "mydomain.model.Person@1f97cf0d" (id="mydomain.model.Person:1") has a lifecycle change : "HOLLOW"->"P_CLEAN"
17:36:21,711 (main) DEBUG [DataNucleus.Transaction] - Object "mydomain.model.Person@1f97cf0d" (id="1") enlisted in transactional cache
17:36:21,712 (main) DEBUG [DataNucleus.Cache] - Object "mydomain.model.Student@477b4cdf" (id="mydomain.model.Student:1") taken from Level 1 cache (loadedFlags="[YY]") [cache size = 3]
17:36:21,712 (main) DEBUG [DataNucleus.Cache] - Object "mydomain.model.Person@8dbdac1" (id="mydomain.model.Person:0") taken from Level 1 cache (loadedFlags="[YY]") [cache size = 3]
17:36:21,712 (main) DEBUG [DataNucleus.Cache] - Object with id "mydomain.model.Student:0" not found in Level 1 cache [cache size = 3]
17:36:21,713 (main) DEBUG [DataNucleus.Cache] - Object with id "mydomain.model.Student:0" not found in Level 2 cache
17:36:21,713 (main) DEBUG [DataNucleus.Cache] - Object "mydomain.model.Student@49dc7102" (id="mydomain.model.Student:0") added to Level 1 cache (loadedFlags="[YN]")
17:36:21,713 (main) DEBUG [DataNucleus.Cache] - Object "mydomain.model.Student@49dc7102" (id="0") added to Level 2 cache (fields="[0, 1]", version="")
17:36:21,713 (main) DEBUG [DataNucleus.Lifecycle] - Object "mydomain.model.Student@49dc7102" (id="mydomain.model.Student:0") has a lifecycle change : "HOLLOW"->"P_CLEAN"
17:36:21,713 (main) DEBUG [DataNucleus.Transaction] - Object "mydomain.model.Student@49dc7102" (id="0") enlisted in transactional cache
17:36:21,713 (main) DEBUG [DataNucleus.Query] - JDOQL Query : Compiling "SELECT FROM mydomain.model.Student"
17:36:21,713 (main) DEBUG [DataNucleus.Query] - JDOQL Query : Compile Time = 0 ms
17:36:21,713 (main) DEBUG [DataNucleus.Query] - QueryCompilation:
  [symbols: this type=mydomain.model.Student]
17:36:21,713 (main) DEBUG [DataNucleus.Query] - JDOQL Query : Compiling "SELECT FROM mydomain.model.Student" for datastore
17:36:21,714 (main) DEBUG [DataNucleus.Query] - JDOQL Query : Compile Time for datastore = 1 ms
17:36:21,714 (main) DEBUG [DataNucleus.Query] - SELECT FROM mydomain.model.Student Query compiled to datastore query "SELECT 'mydomain.model.Student' AS NUCLEUS_TYPE,A0.ID,A0."NAME" FROM PERSON A0"
17:36:21,714 (main) DEBUG [DataNucleus.Connection] - ManagedConnection found in the pool : "org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl@3c41ed1d [conn=org.datanucleus.store.rdbms.datasource.dbcp.PoolingDataSource$PoolGuardConnectionWrapper@22ff4249, commitOnRelease=false, closeOnRelease=false, closeOnTxnEnd=true]" for key="org.datanucleus.ExecutionContextImpl@40ef3420" in factory="ConnectionFactory:tx[org.datanucleus.store.rdbms.ConnectionFactoryImpl@5b12b668]"
17:36:21,714 (main) DEBUG [DataNucleus.Query] - JDOQL Query : Executing "SELECT FROM mydomain.model.Student" ...
17:36:21,714 (main) DEBUG [DataNucleus.Datastore] - Closing PreparedStatement "org.datanucleus.store.rdbms.datasource.dbcp.DelegatingPreparedStatement@6b8ca3c8"
17:36:21,715 (main) DEBUG [DataNucleus.Datastore.Native] - SELECT 'mydomain.model.Student' AS NUCLEUS_TYPE,A0.ID,A0."NAME" FROM PERSON A0
17:36:21,715 (main) DEBUG [DataNucleus.Datastore.Retrieve] - Execution Time = 1 ms
17:36:21,715 (main) DEBUG [DataNucleus.Query] - JDOQL Query : Execution Time = 1 ms
17:36:21,715 (main) DEBUG [DataNucleus.Cache] - Object "mydomain.model.Student@49dc7102" (id="mydomain.model.Student:0") taken from Level 1 cache (loadedFlags="[YY]") [cache size = 4]
17:36:21,715 (main) DEBUG [DataNucleus.Cache] - Object "mydomain.model.Student@477b4cdf" (id="mydomain.model.Student:1") taken from Level 1 cache (loadedFlags="[YY]") [cache size = 4]

I find it interesting that the Student query is executed against the PERSON table without any further restriction. I'm not sure how inheritance is mapped in the database, but if all classes are mapped into a single table, I would expect at least one column holding a type identifier:

    17:36:21,715 (main) DEBUG [DataNucleus.Datastore.Native] - SELECT 'mydomain.model.Student' AS NUCLEUS_TYPE,A0.ID,A0."NAME" FROM PERSON A0
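To make the effect of the logged SQL concrete, here is a small plain-Java simulation (no DataNucleus involved). It is only a sketch: the class `UnionDuplicatesDemo`, its methods, and the third "actual class" entry per row are hypothetical stand-ins, the last one playing the role of a type-identifier column that the logged PERSON table does not have. Each branch of the UNION selects every row of PERSON and only differs in the NUCLEUS_TYPE literal, so UNION cannot collapse the rows.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class UnionDuplicatesDemo {
    // Rows of the shared PERSON table as {ID, NAME, actual class}.
    // The third entry is what a type-identifier column would hold (assumption:
    // the logged schema only has ID and NAME).
    static final String[][] PERSON_TABLE = {
        {"0", "Pete", "mydomain.model.Person"},
        {"1", "Sarah", "mydomain.model.Student"},
    };

    // Mimics: SELECT '<type>' AS NUCLEUS_TYPE, ID, NAME FROM PERSON
    // optionally restricted to rows whose type identifier equals <type>.
    static Set<String> select(String type, boolean filterByType) {
        Set<String> rows = new LinkedHashSet<>();
        for (String[] r : PERSON_TABLE) {
            if (!filterByType || r[2].equals(type)) {
                rows.add(type + "," + r[0] + "," + r[1]);
            }
        }
        return rows;
    }

    // Mimics the UNION of the Person branch and the Student branch.
    // A set models UNION's removal of identical rows.
    static Set<String> queryPerson(boolean filterByType) {
        Set<String> result = new LinkedHashSet<>(select("mydomain.model.Person", filterByType));
        result.addAll(select("mydomain.model.Student", filterByType));
        return result;
    }

    public static void main(String[] args) {
        // Without a type filter every row appears once per branch.
        System.out.println("no filter:   " + queryPerson(false));
        // With a type filter each row appears in exactly one branch.
        System.out.println("with filter: " + queryPerson(true));
    }
}
```

Without the filter the simulated Person query yields four rows and the Student branch alone yields both rows, which matches the counts in the program output above.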

Solution

  • You haven't specified the inheritance strategy, or how the persistence mechanism should distinguish between classes that share a table. Use @Inheritance and @Discriminator, as described in the DataNucleus inheritance documentation. DataNucleus defaults the inheritance strategy to NEW_TABLE for the base class and SUPERCLASS_TABLE for the subclass, but it does NOT default a discriminator, because you may not want one and may never need to separate what is stored in that table.
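
A minimal sketch of what that could look like for the two classes in the question, using the standard JDO annotations. The choice of DiscriminatorStrategy.CLASS_NAME is an assumption made for illustration (a value-based discriminator is another option), and this has not been run against the DataNucleus versions mentioned above. It needs the JDO API jar on the classpath, and each class goes in its own file.

```java
// Person.java
package mydomain.model;

import javax.jdo.annotations.Discriminator;
import javax.jdo.annotations.DiscriminatorStrategy;
import javax.jdo.annotations.Inheritance;
import javax.jdo.annotations.InheritanceStrategy;
import javax.jdo.annotations.PersistenceCapable;
import javax.jdo.annotations.PrimaryKey;

@PersistenceCapable(detachable="true")
// Base class gets its own table (this is also the default).
@Inheritance(strategy=InheritanceStrategy.NEW_TABLE)
// Store the class name of each row in a discriminator column (assumed strategy).
@Discriminator(strategy=DiscriminatorStrategy.CLASS_NAME)
public class Person {
    @PrimaryKey
    Long id;
    String name;

    public Person(long id, String name) {
        this.id = id;
        this.name = name;
    }

    public String getName() {
        return name;
    }

    public Long getId() {
        return id;
    }
}

// Student.java
package mydomain.model;

import javax.jdo.annotations.Inheritance;
import javax.jdo.annotations.InheritanceStrategy;
import javax.jdo.annotations.PersistenceCapable;

@PersistenceCapable(detachable="true")
// Subclass rows live in the PERSON table (this is also the default).
@Inheritance(strategy=InheritanceStrategy.SUPERCLASS_TABLE)
public class Student extends Person {

    public Student(long id, String name) {
        super(id, name);
    }
}
```

With a discriminator in place, the datastore has a way to tell Person rows from Student rows, so the Person query should no longer materialize each row once per class and the Student query should no longer return Pete.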