Why is GraphDB's Preload Failing on "Queue file mismatch"?


I have a knowledge graph with an estimated 20 billion statements. When I load it with the Preload tool in GraphDB Enterprise, several errors appear. Why do they occur, and how can I complete the preload? The first error refers to a "Queue file mismatch". Next, the .

Ontotext writes sensitive data to logfiles (I've seen license keys in plaintext), so I'm hesitant to post the full output here.

GraphDB Version: 10.6.2

07:14:27.553 [repository-manager-0] INFO com.ontotext.trree.sdk.impl.PluginManager - Finished initializing plugins
07:14:27.558 [repository-manager-0] INFO com.ontotext.trree.MemoryConfig - Configured global page cache size to 16 MB
07:14:27.558 [repository-manager-0] INFO com.ontotext.trree.MemoryConfig - Max Heap Size: 128 GB
07:14:27.558 [repository-manager-0] INFO com.ontotext.trree.MemoryConfig - Initial Heap Size: 128 GB
07:14:28.142 [repository-manager-0] DEBUG com.ontotext.trree.OwlimSchemaRepository - Restored from persistence: true
07:14:28.162 [repository-manager-0] DEBUG com.ontotext.config.AbstractParameter - Configured parameter 'do.resolve.entities' to default value 'true'
07:14:28.163 [repository-manager-0] DEBUG com.ontotext.config.AbstractParameter - Configured parameter 'do.load.data' to default value 'true'
07:14:28.163 [repository-manager-0] INFO com.ontotext.rio.parallel.ParallelLoader - Data will be parsed + resolved + loaded.
07:14:28.163 [repository-manager-0] DEBUG com.ontotext.config.AbstractParameter - Configured parameter 'graphdb.persistent.parallel.inferencers' to default value 'false'
07:14:28.283 [repository-manager-0] DEBUG com.ontotext.license.b - Decrypting license
07:14:28.284 [repository-manager-0] DEBUG com.ontotext.license.b - Extracting license properties
07:14:28.284 [repository-manager-0] DEBUG com.ontotext.license.b - Extracted license properties: {owlim.license.licensee=UCSB_COVID19, owlim.license.maxNumCpuCores=16, owlim.license.typeOfUse=Non-commercial research only, owlim.license.product=GRAPHDB_ENTERPRISE, owlim.license.expiryDate=02-04-2025}
07:14:28.342 [repository-manager-0] DEBUG com.ontotext.trree.entitypool.impl.map.PersistedHashMap$MapTransactionUnit$1 - keyIndexSize=1000
07:14:28.356 [repository-manager-0] INFO com.ontotext.rio.parallel.ParallelLoader - Using 16 threads for inference
07:14:29.397 [main] INFO com.ontotext.graphdb.importrdf.Preload - Restore point detected
07:14:30.558 [main] ERROR com.ontotext.graphdb.importrdf.Preload - Could not initialize from a recovery point. restarting!
java.io.IOException: Queue file mismatch:/tmp/pos_q
    at com.ontotext.trree.util.CompressedSortedChunksFileQueue.<init>(CompressedSortedChunksFileQueue.java:345)
    at com.ontotext.graphdb.importrdf.RestoreManager.readQueueState(RestoreManager.java:756)
    at com.ontotext.graphdb.importrdf.RestoreManager.initFromPhaseTwoRestorePoint(RestoreManager.java:145)
    at com.ontotext.graphdb.importrdf.RestoreManager.initFromRestorePoint(RestoreManager.java:132)
    at com.ontotext.graphdb.importrdf.Preload.init(Preload.java:739)
    at com.ontotext.graphdb.importrdf.Preload.mainPreloadInternal(Preload.java:282)
    at com.ontotext.graphdb.importrdf.BaseLoadTool.mainInternal(BaseLoadTool.java:203)
    at com.ontotext.graphdb.importrdf.Preload.call(Preload.java:254)
    at com.ontotext.graphdb.importrdf.Preload.call(Preload.java:55)
    at picocli.CommandLine.executeUserObject(CommandLine.java:1953)
    at picocli.CommandLine.access$1300(CommandLine.java:145)
    at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2358)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2352)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2314)
    at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
    at picocli.CommandLine$RunLast.execute(CommandLine.java:2316)
    at picocli.CommandLine.execute(CommandLine.java:2078)
    at com.ontotext.graphdb.importrdf.ImportRDF.main(ImportRDF.java:31)
07:14:35.557 [main] INFO com.ontotext.trree.sdk.impl.PluginManager - Shutting down plugins (DEFAULT)...
07:14:35.557 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Received request to load class : java.lang.invoke.LambdaMetafactory resolve = false
07:14:35.557 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Trying to find class : java.lang.invoke.LambdaMetafactory in plugin classpath
07:14:35.557 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Class : java.lang.invoke.LambdaMetafactory not found in plugin classpath. Delegating to parent classloader
07:14:35.558 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Received request to load class : java.util.function.BiConsumer resolve = false
07:14:35.558 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Trying to find class : java.util.function.BiConsumer in plugin classpath
07:14:35.558 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Class : java.util.function.BiConsumer not found in plugin classpath. Delegating to parent classloader
07:14:35.566 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Received request to load class : com.ontotext.trree.plugin.externalsync.impl.entitychange.EntityChangePersistence resolve = false
07:14:35.566 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Trying to find class : com.ontotext.trree.plugin.externalsync.impl.entitychange.EntityChangePersistence in plugin classpath
07:14:35.601 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Received request to load class : com.ontotext.trree.plugin.externalsync.impl.entitychange.EntityChangePersistence$RepositorySupplier resolve = false
07:14:35.602 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Trying to find class : com.ontotext.trree.plugin.externalsync.impl.entitychange.EntityChangePersistence$RepositorySupplier in plugin classpath
07:14:35.602 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Received request to load class : com.ontotext.trree.plugin.externalsync.impl.entitychange.EntityChangePersistence$CachedRepositorySupplier resolve = false
07:14:35.602 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Trying to find class : com.ontotext.trree.plugin.externalsync.impl.entitychange.EntityChangePersistence$CachedRepositorySupplier in plugin classpath
07:14:35.603 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Received request to load class : org.eclipse.rdf4j.sail.Sail resolve = false
07:14:35.603 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Trying to find class : org.eclipse.rdf4j.sail.Sail in plugin classpath
07:14:35.603 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Class : org.eclipse.rdf4j.sail.Sail not found in plugin classpath. Delegating to parent classloader
07:14:35.603 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Received request to load class : org.eclipse.rdf4j.repository.Repository resolve = false
07:14:35.603 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Trying to find class : org.eclipse.rdf4j.repository.Repository in plugin classpath
07:14:35.603 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Class : org.eclipse.rdf4j.repository.Repository not found in plugin classpath. Delegating to parent classloader
07:14:35.797 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Received request to load class : java.lang.invoke.LambdaMetafactory resolve = false
07:14:35.797 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Trying to find class : java.lang.invoke.LambdaMetafactory in plugin classpath
07:14:35.798 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Class : java.lang.invoke.LambdaMetafactory not found in plugin classpath. Delegating to parent classloader
07:14:35.798 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Received request to load class : com.ontotext.trree.plugin.externalsync.impl.kafka.KafkaProducerRegistry$ReferencedProducer resolve = false
07:14:35.798 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Trying to find class : com.ontotext.trree.plugin.externalsync.impl.kafka.KafkaProducerRegistry$ReferencedProducer in plugin classpath
07:14:35.819 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Received request to load class : java.util.function.Predicate resolve = false
07:14:35.819 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Trying to find class : java.util.function.Predicate in plugin classpath
07:14:35.819 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Class : java.util.function.Predicate not found in plugin classpath. Delegating to parent classloader
07:14:36.063 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Received request to load class : java.util.Collection resolve = false
07:14:36.063 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Trying to find class : java.util.Collection in plugin classpath
07:14:36.064 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Class : java.util.Collection not found in plugin classpath. Delegating to parent classloader
07:14:36.064 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Received request to load class : java.util.Iterator resolve = false
07:14:36.064 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Trying to find class : java.util.Iterator in plugin classpath
07:14:36.065 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Class : java.util.Iterator not found in plugin classpath. Delegating to parent classloader
07:14:36.065 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Received request to load class : java.util.concurrent.ExecutorService resolve = false
07:14:36.065 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Trying to find class : java.util.concurrent.ExecutorService in plugin classpath
07:14:36.066 [main] DEBUG com.ontotext.trree.sdk.impl.ServiceLocator$ExternalPluginLoader - Class : java.util.concurrent.ExecutorService not found in plugin classpath. Delegating to parent classloader
07:14:36.182 [main] INFO com.ontotext.trree.big.AVLRepository - NumberOfStatements = 70
07:14:36.182 [main] INFO com.ontotext.trree.big.AVLRepository - NumberOfExplicitStatements = 0
07:14:36.186 [main] INFO com.ontotext.trree.OwlimSchemaRepository - Shutting down entity pool
07:18:48.945 [main] INFO com.ontotext.trree.OwlimSchemaRepository - Entity pool was shut down
07:18:50.363 [main] ERROR com.ontotext.graphdb.importrdf.Preload - Failed to init from a recover point! It will be deleted!
java.io.IOException: Queue file mismatch:/tmp/pos_q
    at com.ontotext.trree.util.CompressedSortedChunksFileQueue.<init>(CompressedSortedChunksFileQueue.java:345)
    at com.ontotext.graphdb.importrdf.RestoreManager.readQueueState(RestoreManager.java:756)
    at com.ontotext.graphdb.importrdf.RestoreManager.initFromPhaseTwoRestorePoint(RestoreManager.java:145)
    at com.ontotext.graphdb.importrdf.RestoreManager.initFromRestorePoint(RestoreManager.java:132)
    at com.ontotext.graphdb.importrdf.Preload.init(Preload.java:739)
    at com.ontotext.graphdb.importrdf.Preload.mainPreloadInternal(Preload.java:282)
    at com.ontotext.graphdb.importrdf.BaseLoadTool.mainInternal(BaseLoadTool.java:203)
    at com.ontotext.graphdb.importrdf.Preload.call(Preload.java:254)
    at com.ontotext.graphdb.importrdf.Preload.call(Preload.java:55)
    at picocli.CommandLine.executeUserObject(CommandLine.java:1953)
    at picocli.CommandLine.access$1300(CommandLine.java:145)
    at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2358)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2352)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2314)
    at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
    at picocli.CommandLine$RunLast.execute(CommandLine.java:2316)
    at picocli.CommandLine.execute(CommandLine.java:2078)
    at com.ontotext.graphdb.importrdf.ImportRDF.main(ImportRDF.java:31)

Earlier in the import process, I noticed the following log line. Is this a GraphDB bug in which an integer overflows while restoring 6258594017 entities, so that the reported allocation wraps around to -227.0 MB?

18:46:28.304 [repository-manager-0] DEBUG com.ontotext.trree.entitypool.impl.map.PersistedHashMap - keyIndexSize=20000000
18:46:28.636 [repository-manager-0] INFO com.ontotext.trree.entitypool.impl.map.PersistedHashMap - Restoring entity hash table...
19:19:13.676 [repository-manager-0] INFO com.ontotext.trree.entitypool.impl.map.PersistedHashMap - Restored 6258594017 entities allocating -227.0 MB
19:19:13.687 [repository-manager-0] INFO com.ontotext.trree.entitypool.impl.map.PersistedHashMap - Done in 1965039 ms.

Solution

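  • A similar bug was fixed in GraphDB 10.6.3; see the release notes: https://graphdb.ontotext.com/documentation/10.6/release-notes.html

    The issue is GDB-9998, "Integer overflow when using 40bit entities and more than 2B entities". The log in the question reports 6258594017 restored entities, which is well above 2B.

    Hopefully, upgrading from 10.6.2 to a GraphDB version that includes this fix will solve the problem.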
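
    To illustrate how an integer overflow can turn a large entity count into a negative size, here is a minimal Java sketch. It is not GraphDB's code: the class name, the method names, and the 8-bytes-per-entity cost are hypothetical, and it does not try to reproduce the exact -227.0 MB figure from the log. It only shows the general failure mode, where a count above the 32-bit signed range is narrowed to int and multiplied in 32-bit arithmetic.

```java
public class OverflowSketch {
    // Hypothetical per-entity byte cost, chosen only for illustration.
    static final int BYTES_PER_ENTITY = 8;

    // Overflow-prone pattern: the count is narrowed to a 32-bit int and the
    // product is computed in 32-bit arithmetic, so it silently wraps around.
    static int sizeInBytesInt(long entityCount) {
        return (int) entityCount * BYTES_PER_ENTITY;
    }

    // Safer pattern: stay in 64-bit arithmetic and fail loudly on overflow.
    static long sizeInBytesLong(long entityCount) {
        return Math.multiplyExact(entityCount, (long) BYTES_PER_ENTITY);
    }

    static double toMegabytes(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long restored = 6258594017L; // entity count taken from the log line

        System.out.println("Integer.MAX_VALUE = " + Integer.MAX_VALUE);
        System.out.printf("int arithmetic:  %.1f MB%n",
                toMegabytes(sizeInBytesInt(restored)));
        System.out.printf("long arithmetic: %.1f MB%n",
                toMegabytes(sizeInBytesLong(restored)));
    }
}
```

    With this assumed per-entity cost, the int-based calculation reports a negative size while the long-based one stays positive. A negative allocation figure next to an entity count above Integer.MAX_VALUE is consistent with the kind of overflow described in GDB-9998, although only Ontotext can confirm the exact code path.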