I am using the DSE Graphloader to load data from a CSV file into DSE Graph.
I took the following steps to load the data:
Created the graph:
system.graph('followTopic').create()
Set the Alias:
:remote config alias g followTopic.g
Created the following schema:
schema.propertyKey("id").Text().single().create()
schema.propertyKey("follower").Text().single().create()
schema.propertyKey("name").Text().single().create()
schema.propertyKey("type").Text().single().create()
schema.propertyKey("followed").Text().single().create()
schema.propertyKey("timestamp").Timestamp().single().create()
schema.vertexLabel("record").partitionKey("id").create()
schema.vertexLabel("user").properties("name").create()
Mappings file:
// CONFIGURATION
// Configures the data loader not to create the schema (it is created manually above)
config create_schema: false, load_new: true, load_threads: 3
// DATA INPUT
// Define the data input source (a file which can be specified via command line arguments)
// inputfiledir is the directory for the input files
inputfiledir = '/home/adminuser/data/'
followTopic = File.csv(inputfiledir + "follow.csv").delimiter(',')
//Specifies what data source to load using which mapper (as defined inline)
load(followTopic).asVertices {
label "record"
key "id"
}
The CSV I used is:
id,follower,followed,type,timestamp
1,@20cburns,topic_/best-friend,topic,5/7/2016 11:03:42 PM +00:00
2,@68,topic_/tears-fall,topic,5/3/2016 2:20:01 AM +00:00
3,@abba,topic_/best-friend,topic,6/15/2016 4:08:24 PM +00:00
…
Then, on running the graphloader command, I get the error shown below:
./graphloader ../followTopinMapping.groovy -filename ../follow.csv -graph followTopic -address localhost
Exception Error Message:
2016-08-22 14:18:00 ERROR DataLoaderImpl:519 - Graph driver attempts exceeded for this operation, logging failure, but no records are present (may have been a schema operation)
com.datastax.dsegraphloader.exception.TemporaryException: com.datastax.driver.core.exceptions.InvalidQueryException: DSE Graph not configured to process queries
at com.datastax.dsegraphloader.impl.loader.driver.DseGraphDriverImpl.executeGraphQuery(DseGraphDriverImpl.java:71)
at com.datastax.dsegraphloader.impl.loader.driver.DseGraphDriverImpl.executeGraphQuery(DseGraphDriverImpl.java:87)
at com.datastax.dsegraphloader.impl.loader.driver.DseGraphDriverImpl.getSchema(DseGraphDriverImpl.java:128)
at com.datastax.dsegraphloader.impl.loader.driver.SafeGraphDriver.lambda$tryGetSchema$14(SafeGraphDriver.java:94)
at com.datastax.dsegraphloader.impl.loader.DataLoaderImpl.execute(DataLoaderImpl.java:194)
at com.datastax.dsegraphloader.impl.loader.DataLoaderBuilder.execute(DataLoaderBuilder.java:101)
at com.datastax.dsegraphloader.cli.Executable.execute(Executable.java:69)
at com.datastax.dsegraphloader.cli.Executable.main(Executable.java:163)
Regarding the error message “DSE Graph not configured to process queries”: what do I need to configure so that the DSE Graphloader can load the data into DSE Graph?
I got help from the DataStax team to get this working.
Some settings on the original cluster were misconfigured, so we created a new cluster and changed a single setting in the /etc/default/dse file on each node to enable the DSE Graph service:
GRAPH_ENABLED=1
After that, I updated the schema so that the "record" vertex label declares its properties:
schema.propertyKey("id").Text().single().create()
schema.propertyKey("follower").Text().single().create()
schema.propertyKey("name").Text().single().create()
schema.propertyKey("type").Text().single().create()
schema.propertyKey("followed").Text().single().create()
schema.propertyKey("timestamp").Timestamp().single().create()
schema.vertexLabel("record").partitionKey("id").properties("follower", "name", "type", "followed", "timestamp").create()
Also, the timestamp value in the CSV should be a number for it to be loaded successfully via DSE Graphloader.
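For illustration, here is a minimal Groovy sketch of one way to turn timestamp strings like the ones in the CSV above into numbers before loading. It assumes the numeric form wanted is epoch milliseconds, which is my assumption and not something confirmed here. toEpochMillis is a hypothetical helper name, not part of DSE Graphloader.

```groovy
import java.time.OffsetDateTime
import java.time.format.DateTimeFormatter

// Pattern for values such as "5/7/2016 11:03:42 PM +00:00"
// (month/day/year, 12-hour clock with AM/PM, UTC offset with a colon)
def formatter = DateTimeFormatter.ofPattern('M/d/yyyy h:mm:ss a xxx', Locale.ENGLISH)

// Hypothetical helper: parse the CSV text and return epoch milliseconds (assumed target format)
def toEpochMillis = { String text ->
    OffsetDateTime.parse(text, formatter).toInstant().toEpochMilli()
}

// Convert the timestamp values of the first two sample rows
['5/7/2016 11:03:42 PM +00:00', '5/3/2016 2:20:01 AM +00:00'].each { raw ->
    println "${raw} -> ${toEpochMillis(raw)}"
}
```

With this sketch, the first sample row's timestamp becomes 1462662222000. The same closure could be applied to every row's timestamp column when rewriting follow.csv before running the loader.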
After these changes, I was able to load the data successfully via DSE Graphloader.