I am trying to load a simple JSON data file into OrientDB using the oetl.sh utility.
Here is my input data file (/tmp/databases/test_data1/database.json):
[
{
"id": 1,
"name" : "xyz"
},
{
"id": 2,
"name" : "pqr"
},
{
"id": 3,
"name" : "abc"
}
]
Here is my ETL config file (/tmp/json_import_config.json):
{
"config": {
"log": "debug"
},
"source" : {
"file": { "path": "/tmp/databases/test_data1/database.json" }
},
"extractor" : {
"json": {}
},
"transformers": [
{
"log": {}
}
],
"loader" : {
"orientdb": {
"dbURL": "plocal:/opt/orientdb/databases/example3",
"dbUser": "admin",
"dbPassword": "admin",
"dbAutoDropIfExists": true,
"dbAutoCreate": true,
"standardElementConstraints": false,
"tx": false,
"wal": false,
"batchCommit": 1000,
"dbType": "document",
"classes": [{"name": "Account"}]
}
},
"end": []
}
Here is the command that I am using:
./oetl.sh /tmp/json_import_config.json
Here is the output:
OrientDB etl v.2.2.20 (build 76ab59e72943d0ba196188ed100c882be4315139) https://www.orientdb.com
[file] INFO Load from file /tmp/databases/test_data1/database.json
[orientdb] INFO Dropping existent database 'plocal:/opt/orientdb/databases/example3'...
BEGIN ETL PROCESSOR
[file] INFO Reading from file /tmp/databases/test_data1/database.json with encoding UTF-8
Started execution with 1 worker threads
+ extracted 0 entries (0 entries/sec) - 0 entries -> loaded 0 documents (0 documents/sec) Total time: 1000ms [0 warnings, 0 errors]
[orientdb] DEBUG - OrientDBLoader: created class 'Account'
[orientdb] DEBUG orientdb: found 0 documents in class 'null'
Start extracting
[0:log] DEBUG Transformer input: {id:1,name:xyz}
Extraction completed
[0:log] INFO {id:1,name:xyz}
[0:log] DEBUG Transformer output: {id:1,name:xyz}
Pipeline execution halted
2018-12-06 13:47:41:386 SEVER {db=example3} ETL process halted: com.orientechnologies.orient.etl.OETLProcessHaltedException: Cannot insert new document {id:1,name:xyz} because it has not class [OETLProcessor$OETLPipelineWorker][orientdb] INFO committing
Pipeline worker done without errors: false
END ETL PROCESSOR
+ extracted 3 entries (15 entries/sec) - 3 entries -> loaded 0 documents (0 documents/sec) Total time: 1190ms [0 warnings, 1 errors]
I need help resolving this issue. I would also like to know whether OrientDB is a good choice when used only as a document store, since I did not find many use cases for that. Most of the use cases I found are about the graph model.
Your configuration is almost right. The loader halts because the documents coming out of the pipeline have no class, so you need to assign the class to each document being processed. Add a field transformer that sets the class name:
"transformers": [
{
"log": {}
},
{
"field": {
"fieldName": "@class",
"value": "Account"
}
}],
I tested this locally; this is the output from the console:
orientdb {db=docDb}> select from Account
+----+-----+-------+----+----+
|# |@RID |@CLASS |id |name|
+----+-----+-------+----+----+
|0 |#25:0|Account|1 |xyz |
|1 |#26:0|Account|2 |pqr |
|2 |#27:0|Account|3 |abc |
+----+-----+-------+----+----+
3 item(s) found. Query executed in 0.006 sec(s).
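If you would rather patch the config file from a script than edit it by hand, the same change can be sketched in Python. This is only an illustration: the helper name `add_class_transformer` is hypothetical, and the config shown is a trimmed copy of the one in the question (the loader section is left out for brevity). It appends the field transformer shown above only when no `@class` transformer is present yet.

```python
import json


def add_class_transformer(config, class_name):
    """Return a copy of an ETL config whose pipeline sets @class on each document.

    Hypothetical helper for illustration; it is not part of OrientDB.
    """
    # Deep copy through a JSON round trip so the caller's config is untouched.
    patched = json.loads(json.dumps(config))
    transformers = patched.setdefault("transformers", [])

    # Skip the append if a field transformer already targets @class.
    already_set = any(
        t.get("field", {}).get("fieldName") == "@class" for t in transformers
    )
    if not already_set:
        transformers.append(
            {"field": {"fieldName": "@class", "value": class_name}}
        )
    return patched


# Trimmed copy of the config from the question (loader omitted for brevity).
config = {
    "source": {"file": {"path": "/tmp/databases/test_data1/database.json"}},
    "extractor": {"json": {}},
    "transformers": [{"log": {}}],
}

patched = add_class_transformer(config, "Account")
print(json.dumps(patched["transformers"], indent=2))
```

To apply it to the real file, you would read /tmp/json_import_config.json with `json.load`, pass the result through the helper, and write it back with `json.dump` before running oetl.sh again.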