Unable to load JSON into OrientDB Class


I am trying to load a simple JSON data file into OrientDB using the oetl.sh utility.

Here is my input data file (/tmp/databases/test_data1/database.json):

[
 {
   "id": 1,
   "name" : "xyz"
 },
 {
   "id": 2,
   "name" : "pqr"
 },
 {
   "id": 3,
   "name" : "abc"
 }
]

Here is my JSON config file (/tmp/json_import_config.json):

{
  "config": {
    "log": "debug"
  },
  "source": {
    "file": { "path": "/tmp/databases/test_data1/database.json" }
  },
  "extractor": {
    "json": {}
  },
  "transformers": [
    {
      "log": {}
    }
  ],
  "loader": {
    "orientdb": {
      "dbURL": "plocal:/opt/orientdb/databases/example3",
      "dbUser": "admin",
      "dbPassword": "admin",
      "dbAutoDropIfExists": true,
      "dbAutoCreate": true,
      "standardElementConstraints": false,
      "tx": false,
      "wal": false,
      "batchCommit": 1000,
      "dbType": "document",
      "classes": [{"name": "Account"}]
    }
  },
  "end": []
}

Here is the command that I am using:

./oetl.sh /tmp/json_import_config.json

Here is the output:

OrientDB etl v.2.2.20 (build 76ab59e72943d0ba196188ed100c882be4315139) https://www.orientdb.com
[file] INFO Load from file /tmp/databases/test_data1/database.json
[orientdb] INFO Dropping existent database 'plocal:/opt/orientdb/databases/example3'...
BEGIN ETL PROCESSOR
[file] INFO Reading from file /tmp/databases/test_data1/database.json with encoding UTF-8
Started execution with 1 worker threads
+ extracted 0 entries (0 entries/sec) - 0 entries -> loaded 0 documents (0 documents/sec) Total time: 1000ms [0 warnings, 0 errors]
[orientdb] DEBUG - OrientDBLoader: created class 'Account'
[orientdb] DEBUG orientdb: found 0 documents in class 'null'
Start extracting
[0:log] DEBUG Transformer input: {id:1,name:xyz}
Extraction completed
[0:log] INFO {id:1,name:xyz}
[0:log] DEBUG Transformer output: {id:1,name:xyz}
Pipeline execution halted

2018-12-06 13:47:41:386 SEVER {db=example3} ETL process halted: com.orientechnologies.orient.etl.OETLProcessHaltedException: Cannot insert new document {id:1,name:xyz} because it has not class [OETLProcessor$OETLPipelineWorker]
[orientdb] INFO committing
Pipeline worker done without errors: false
END ETL PROCESSOR
+ extracted 3 entries (15 entries/sec) - 3 entries -> loaded 0 documents (0 documents/sec) Total time: 1190ms [0 warnings, 1 errors]

I need help resolving this issue. I would also like to know whether OrientDB is a good choice when used only as a document store, since I did not find many use cases for that. Most of the use cases I found relate to graphs.


Solution

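  • Your configuration is almost right, but you need to assign a class to each document processed by the pipeline. As the log shows, the "classes" entry in the loader creates the Account class, yet the insert still fails because the document itself has no class. Add a field transformer that sets the class name: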

    "transformers": [
      {
        "log": {}
      },
      {
        "field": {
          "fieldName": "@class",
          "value": "Account"
        }
      }
    ],

    I tested this locally; here is the output from the console:

    orientdb {db=docDb}> select from Account
    
    +----+-----+-------+----+----+
    |#   |@RID |@CLASS |id  |name|
    +----+-----+-------+----+----+
    |0   |#25:0|Account|1   |xyz |
    |1   |#26:0|Account|2   |pqr |
    |2   |#27:0|Account|3   |abc |
    +----+-----+-------+----+----+
    
    3 item(s) found. Query executed in 0.006 sec(s).
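
For reference, merging the transformer from the answer into the config from the question would give roughly the file below. This is only a sketch assembled from the two snippets in this thread: the paths, database URL, and credentials are the ones from the question, nothing else has been changed, and JSON does not allow inline comments, so the only difference from the original file (the added "field" transformer) is noted here rather than in the file itself.

```json
{
  "config": {
    "log": "debug"
  },
  "source": {
    "file": { "path": "/tmp/databases/test_data1/database.json" }
  },
  "extractor": {
    "json": {}
  },
  "transformers": [
    {
      "log": {}
    },
    {
      "field": {
        "fieldName": "@class",
        "value": "Account"
      }
    }
  ],
  "loader": {
    "orientdb": {
      "dbURL": "plocal:/opt/orientdb/databases/example3",
      "dbUser": "admin",
      "dbPassword": "admin",
      "dbAutoDropIfExists": true,
      "dbAutoCreate": true,
      "standardElementConstraints": false,
      "tx": false,
      "wal": false,
      "batchCommit": 1000,
      "dbType": "document",
      "classes": [{"name": "Account"}]
    }
  },
  "end": []
}
```

Running the same command as in the question (./oetl.sh /tmp/json_import_config.json) and then querying with select from Account should show whether the three documents were loaded into the Account class.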