orientdb, orientdb2.2, orientdb-etl

Is it possible to import line-wise JSON into OrientDB using their ETL tool?


I have a bunch of files (~10 GB each) where each line is a single JSON object. I want to import them in streaming mode, but it looks like this is not supported right now (OrientDB v2.2.12). Are there any workarounds? What is the recommended approach for this case?


Solution

  • It looks like each JSON line can be converted to an ODocument in a code block:

    {
        "code": {
            "language": "Javascript",
            "code": "(new com.orientechnologies.orient.core.record.impl.ODocument()).fromJSON(input);"
        }
    }
    

    If you experience errors like:

    Error in Pipeline execution: com.orientechnologies.orient.core.exception.OSerializationException: Found invalid } character at position 112 of text

    then make sure the multiLine option of the row extractor is set to false:

    "extractor": {
        "row": {
            "multiLine": false
        }
    }
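
    Putting the two fragments together, a complete ETL configuration might look roughly like the sketch below. This is an untested outline, not a verified setup: the file path, database URL, and class name are placeholders made up for illustration, and it assumes the code block above is placed in the "transformers" list and that records are loaded into a document database. JSON has no comment syntax, so the placeholder values are named to make the assumptions visible.

    ```json
    {
        "source": {
            "file": { "path": "/hypothetical/path/to/objects.json" }
        },
        "extractor": {
            "row": {
                "multiLine": false
            }
        },
        "transformers": [
            {
                "code": {
                    "language": "Javascript",
                    "code": "(new com.orientechnologies.orient.core.record.impl.ODocument()).fromJSON(input);"
                }
            }
        ],
        "loader": {
            "orientdb": {
                "dbURL": "plocal:/hypothetical/databases/imported",
                "dbType": "document",
                "class": "HypotheticalImportedRecord"
            }
        }
    }
    ```

    With a row extractor the file is read line by line, so each JSON object is handed to the code block as a single string, which should keep memory use bounded even for ~10 GB inputs.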