
OrientDB: importing a CSV file into the document model


I'm trying to import a CSV file into the document model in OrientDB using ETL. I'm new to this and there isn't much documentation on the document model, so I don't know whether this is correct. This is what I tried:

{
  "config": {
    "log": "debug"
  },
  "begin": [],
  "source": {
    "file": {
      "path": "C:/Users/M/Desktop/files/lact.csv"
    }
  },
  "extractor": {
    "csv": {
      "separator": ",",
      "nullValue": "NULL"
    }
  },
  "transformers": [
    {
      "log": {}
    }
  ],
  "loader": {
    "orientdb": {
      "dbURL": "plocal:../databases/Model_doc",
      "dbType": "document",
      "classes": [
        {
          "name": "Annotations"
        }
      ]
    }
  },
  "end": []
}

After the log prints the parsed contents of the file, I get this message: [orientdb] DEBUG orientdb: found 0 documents in class 'null'

CSV file:

"Entry","Entry_name","Status","Protein_names","Gene_names","Organism","Length","Cross_reference(STRING)"
"Q29836","1B67_HUMAN","reviewed","HLA class I histocompatibility antigen, B-67 alpha chain (MHC class I antigen B*67)","HLA-B HLAB","Homo sapiens (Human)","362","9606.ENSP00000399168;"
"P30501","1C02_HUMAN","reviewed","HLA class I histocompatibility antigen, Cw-2 alpha chain (MHC class I antigen Cw*2)","HLA-C HLAC","Homo sapiens (Human)","366",""
"P30508","1C12_HUMAN","reviewed","HLA class I histocompatibility antigen, Cw-12 alpha chain (MHC class I antigen Cw*12)","HLA-C HLAC","Homo sapiens (Human)","366",""
"Q29960","1C16_HUMAN","reviewed","HLA class I histocompatibility antigen, Cw-16 alpha chain (MHC class I antigen Cw*16)","HLA-C HLAC","Homo sapiens (Human)","366",""
"Q29865","1C18_HUMAN","reviewed","HLA class I histocompatibility antigen, Cw-18 alpha chain (MHC class I antigen Cw*18)","HLA-C HLAC","Homo sapiens (Human)","366",""

Solution

  • I tried your configuration and got the same message:

    [orientdb] DEBUG orientdb: found 0 documents in class 'null'
    

    However, I was still able to import all the data, as my screenshot showed.

    [screenshot]

    To do that, as @RobertoFranchini said, you have to add a second transformer that sets the @class field to your class name:

    "transformers": [
      {
        "log": {}
      },
      {
        "field": {
          "fieldName": "@class",
          "value": "Annotations"
        }
      }
    ],
    

    I also made a small change to your CSV file, removing the double quotes and the trailing semicolon:

    Entry,Entry_name,Status,Protein_names,Gene_names,Organism,Length,Cross_reference(STRING)
    Q29836,1B67_HUMAN,reviewed,HLA class I histocompatibility antigen, B-67 alpha chain (MHC class I antigen B*67),HLA-B HLAB,Homo sapiens (Human),362,9606.ENSP00000399168
    P30501,1C02_HUMAN,reviewed,HLA class I histocompatibility antigen, Cw-2 alpha chain (MHC class I antigen Cw*2),HLA-C HLAC,Homo sapiens (Human),366,
    P30508,1C12_HUMAN,reviewed,HLA class I histocompatibility antigen, Cw-12 alpha chain (MHC class I antigen Cw*12),HLA-C HLAC,Homo sapiens (Human),366,
    Q29960,1C16_HUMAN,reviewed,HLA class I histocompatibility antigen, Cw-16 alpha chain (MHC class I antigen Cw*16),HLA-C HLAC,Homo sapiens (Human),366,
    Q29865,1C18_HUMAN,reviewed,HLA class I histocompatibility antigen, Cw-18 alpha chain (MHC class I antigen Cw*18),HLA-C HLAC,Homo sapiens (Human),366,
    

    With these changes, all the data was imported.

    Hope it helps.

    Regards.
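
One thing worth checking in the unquoted CSV above: the Protein_names values contain a comma, so without quotes a comma-separated parser may split them into two fields. The sketch below counts fields per line using Python's standard csv module. It is not OrientDB's own CSV extractor, which may behave differently, so treat it only as an illustration of the general splitting behaviour. The constant and function names are made up for this example.

```python
import csv
import io

# Header and first data row as posted in the question (quoted).
HEADER = ('"Entry","Entry_name","Status","Protein_names","Gene_names",'
          '"Organism","Length","Cross_reference(STRING)"')
QUOTED_ROW = ('"Q29836","1B67_HUMAN","reviewed",'
              '"HLA class I histocompatibility antigen, B-67 alpha chain '
              '(MHC class I antigen B*67)",'
              '"HLA-B HLAB","Homo sapiens (Human)","362","9606.ENSP00000399168;"')

# The same row as posted in the answer (quotes removed).
UNQUOTED_ROW = ('Q29836,1B67_HUMAN,reviewed,'
                'HLA class I histocompatibility antigen, B-67 alpha chain '
                '(MHC class I antigen B*67),'
                'HLA-B HLAB,Homo sapiens (Human),362,9606.ENSP00000399168')


def count_fields(line: str) -> int:
    """Parse one CSV line with ',' as separator and '"' as quote character."""
    reader = csv.reader(io.StringIO(line), delimiter=",", quotechar='"')
    return len(next(reader))


if __name__ == "__main__":
    print("header:      ", count_fields(HEADER))        # 8
    print("quoted row:  ", count_fields(QUOTED_ROW))    # 8
    print("unquoted row:", count_fields(UNQUOTED_ROW))  # 9
```

If the unquoted row yields more fields than the header, as it does here, the comma inside Protein_names is being read as a separator. Keeping the original quotes avoids that in parsers that honour quoting, so it may be worth retrying the import with the quoted file and only the extra transformer added.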