
OrientDB ETL from CSV DateTime


This is my current config file:

{
	"config": {
		"haltOnError": false
	},
	"source": {
		"file": {
			"path": "/home/user1/temp/real_user/user3.csv"
		}
	},
	"extractor": {
		"csv": {
			"columns": ["id", "name", "token", "username", "password", "created", "updated", "enabled", "is_admin", "is_banned", "userAvatar"],
			"columnsOnFirstLine": true
		},
		"field": {
			"fieldName": "created",
			"expression": "created.asDateTime()"
		}
	},
	"transformers": [{
		"vertex": {
			"class": "user"
		}
	}],
	"loader": {
		"orientdb": {
			"dbURL": "plocal:/home/user1/orientdb/real_user",
			"dbAutoCreateProperties": true,
			"dbType": "graph",
			"classes": [{
				"name": "user",
				"extends": "V"
			}],
			"indexes": [{
				"class": "user",
				"fields": ["id:long"],
				"type": "UNIQUE"
			}]
		}
	}
}

and my CSV currently looks like this:

6,Olivia Ong,2jkjkl54k5jklj5k4j5k4jkkkjjkj,\N,\N,2013-11-15 16:36:33,2013-11-15 16:36:33,1,0,\N,\N
7,Matthew,32kj4h3kjh44hjk3hk43hkhhkjhasd,\N,\N,2013-11-18 17:29:13,2013-11-15 16:36:33,1,0,\N,\N

When I execute the ETL, OrientDB still doesn't recognize my datetime values as datetimes.

I tried putting the data type in the column definition, as in "created:datetime", but then no data showed up at all.

What is the proper solution for this case?


Solution

  • Starting with the next version, 2.2.8, you will be able to define different default patterns for date and datetime values; see the CSV extractor documentation.

    Note that when you define the columns, you need to specify each column's type:

                "columns": ["id:string", "created:date", "updated:datetime"],
    

    You can use the 2.2.8 snapshot JAR of the ETL module with 2.2.7 without any problems:

    https://oss.sonatype.org/content/repositories/snapshots/com/orientechnologies/orientdb-etl/2.2.8-SNAPSHOT/
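
Applied to the config from the question, the extractor section might look roughly like the sketch below. It is an illustration, not a tested configuration. It assumes the CSV extractor options are named "dateFormat", "dateTimeFormat" and "nullValue", as I recall them from the CSV extractor documentation, so check the names against the documentation for your version. The pattern "yyyy-MM-dd HH:mm:ss" is chosen to match sample values such as 2013-11-15 16:36:33. "columnsOnFirstLine" is set to false on the assumption that the file has no header row, as in the sample shown. The "id:long" type is an assumption that mirrors the index definition, and the remaining columns are simply left as strings.

```json
{
	"extractor": {
		"csv": {
			"columns": ["id:long", "name:string", "token:string", "username:string", "password:string", "created:datetime", "updated:datetime", "enabled:string", "is_admin:string", "is_banned:string", "userAvatar:string"],
			"columnsOnFirstLine": false,
			"nullValue": "\\N",
			"dateFormat": "yyyy-MM-dd",
			"dateTimeFormat": "yyyy-MM-dd HH:mm:ss"
		}
	}
}
```

With typed columns the "field" block from the question should no longer be needed. In any case, "field" is a transformer in OrientDB ETL, so it would belong in the "transformers" array rather than under "extractor".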