This is related to my earlier questions
In response to (2), I am looking at modifying my nazca geoglyph dataset to use the WKT version to be consistent with the newer OrientDB 2.2.x Spatial Index functionality.
My input CSV file, nazca_lines_wkt.csv
is this:
Name,Location
Hummingbird,POINT(-75.148892 -14.692131)
Monkey,POINT(-75.138532 -14.706940)
Condor,POINT(-75.126208 -14.697444)
Spider,POINT(-75.122381 -14.694145)
Spiral,POINT(-75.122746 -14.688277)
Hands,POINT(-75.113881 -14.694459)
Tree,POINT(-75.114520 -14.693898)
Astronaut,POINT(-75.079755 -14.745222)
Dog,POINT(-75.130788 -14.706401)
Wing,POINT(-75.100385 -14.680309)
Parrot,POINT(-75.107498 -14.689463)
I create an empty PLOCAL database, nazca-wkt.orientdb
and define a GeoGlyphWKT vertex class:
CREATE DATABASE PLOCAL:nazca-wkt.orientdb admin admin plocal graph
CREATE CLASS GeoGlyphWKT EXTENDS V
CREATE PROPERTY GeoGlyphWKT.Name STRING
CREATE PROPERTY GeoGlyphWKT.Location EMBEDDED OPoint
CREATE PROPERTY GeoGlyphWKT.Tag EMBEDDEDSET STRING
I have two .json files that I use for the oetl script:
nazca_lines_wkt.json
{
"config": {
"log": "info",
"fileDirectory": "./",
"fileName": "nazca_lines_wkt.csv"
}
}
commonGeoGlyphWKT.json
{
"begin": [ { "let": { "name": "$filePath", "expression": "$fileDirectory.append($fileName )" } } ],
"config": { "log": "debug" },
"source": { "file": { "path": "$filePath" } },
"extractor":
{
"csv": { "ignoreEmptyLines": true,
"nullValue": "N/A",
"separator": ",",
"columnsOnFirstLine": true,
"dateFormat": "yyyy-MM-dd"
}
},
"transformers": [ { "vertex": { "class": "GeoGlyphWKT" } } ],
"loader": {
"orientdb": {
"dbURL": "plocal:nazca-wkt.orientdb",
"dbType": "graph",
"batchCommit": 1000
}
}
}
I run oetl using this command:
$ oetl.sh commonGeoGlyphWKT.json nazca_lines_wkt.json
but this fails with the following output:
$ oetl.sh commonGeoGlyphWKT.json nazca_lines_wkt.json
OrientDB etl v.2.2.13 (build 2.2.x@r90d7caa1e4af3fad86594e592c64dc1202558ab1; 2016-11-15 12:04:05+0000) www.orientdb.com
BEGIN ETL PROCESSOR
[file] INFO Reading from file ./nazca_lines_wkt.csv with encoding UTF-8
Started execution with 1 worker threads
Error in Pipeline execution: com.orientechnologies.orient.core.exception.OValidationException: impossible to convert value of field "Location"
DB name="nazca-wkt.orientdb"
ETL process has problem: java.util.concurrent.ExecutionException: com.orientechnologies.orient.core.exception.OValidationException: impossible to convert value of field "Location"
DB name="nazca-wkt.orientdb"
END ETL PROCESSOR
+ extracted 9 rows (0 rows/sec) - 9 rows -> loaded 0 vertices (0 vertices/sec) Total time: 16ms [0 warnings, 1 errors]
I'm sure it's something silly that I'm missing... has anyone been able to import CSV files containing WKT strings for points, polygons, etc using ETL?
Any help is appreciated!
this is working for me:
commonGeoGlyphWKT.json
{
"source": { "file": { "path": "./nazca_lines_wkt.csv" } },
"extractor": { "csv": {
"separator": ",",
"columns": ["Name:String","Location:String"] } },
"transformers": [
{ "command": { "command": "INSERT INTO GeoGlyphWKT(Name,Location) values('${input.Name}', St_GeomFromText('${input.Location}'))"} }
],
"loader": {
"orientdb": {
"dbURL": "plocal:/home/ivan/OrientDB/db_installati/enterprise/orientdb-enterprise-2.2.13/databases/stack40982509-spatial",
"dbUser": "admin",
"dbPassword": "admin",
"dbType": "graph",
"batchCommit": 1000
}
}
}
nazca_lines_wkt.csv
Name,Location
Hummingbird,POINT (-75.148892 -14.692131)
Monkey,POINT (-75.138532 -14.706940)
Condor,POINT(-75.126208 -14.697444)
Spider,POINT(-75.122381 -14.694145)
Spiral,POINT(-75.122746 -14.688277)
Hands,POINT(-75.113881 -14.694459)
Tree,POINT(-75.114520 -14.693898)
Astronaut,POINT(-75.079755 -14.745222)
Dog,POINT(-75.130788 -14.706401)
Wing,POINT(-75.100385 -14.680309)
Parrot,POINT(-75.107498 -14.689463)
[ivan@canemagico-pc bin]$ ./oetl.sh commonGeoGlyphWKT2.json
OrientDB etl v.2.2.13 (build 2.2.x@r90d7caa1e4af3fad86594e592c64dc1202558ab1; 2016-11-15 12:04:05+0000) www.orientdb.com
[csv] INFO column types: {Name=STRING, Location=STRING}
BEGIN ETL PROCESSOR
[file] INFO Reading from file ./nazca_lines_wkt.csv with encoding UTF-8
Started execution with 1 worker threads
[orientdb] INFO committing
END ETL PROCESSOR
+ extracted 11 rows (0 rows/sec) - 11 rows -> loaded 11 vertices (0 vertices/sec) Total time: 244ms [0 warnings, 0 errors]
orientdb {db=stack40982509-spatial}> select from GeoGlyphWKT
+----+-----+-----------+-----------+-----------------------+
|# |@RID |@CLASS |Name |Location |
+----+-----+-----------+-----------+-----------------------+
|0 |#25:0|GeoGlyphWKT|Hummingbird|OPoint{coordinates:[2]}|
|1 |#25:1|GeoGlyphWKT|Spiral |OPoint{coordinates:[2]}|
|2 |#25:2|GeoGlyphWKT|Dog |OPoint{coordinates:[2]}|
|3 |#26:0|GeoGlyphWKT|Monkey |OPoint{coordinates:[2]}|
|4 |#26:1|GeoGlyphWKT|Hands |OPoint{coordinates:[2]}|
|5 |#26:2|GeoGlyphWKT|Wing |OPoint{coordinates:[2]}|
|6 |#27:0|GeoGlyphWKT|Condor |OPoint{coordinates:[2]}|
|7 |#27:1|GeoGlyphWKT|Tree |OPoint{coordinates:[2]}|
|8 |#27:2|GeoGlyphWKT|Parrot |OPoint{coordinates:[2]}|
|9 |#28:0|GeoGlyphWKT|Spider |OPoint{coordinates:[2]}|
|10 |#28:1|GeoGlyphWKT|Astronaut |OPoint{coordinates:[2]}|
+----+-----+-----------+-----------+-----------------------+
11 item(s) found. Query executed in 0.013 sec(s).