neo4j rdf jena

Import RDF (XML or Turtle) into Neo4j


I downloaded the Freebase database dump. The file is in RDF Turtle format. I need to get all of the data into a Neo4j database.

I already wrote an importer with the help of tinkerpop.blueprints. At first it seemed to work, but after 30 minutes of importing an exception occurred because the RDF file contained characters at positions where they are not allowed. A little later, after some investigation, I found out that the Jena parser I used (RDFReader) is deprecated and shouldn't be used.

What I need to know now:

Is there any way to import that RDF file into Neo4j? Jena is able to transform the data into seven different file formats: .ttl, .rdf, .nt, .jsonld, .owl, .trig, .nq.

Is there an importer for one (or more) of these file formats?


Solution

  • If by importer you mean an executable to which you can pass an RDF file as a parameter, then no, as far as I know. You will have to write code, but probably not very much.

    Your best bet is probably to read the Neo4j Linked Data pages, specifically the blog posts by Michael Bach about importing Turtle Ontologies and Stefanie Wiegand about OWL in Neo4j.

    Since you mention Blueprints, you may want to look at using Sesame and Sail. You should be able to treat Neo4j as a triple store and get the same interface you would use for a Freebase triple store. See dbpedia4neo for an example of how this is used for importing DBpedia dumps; your situation should be analogous.

    You indicate that you have trouble parsing the Freebase data, however. If your data is corrupt, you will have to handle that regardless of how you choose to interact with Neo4j. I've had good experiences with Jena's Models, both the default and ontology ones, for various projects, and I'm not sure why you think they shouldn't be used. Is it possible that what you need is to tweak the importer that you have already written, rather than a new approach altogether?
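To illustrate the point about corrupt data, here is a minimal sketch of an importer front end that skips bad statements instead of aborting after 30 minutes. It assumes the dump has first been converted to N-Triples (one statement per line), so a malformed statement only costs that one line. The class `TolerantTripleReader`, its nested types, and the simplified regular expression are all hypothetical; this is not Jena's parser and not a Neo4j API. It covers only IRI subjects and predicates with IRI or literal objects, so lines containing blank nodes are rejected too.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: reads N-Triples line by line and records bad lines instead of failing. */
class TolerantTripleReader {

    /** One parsed statement. `object` is either an IRI or the literal's lexical form. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;
        final boolean objectIsIri; // true: candidate relationship; false: candidate node property

        Triple(String subject, String predicate, String object, boolean objectIsIri) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
            this.objectIsIri = objectIsIri;
        }
    }

    /** Usable triples plus the 1-based numbers of the lines that were rejected. */
    static final class Result {
        final List<Triple> triples = new ArrayList<>();
        final List<Integer> rejectedLines = new ArrayList<>();
    }

    // Simplified subset of N-Triples (an assumption of this sketch, not the full grammar):
    // <subject> <predicate> (<object> | "literal" with optional @lang or ^^<datatype>) .
    private static final Pattern LINE = Pattern.compile(
        "^\\s*<([^<>\"\\s]+)>\\s+<([^<>\"\\s]+)>\\s+"
        + "(?:<([^<>\"\\s]+)>|\"((?:[^\"\\\\]|\\\\.)*)\"(?:@[A-Za-z0-9-]+|\\^\\^<[^<>\"\\s]+>)?)"
        + "\\s*\\.\\s*$");

    /** Parses a single line; returns null when the line does not match the subset above. */
    static Triple parseLine(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        boolean iri = m.group(3) != null;
        return new Triple(m.group(1), m.group(2), iri ? m.group(3) : m.group(4), iri);
    }

    /** Streams the input, so memory use does not grow with the size of the dump itself. */
    static Result read(BufferedReader in) throws IOException {
        Result result = new Result();
        String line;
        int lineNo = 0;
        while ((line = in.readLine()) != null) {
            lineNo++;
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue; // blank lines and comments are not errors
            }
            Triple t = parseLine(line);
            if (t == null) {
                result.rejectedLines.add(lineNo); // remember the bad line and keep going
            } else {
                result.triples.add(t);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        String sample =
            "<http://example.org/a> <http://example.org/name> \"Alice\"@en .\n"
            + "<http://example.org/a> <http://example.org/knows> <http://example.org/b> .\n"
            + "<http://example.org/bad iri> <http://example.org/p> \"x\" .\n";
        Result r = read(new BufferedReader(new StringReader(sample)));
        System.out.println(r.triples.size() + " triples, rejected lines " + r.rejectedLines);
    }
}
```

The `objectIsIri` flag is one possible way to decide how a triple lands in Neo4j: IRI objects become relationships between two nodes, literal objects become properties on the subject node. In a real importer you would write each accepted triple to the graph as it is read (rather than collecting them in a list) and log the rejected line numbers for later inspection.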