Tags: csv, import, jena, data-conversion

How can I import a CSV to RDF using Apache Jena?


Is it possible to convert CSV to RDF from the command line using Apache Jena?

Is it possible to supply metadata to guide the conversion?

Example of a naive attempt with riot:

./apache-jena-3.3.0/bin/riot --base='http://example.com/csvtest/' --syntax=csv --output=ttl csv_dbs_examples/csv_inputs/CDs.csv 
java.lang.NullPointerException
        at org.apache.jena.ext.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:770)
        at org.apache.jena.ext.com.google.common.cache.LocalCache.get(LocalCache.java:4153)
        at org.apache.jena.ext.com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:5060)
        at org.apache.jena.atlas.lib.cache.CacheGuava.getOrFill(CacheGuava.java:58)
        at org.apache.jena.riot.system.IRIResolver$IRIResolverNormal.resolveSilentCache(IRIResolver.java:470)
        at org.apache.jena.riot.system.IRIResolver$IRIResolverNormal.resolveSilent(IRIResolver.java:454)
        at org.apache.jena.riot.system.IRIResolver.resolve(IRIResolver.java:328)
        at org.apache.jena.riot.system.IRIResolver$IRIResolverSync.resolve(IRIResolver.java:489)
        at org.apache.jena.riot.system.IRIResolver.resolveIRI(IRIResolver.java:254)
        at org.apache.jena.riot.system.IRIResolver.resolveString(IRIResolver.java:233)
        at org.apache.jena.riot.lang.ReaderRIOTCSV.parse(ReaderRIOTCSV.java:89)
        at org.apache.jena.riot.lang.ReaderRIOTCSV.read(ReaderRIOTCSV.java:67)
        at org.apache.jena.riot.RDFParser.read(RDFParser.java:293)
        at org.apache.jena.riot.RDFParser.parseNotUri(RDFParser.java:283)
        at org.apache.jena.riot.RDFParser.parse(RDFParser.java:233)
        at riotcmd.CmdLangParse.parseRIOT(CmdLangParse.java:286)
        at riotcmd.CmdLangParse.parseFile(CmdLangParse.java:216)
        at riotcmd.CmdLangParse.exec$(CmdLangParse.java:161)
        at riotcmd.CmdLangParse.exec(CmdLangParse.java:127)
        at jena.cmd.CmdMain.mainMethod(CmdMain.java:93)
        at jena.cmd.CmdMain.mainRun(CmdMain.java:58)
        at jena.cmd.CmdMain.mainRun(CmdMain.java:45)
        at riotcmd.riot.main(riot.java:34)

Solution

  • Jena has its own CSV support, documented at http://jena.apache.org/documentation/csv/ (this is what the command in the question invokes), but it is not CSVW (the W3C standard). There are several CSVW conversion tools: you can use one to convert the CSV to RDF, then read the resulting RDF into Jena.

    The stack trace itself is caused by a bug in Jena 3.3.0 (Apache Jena 3.2.0 should work).

    Update from @GrzegorzWierzowiecki: Confirmed that this looks like a bug in Jena 3.3.0, as it works with Jena 3.1.1.
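
To illustrate the "convert to RDF first, then read the RDF into Jena" route, here is a minimal sketch in plain Python. It is neither Jena's CSV reader nor a CSVW implementation. It assumes a simple mapping chosen only for illustration: each row becomes a blank node, and each column becomes a predicate built from the base IRI plus "#" and the column name. The function names and the sample columns (Title, Artist) are hypothetical, since the contents of CDs.csv are not shown in the question.

```python
import csv
import io
from urllib.parse import quote


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes, and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def csv_to_ntriples(csv_text: str, base: str) -> str:
    """Turn CSV text into N-Triples using an ad hoc mapping (an assumption,
    not a standard): one blank node per row, one predicate per column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row_number, row in enumerate(reader, start=1):
        subject = f"_:row{row_number}"
        for column, value in row.items():
            # Skip surplus cells (column is None) and missing or empty values.
            if column is None or value is None or value == "":
                continue
            predicate = f"<{base}#{quote(column.strip(), safe='')}>"
            lines.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return "\n".join(lines) + ("\n" if lines else "")


if __name__ == "__main__":
    sample = "Title,Artist\nAbbey Road,The Beatles\n"  # hypothetical data
    print(csv_to_ntriples(sample, "http://example.com/csvtest/"), end="")
```

The output is line-based N-Triples, so it can be saved to a file and then read by Jena like any other RDF file. A real CSVW tool would additionally honour a metadata file describing datatypes and column-to-property mappings, which this sketch does not attempt.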