I have a couple of lines of (I think) RDF data:
<http://www.test.com/meta#0001> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class>
<http://www.test.com/meta#0002> <http://www.test.com/meta#CONCEPT_hasType> "BEAR"^^<http://www.w3.org/2001/XMLSchema#string>
Each line has 3 items in it. I want to pull out the item before and after the URL. So that would result in:
0001, type, Class
0002, CONCEPT_hasType, (BEAR, string)
Is there a library out there (Java or Scala) that would do this split for me? Or do I just need to shove string.splits and assumptions into my code?
Most RDF libraries will have something to facilitate this. For example, if you parse your RDF data using Eclipse RDF4J's Rio parser, you will get back each line as an org.eclipse.rdf4j.model.Statement, with a subject, predicate and object value. The subject in both your lines will be an org.eclipse.rdf4j.model.IRI, which has a getLocalName() method you can use to get the part after the last #. See the Javadocs for more details.
Assuming your data is in N-Triples syntax (which it seems to be given the example you showed us), here's a simple bit of code that does this and prints it out to STDOUT:
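import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import org.eclipse.rdf4j.rio.RDFFormat;
import org.eclipse.rdf4j.rio.Rio;

// parse the file into a Model object
InputStream in = new FileInputStream(new File("/path/to/rdf-data.nt"));
org.eclipse.rdf4j.model.Model model = Rio.parse(in, RDFFormat.NTRIPLES);
for (org.eclipse.rdf4j.model.Statement st : model) {
    org.eclipse.rdf4j.model.Resource subject = st.getSubject();
    if (subject instanceof org.eclipse.rdf4j.model.IRI) {
        // an IRI has a local name: the part after the last '#'
        System.out.print(((org.eclipse.rdf4j.model.IRI) subject).getLocalName());
    }
    else {
        // not an IRI (e.g. a blank node), so print its full string value
        System.out.print(subject.stringValue());
    }
    // ... etc for predicate and object (the 2nd and 3rd elements in each RDF statement)
}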
Update: if you don't want to read data from a file but simply use a String, you could just use a java.io.StringReader instead of an InputStream:
StringReader r = new StringReader("<http://www.test.com/meta#0001> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .");
org.eclipse.rdf4j.model.Model model = Rio.parse(r, RDFFormat.NTRIPLES);
Alternatively, if you don't want to parse the data at all and just want to do String processing, there is an org.eclipse.rdf4j.model.util.URIUtil class which you can just feed a string, and it will give you back the index of the local name part:
String uri = "http://www.test.com/meta#0001";
String localpart = uri.substring(URIUtil.getLocalNameIndex(uri)); // will be "0001"
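If you go the pure string-processing route for whole lines, a minimal sketch using only the Java standard library might look like the following. Everything in it is hypothetical and not part of RDF4J: the class name LocalNameSketch, the methods localName and summarize, and the regular expression are illustration only. It assumes each line holds exactly one simple triple with IRIs in angle brackets and an optional typed literal as the object; blank nodes, language tags and escape decoding are not handled, which is exactly where a real parser earns its keep.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not an RDF4J class): plain string processing for simple N-Triples lines.
class LocalNameSketch {

    // <subject> <predicate> followed by either <object IRI> or "label"^^<datatype IRI>,
    // with an optional trailing " ." (the lines in the question omit it).
    private static final Pattern TRIPLE = Pattern.compile(
        "^\\s*<([^>]*)>\\s+<([^>]*)>\\s+"
        + "(?:<([^>]*)>|\"((?:[^\"\\\\]|\\\\.)*)\"(?:\\^\\^<([^>]*)>)?)"
        + "\\s*\\.?\\s*$");

    // Returns the text after the last '#', else after the last '/', else after the last ':'.
    // If none of these occur, the whole string is returned.
    static String localName(String iri) {
        int idx = iri.lastIndexOf('#');
        if (idx < 0) idx = iri.lastIndexOf('/');
        if (idx < 0) idx = iri.lastIndexOf(':');
        return iri.substring(idx + 1);
    }

    // Turns one line into "subject, predicate, object" using local names only.
    static String summarize(String line) {
        Matcher m = TRIPLE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a simple N-Triples line: " + line);
        }
        String subject = localName(m.group(1));
        String predicate = localName(m.group(2));
        String object;
        if (m.group(3) != null) {
            // object is an IRI
            object = localName(m.group(3));
        } else if (m.group(5) != null) {
            // object is a typed literal: (label, datatype local name)
            object = "(" + m.group(4) + ", " + localName(m.group(5)) + ")";
        } else {
            // object is a plain literal without a datatype
            object = m.group(4);
        }
        return subject + ", " + predicate + ", " + object;
    }

    public static void main(String[] args) {
        // prints: 0001, type, Class
        System.out.println(summarize("<http://www.test.com/meta#0001> "
            + "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> "
            + "<http://www.w3.org/2002/07/owl#Class>"));
        // prints: 0002, CONCEPT_hasType, (BEAR, string)
        System.out.println(summarize("<http://www.test.com/meta#0002> "
            + "<http://www.test.com/meta#CONCEPT_hasType> "
            + "\"BEAR\"^^<http://www.w3.org/2001/XMLSchema#string>"));
    }
}
```

This reproduces the output format asked for in the question, but it is only as good as its assumptions about the input, so for anything beyond trivially regular data the parser-based approach above is the safer choice.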
(disclosure: I am on the RDF4J development team)