
Why is conversion from JSON-LD to Turtle in rdflib not working?


I am trying to make a simple script which converts JSON-LD to Turtle, and from Turtle to JSON-LD on demand.

So far I have been using rdflib, and the Turtle -> JSON-LD part works fine: I supply a context header and a Turtle file, and it produces valid JSON-LD.

However, the way back is not working:

from rdflib import Graph

g = Graph()


# Input file name and file format
g.parse(location=r'C:\Users\franken\PycharmProjects\rdf_tools\LD_conversion\input\test_converted.json', format='json-ld')

# Output file name and file format
g.serialize(destination=r'C:\Users\franken\PycharmProjects\rdf_tools\LD_conversion\output\testbacktoturtle.ttl', format='turtle')

So far, pretty simple code, just a direct use of rdflib.

However, this produces the error:

C:\Users\franken\PycharmProjects\rdf_tools\LD_conversion\input\test_converted.json does not look like a valid URI, trying to serialize this will break.
Traceback (most recent call last):
  File "C:\Users\franken\PycharmProjects\rdf_tools\LD_conversion\main.py", line 24, in <module>
    g.parse(location=r'C:\Users\franken\PycharmProjects\rdf_tools\LD_conversion\input\test_converted.json', format='json-ld')
  File "C:\Python311\Lib\site-packages\rdflib\graph.py", line 1470, in parse
    source = create_input_source(
             ^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\rdflib\parser.py", line 416, in create_input_source
    ) = _create_input_source_from_location(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\rdflib\parser.py", line 478, in _create_input_source_from_location
    input_source = URLInputSource(absolute_location, format)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\rdflib\parser.py", line 285, in __init__
    response: addinfourl = _urlopen(req)
                           ^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\rdflib\parser.py", line 272, in _urlopen
    return urlopen(req)
           ^^^^^^^^^^^^
  File "C:\Python311\Lib\urllib\request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\urllib\request.py", line 519, in open
    response = self._open(req, data)
               ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\urllib\request.py", line 541, in _open
    return self._call_chain(self.handle_open, 'unknown',
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\urllib\request.py", line 496, in _call_chain
    result = func(*args)
             ^^^^^^^^^^^
  File "C:\Python311\Lib\urllib\request.py", line 1419, in unknown_open
    raise URLError('unknown url type: %s' % type)
urllib.error.URLError: <urlopen error unknown url type: c>

Process finished with exit code 1

I don't understand why this error occurs. rdflib seems to be trying to open the file path as a URL, which fails. The input data is shown below. Note that it is itself the output of an rdflib conversion of a valid Turtle file to JSON-LD, so all I'm trying to do is complete the round trip:

{
  "@context": {
    "@context": {
      "@bio": "<https://bioschemas.org/>",
      "@ex:": "<http://example.com/ns#>",
      "@owl": "<http://www.w3.org/2002/07/owl#>",
      "@qudt": "<http://qudt.org/schema/qudt/>",
      "@rdf": "<http://www.w3.org/1999/02/22-rdf-syntax-ns#>",
      "@rdfs": "<http://www.w3.org/2000/01/rdf-schema#>",
      "@schema": "<http://schema.org/>",
      "@sd": "<https://w3id.org/okn/o/sd#>",
      "@sh": "<http://www.w3.org/ns/shacl#>",
      "@skos": "<http://www.w3.org/2004/02/skos/core#>",
      "@spe": "<https://openschemas.github.io/spec-container/specifications/>",
      "@xsd": "<http://www.w3.org/2001/XMLSchema#>"
    }
  },
  "@id": "https://www.blabla.com/SUBJECT_AREA",
  "@type": "https://blabla.px_variable.com",
  "http://www.w3.org/2004/02/skos/core#prefLabel": {
    "@language": "en",
    "@value": "SUBJECT AREA"
  },
  "http://www.yourprefix.ch/isDefinedAs": [
    {
      "@language": "en",
      "@value": "A definition of something, in english in this case."
    },
    {
      "@language": "fr",
      "@value": "Une definition de quelquechose"
    },
    {
      "@language": "it",
      "@value": "Una “definizione di qualcosa"
    },
    {
      "@language": "de",
      "@value": "Eine Definition von etwas"
    }
  ]
}

It feels a bit like I'm missing something obvious, or some parameter that stops rdflib from trying to open URLs (after all, a URL should not have to be resolvable for the JSON to be valid). I have tried running the conversion with and without a context, and with different JSON-LD files, but the error is the same every time.
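As a small illustration of where the `unknown url type: c` message in the traceback may come from, the sketch below uses only the standard library to show how `urllib` reads a Windows drive letter as a URL scheme. It does not reproduce rdflib's own location handling, and whether rdflib reaches this code path on a given setup is an assumption based on the traceback alone.

```python
from urllib.parse import urlparse

# The path from the question, unchanged.
windows_path = r"C:\Users\franken\PycharmProjects\rdf_tools\LD_conversion\input\test_converted.json"

# urlparse treats everything before the first ':' as the scheme when it
# consists of valid scheme characters, so the drive letter "C" qualifies.
parsed = urlparse(windows_path)

print(parsed.scheme)  # prints "c" (schemes are lower-cased)
```

Since no URL handler is registered for a scheme called `c`, `urlopen` ends up in `unknown_open`, which matches the last frames of the traceback.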

I'm using rdflib 6.3.2 with Python 3.11 on Windows. Any tips welcome!


Solution

  • I can't reproduce any problem. I'm using Python 3.11 on Windows 11, with rdflib 6.3.2 installed just now.

    This code writes the JSON string to a file:

    json_path=r'C:\Projects\testbacktoturtle.json'
    
    json='''{
      "@context": {
        "@context": {
          "@bio": "<https://bioschemas.org/>",
          "@ex:": "<http://example.com/ns#>",
          "@owl": "<http://www.w3.org/2002/07/owl#>",
          "@qudt": "<http://qudt.org/schema/qudt/>",
          "@rdf": "<http://www.w3.org/1999/02/22-rdf-syntax-ns#>",
          "@rdfs": "<http://www.w3.org/2000/01/rdf-schema#>",
          "@schema": "<http://schema.org/>",
          "@sd": "<https://w3id.org/okn/o/sd#>",
          "@sh": "<http://www.w3.org/ns/shacl#>",
          "@skos": "<http://www.w3.org/2004/02/skos/core#>",
          "@spe": "<https://openschemas.github.io/spec-container/specifications/>",
          "@xsd": "<http://www.w3.org/2001/XMLSchema#>"
        }
      },
      "@id": "https://www.blabla.com/SUBJECT_AREA",
      "@type": "https://blabla.px_variable.com",
      "http://www.w3.org/2004/02/skos/core#prefLabel": {
        "@language": "en",
        "@value": "SUBJECT AREA"
      },
      "http://www.yourprefix.ch/isDefinedAs": [
        {
          "@language": "en",
          "@value": "A definition of something, in english in this case."
        },
        {
          "@language": "fr",
          "@value": "Une definition de quelquechose"
        },
        {
          "@language": "it",
          "@value": "Una “definizione di qualcosa"
        },
        {
          "@language": "de",
          "@value": "Eine Definition von etwas"
        }
      ]
    }'''
    
    
    with open(json_path, "w", encoding="utf-8") as f:
        f.write(json)
    

    This code loads it and converts it successfully:

    
    from rdflib import Graph
    
    g = Graph()
    
    g.parse(location=json_path, format='json-ld')
    
    # Output file name and file format
    
    g.serialize(destination=r'C:\Projects\testbacktoturtle.ttl', format='turtle')
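If the error does appear in some environment, one possible way around it is to avoid handing rdflib a bare Windows path as `location` at all. The sketch below is only that, a sketch: the helper names `windows_path_to_file_uri` and `jsonld_file_to_turtle` are hypothetical, and the approach (reading the file yourself and passing its text through the `data=` argument of `Graph.parse`) is a suggestion, not something tested in the answer above.

```python
from pathlib import Path, PureWindowsPath


def windows_path_to_file_uri(path: str) -> str:
    """Hypothetical helper: turn an absolute Windows path into a file:// URI."""
    return PureWindowsPath(path).as_uri()


def jsonld_file_to_turtle(json_path: str, ttl_path: str) -> None:
    """Hypothetical helper: parse JSON-LD from text, then write Turtle."""
    # Imported here so the URI helper above needs only the standard library.
    from rdflib import Graph

    g = Graph()
    # Reading the file ourselves and passing the text via `data=` means
    # no location string has to be interpreted as a URL.
    g.parse(data=Path(json_path).read_text(encoding="utf-8"), format="json-ld")
    g.serialize(destination=ttl_path, format="turtle")


print(windows_path_to_file_uri(r"C:\Projects\testbacktoturtle.json"))
# prints file:///C:/Projects/testbacktoturtle.json
```

A call such as `jsonld_file_to_turtle(json_path, r'C:\Projects\testbacktoturtle.ttl')` would then replace the `g.parse(location=...)` step. Passing the `file://` URI as `location` is another option worth trying, on the assumption that rdflib accepts file URIs there.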