
How do I import a JSON file into a JSONiq collection?


I have looked everywhere, and even the JSONiq documentation says "this is beyond the scope of this document." I have a JSON file (an array of JSON objects) I want to import into JSONiq (particularly Zorba, which by the way is a terrible name because it makes Internet searches for information futile) to use as a collection to query. Is there a tutorial, or spec, or anything anywhere that tells me how to do this?


Solution

  • Zorba supports adding documents to a collection. The framework for doing so is documented here. Note, however, that Zorba's store lives in memory and does not persist anything beyond the scope of a single query, so collections are of limited use without a persistence layer.

    If the use case is simply to query a JSON file stored on your local drive, then it may be simpler to use EXPath's file module as well as parse-json, like so:

    jsoniq version "1.0";
    
    import module namespace file = "http://expath.org/ns/file";
    
    let $my-object := parse-json(file:read-text("/path/to/document.json"))
    return $my-object.foo
    

    The above query returns "bar" if /path/to/document.json contains:

    { "foo" : "bar" } 
    

    parse-json also accepts additional options for parsing documents that contain multiple objects (JSON Lines, etc.).
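
    The question mentions a file that holds a single array of objects rather than one object. A minimal sketch of that case follows. It assumes a hypothetical file /path/to/array.json and relies on JSONiq's array unboxing operator [] to turn the array returned by parse-json into a sequence of objects:

    jsoniq version "1.0";
    
    import module namespace file = "http://expath.org/ns/file";
    
    (: /path/to/array.json is a hypothetical file containing one JSON array of objects :)
    for $object in parse-json(file:read-text("/path/to/array.json"))[]
    return $object.foo
    

    If that file contains [ { "foo" : "bar" }, { "foo" : "foobar" } ], this should yield the foo value of each array member.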

    For advanced users, this is how to use collections to avoid reading the file(s) every time:

    jsoniq version "1.0";
    
    import module namespace file = "http://expath.org/ns/file";
    import module namespace ddl = "http://zorba.io/modules/store/dynamic/collections/ddl";
    import module namespace dml = "http://zorba.io/modules/store/dynamic/collections/dml";
    
    (: Populating the collection :)
    variable $my-collection := QName("my-collection");
    ddl:create($my-collection, parse-json(file:read-text("/tmp/doc.json")));
    
    (: And now the query :)
    
    for $object in dml:collection($my-collection)
    group by $value := $object.foo
    return {
      "value" : $value,
      "count" : count($object)
    }
    

    This is /tmp/doc.json:

    { "foo" : "bar" }
    { "foo" : "bar" }
    { "foo" : "foo" }
    { "foo" : "foobar" }
    { "foo" : "foobar" }
    

    And the query above returns (the order of the groups is not guaranteed without an order by clause):

    { "value" : "bar", "count" : 2 }
    { "value" : "foobar", "count" : 2 }
    { "value" : "foo", "count" : 1 }