json xml xsd semantics s-expression

Is the array structure redundant in JSON?


https://www.json.org/ says:

JSON is built on two structures:

A collection of name/value pairs. In various languages, this is realized as an object, record, struct, dictionary, hash table, keyed list, or associative array.

An ordered list of values. In most languages, this is realized as an array, vector, list, or sequence.

My question is:

Why is JSON designed to use two types of data structures instead of one (name/value pairs only)?

For instance,

I use the following two methods to describe the School structure (School contains multiple Students):

1.

{
    "Student" : [
        { "Name" : "Peter", "Sex" : "Male" },
        { "Name" : "Linda", "Sex" : "Female" }
    ]
}

2.

{
    "Student" : {
        "0" : { "Name" : "Peter", "Sex" : "Male" },
        "1" : { "Name" : "Linda", "Sex" : "Female" }
    }
}

Which is better?

I like the second one.

Why?

Because, in my opinion, whether the member "Student" is an array or a map, ordered or unordered, bounded or unbounded, should be defined in its metadata instead of in the instance data.

Both School JSON documents above are instance data! (Note: the symbol '[]' represents "type information", which should be defined in the metadata. It's redundant here...)

When I use an XPath-like syntax to access the "Name" member in the examples above, there is no difference:

  1. School.Student[0].Name <---[0] as array index
  2. School.Student[0].Name <---[0] as map key

The same example in XML can be expressed in only one way:

<School>
    <Student Name="Peter" Sex="Male"/>
    <Student Name="Linda" Sex="Female"/>
</School>

I have heard people claim that XML is redundant because of its closing tags and attributes (relative to s-expressions).

But I think the redundancy of XML is only in its grammar, while the redundancy of JSON is in its semantics.

Am I right? Thank you very much.


Solution

  • Redundant for what?

    Discussing the theories of data transfer? Then yes, it's redundant as the other structure it supports can represent arrays.
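To illustrate the "can represent arrays" point, here is a minimal Python sketch of round-tripping a list through an index-keyed object. The helper names `array_to_indexed_object` and `indexed_object_to_array` are hypothetical and exist only for this illustration; the sample data is taken from the question.

```python
import json


def array_to_indexed_object(items):
    """Encode a list as an object keyed by "0", "1", ... (JSON object keys are strings)."""
    return {str(i): item for i, item in enumerate(items)}


def indexed_object_to_array(obj):
    """Decode an index-keyed object back into a list, ordered by numeric key.

    Assumes every key is a decimal integer string; int() raises ValueError otherwise.
    """
    return [obj[key] for key in sorted(obj, key=int)]


students = [
    {"Name": "Peter", "Sex": "Male"},
    {"Name": "Linda", "Sex": "Female"},
]

# Serialise using only name/value pairs, then recover the original list.
encoded = json.dumps({"Student": array_to_indexed_object(students)})
decoded = indexed_object_to_array(json.loads(encoded)["Student"])
round_trip_ok = decoded == students
```

Note that the decoder has to sort keys numerically: sorting them as strings would put "10" before "2", which is one of the extra rules an array syntax makes unnecessary.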

    Balancing the concerns of sparseness, representational integrity, simplicity for human readers and writers, simplicity for software parsers and generators, speed of parsing and generating, simplicity of mapping to data structures and simplicity of validation (that is to say, the actual design goals)? Then no. Being able to encode arrays directly has benefits for sparseness, simplicity for humans, simplicity of validation (if it's a valid array in syntax it's a valid array, which doesn't necessarily follow for maps, since they could omit indices unless a separate application-specific rule handles that) and speed of producing (if something maps increasing indices to elements, then we potentially have to catch someone mixing them up). What is a redundancy in the sense of analysing what can be represented is not redundant in the sense of offering no practical benefit.

    In the sense of "beyond what we need or want", redundant has a rather pejorative connotation, while in other senses redundancy can be a positive thing (having redundancy gives you room to deal with loss, the ability to catch errors and so on, in different cases). The redundancy here is of the second sort: we could use a version of JSON that had no arrays, so we don't strictly need them, but life is a lot easier because we do have them, and almost nobody is going to go to the effort of producing a horrible mapping of ascending integers to elements just to make life harder for those parsing it.
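The validation point can be sketched in Python as follows. This is only an illustration under stated assumptions: `missing_indices` is a hypothetical application-specific check, not part of any JSON library, and the sample documents are shortened variants of the ones in the question.

```python
import json


def missing_indices(indexed):
    """Return the indices absent from an object that is meant to act as an array.

    Assumes keys are decimal integer strings; int() raises ValueError otherwise.
    """
    keys = {int(k) for k in indexed}
    expected = range(max(keys) + 1) if keys else range(0)
    return [i for i in expected if i not in keys]


# With array syntax, a successful parse already guarantees a gap-free sequence.
as_array = json.loads('{"Student": [{"Name": "Peter"}, {"Name": "Linda"}]}')
array_length = len(as_array["Student"])

# With index-keyed objects, this document is valid JSON but skips index 1,
# so the application needs its own rule to notice the gap.
as_map = json.loads('{"Student": {"0": {"Name": "Peter"}, "2": {"Name": "Linda"}}}')
gaps = missing_indices(as_map["Student"])
```

Here the parser accepts both documents, but only the second one needs the extra pass to find out whether it really describes a sequence.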