XML element names


I need to redefine an XML document and schema for my company. The document in question is split into a number of sections that each contain information about a medication, for example:

<dosage>overview of dose info
   <elderly>doses for elderly patients</elderly>
   <children>doses for children</children>
</dosage>
<administration>info about administering the med...</administration>

I strongly believe that the element names should be changed to reflect what the element is, e.g. <section> with an attribute describing the content: <section displayName='dosage'>. Not all of my colleagues agree.

Is my thinking correct, and can anyone provide guiding principles for element nomenclature that they have found useful in practice?


Solution

  • Consider the case of elderly and children. A tag should describe what its content is -- in this case both are dosage instructions specific to a certain type of person. But the names children and elderly don't communicate this -- nothing in the markup relates the two elements to each other. If instead it were <instructions target="elderly">...</instructions>, that relationship is maintained: both are instructions, for different targets.
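
    As a sketch only, the dosage fragment from the question could be rewritten along these lines. The element name instructions and the target attribute come from the suggestion above; using "children" as a second target value is an assumption made for illustration.

    ```xml
    <dosage>overview of dose info
       <!-- One element name for every patient group; the group is data in an attribute -->
       <instructions target="elderly">doses for elderly patients</instructions>
       <instructions target="children">doses for children</instructions>
    </dosage>
    ```

    With this shape, a consumer can select every set of instructions with the XPath dosage/instructions, or one group with dosage/instructions[@target='elderly'], and a new patient group needs no new element name.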

    The dosage and administration sections could both be considered properties of the medication. What you do here depends on the structure of the whole document and how it will be parsed. It seems to me that dosage is very distinct from administration. If you were defining this as an object in an object-oriented language, you would have:

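    using System.Collections.Generic;

    class Medication
    {
        // Keyed by patient group ("elderly", "children"); a <PersonType, string> dictionary would be preferable
        Dictionary<string, string> dosageInstructions;
        string administrationInfo;
    }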
    

    These are two different properties, and there's no real parallel between them (well, other than that they're both properties of the medication). I don't think it would be useful to abstract that any more than it already is, but it's something that could be argued either way based on the structure of the entire document and how it's going to be used.

    For example, if you are going to print out a list of key-value pairs for a bunch of different properties (say, one key is administration and its value is the info), then a generic section element with a name attribute is the way to go. But dosage has a distinct structure from administration, so I don't think that particular abstraction would be useful here. If every medication has a fixed set of possible properties (dosage, administration info, etc.) that will all be treated differently, then in my opinion it would be logical to use distinct tags for all of them.
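
    For comparison, here is a rough sketch of the generic key-value layout, using the <section displayName='...'> form proposed in the question. The medication wrapper element is a hypothetical name, not something from the original document.

    ```xml
    <medication>
       <!-- Every property looks the same to the parser; the meaning lives in displayName -->
       <section displayName="dosage">overview of dose info</section>
       <section displayName="administration">info about administering the med...</section>
    </medication>
    ```

    This is convenient when the consumer just loops over section elements and prints a name next to a value. The cost is that the per-group dosage instructions have no natural place in it, and a schema can no longer give dosage and administration different content models based on the element name alone.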

    As for general guiding principles, I usually ask "how would I define this document as an object?" and then consider what the XML serialization of that object would be. This works for me because I'm far more used to working with objects, but your mileage may vary. There are certainly cases where it's not the best approach -- for example, if you're truly representing a document, as HTML does, then it's not the way to go. But if you're using XML to define a regular data structure, it should generally work.
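
    Since the question also asks about the schema, here is one possible XSD that follows from treating the medication as an object with a keyed set of dosage instructions and a single administration string. It is a minimal sketch under stated assumptions: the medication root element, the PersonType type name, and the two enumerated values are illustrative, and the free-text dosage overview from the question is left out.

    ```xml
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

      <!-- Assumed set of patient groups; extend the enumeration as needed -->
      <xs:simpleType name="PersonType">
        <xs:restriction base="xs:string">
          <xs:enumeration value="elderly"/>
          <xs:enumeration value="children"/>
        </xs:restriction>
      </xs:simpleType>

      <xs:element name="medication">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="dosage">
              <xs:complexType>
                <xs:sequence>
                  <!-- Text content plus a required target attribute -->
                  <xs:element name="instructions" maxOccurs="unbounded">
                    <xs:complexType>
                      <xs:simpleContent>
                        <xs:extension base="xs:string">
                          <xs:attribute name="target" type="PersonType" use="required"/>
                        </xs:extension>
                      </xs:simpleContent>
                    </xs:complexType>
                  </xs:element>
                </xs:sequence>
              </xs:complexType>
              <!-- Dictionary-like behaviour: at most one instructions element per target -->
              <xs:unique name="oneInstructionPerTarget">
                <xs:selector xpath="instructions"/>
                <xs:field xpath="@target"/>
              </xs:unique>
            </xs:element>
            <xs:element name="administration" type="xs:string"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>

    </xs:schema>
    ```

    The distinct dosage and administration elements each get their own content model, while the patient groups, which share a structure, are expressed as attribute values that the schema can enumerate and keep unique.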