
Reading an SBML file with xml2 in R using XPath


I am really new to XML and I am trying to read an SBML file using the xml2 package in R.

The demo SBML file is taken from the SBML main page.

I am confused about how to search for a node using XPath.

E.g., I tried

test <- read_xml("./scratch.xml")
xml_children(test)[1]
xml_attr(xml_children(test)[1], "name")

This works and gives me "EnzymaticReaction" as the answer. However, I don't want to access nodes by index but by name, so I tried

xml_find_one(test, ".//model")

which gives me the error

Error: No matches

Can anyone tell me what I am doing wrong in my XPath call? The SBML file is pasted below.

Thanks!

<?xml version="1.0" encoding="UTF-8"?>
<sbml level="2" version="3" xmlns="http://www.sbml.org/sbml/level2/version3">
    <model name="EnzymaticReaction">
        <listOfUnitDefinitions>
            <unitDefinition id="per_second">
                <listOfUnits>
                    <unit kind="second" exponent="-1"/>
                </listOfUnits>
            </unitDefinition>
            <unitDefinition id="litre_per_mole_per_second">
                <listOfUnits>
                    <unit kind="mole"   exponent="-1"/>
                    <unit kind="litre"  exponent="1"/>
                    <unit kind="second" exponent="-1"/>
                </listOfUnits>
            </unitDefinition>
        </listOfUnitDefinitions>
        <listOfCompartments>
            <compartment id="cytosol" size="1e-14"/>
        </listOfCompartments>
        <listOfSpecies>
            <species compartment="cytosol" id="ES" initialAmount="0"     name="ES"/>
            <species compartment="cytosol" id="P"  initialAmount="0"     name="P"/>
            <species compartment="cytosol" id="S"  initialAmount="1e-20" name="S"/>
            <species compartment="cytosol" id="E"  initialAmount="5e-21" name="E"/>
        </listOfSpecies>
        <listOfReactions>
            <reaction id="veq">
                <listOfReactants>
                    <speciesReference species="E"/>
                    <speciesReference species="S"/>
                </listOfReactants>
                <listOfProducts>
                    <speciesReference species="ES"/>
                </listOfProducts>
                <kineticLaw>
                    <math xmlns="http://www.w3.org/1998/Math/MathML">
                        <apply>
                            <times/>
                            <ci>cytosol</ci>
                            <apply>
                                <minus/>
                                <apply>
                                    <times/>
                                    <ci>kon</ci>
                                    <ci>E</ci>
                                    <ci>S</ci>
                                </apply>
                                <apply>
                                    <times/>
                                    <ci>koff</ci>
                                    <ci>ES</ci>
                                </apply>
                            </apply>
                        </apply>
                    </math>
                    <listOfParameters>
                        <parameter id="kon"  value="1000000" units="litre_per_mole_per_second"/>
                        <parameter id="koff" value="0.2"     units="per_second"/>
                    </listOfParameters>
                </kineticLaw>
            </reaction>
            <reaction id="vcat" reversible="false">
                <listOfReactants>
                    <speciesReference species="ES"/>
                </listOfReactants>
                <listOfProducts>
                    <speciesReference species="E"/>
                    <speciesReference species="P"/>
                </listOfProducts>
                <kineticLaw>
                    <math xmlns="http://www.w3.org/1998/Math/MathML">
                        <apply>
                            <times/>
                            <ci>cytosol</ci>
                            <ci>kcat</ci>
                            <ci>ES</ci>
                        </apply>
                    </math>
                    <listOfParameters>
                        <parameter id="kcat" value="0.1" units="per_second"/>
                    </listOfParameters>
                </kineticLaw>
            </reaction>
        </listOfReactions>
    </model>
</sbml>

Solution

  • In your input document, there is a default namespace:

    <sbml xmlns="http://www.sbml.org/sbml/level2/version3">
    

    that applies to every element that does not declare a namespace of its own (the math elements declare the MathML namespace themselves). An XPath expression like

    //model
    

    looks for an element named model that is in no namespace, but in your document every model element is in the SBML namespace, so nothing matches.

    I am not familiar with R, so I can only suggest something that is more of a workaround than an answer. The workaround is to not directly mention the name of the element, but use an XPath expression like

    //*[local-name() = 'model']
    

    But ignoring namespaces is not as good as explicitly mentioning them in your code.
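
    As a minimal sketch of this workaround in R: the inline document below is a trimmed-down stand-in for the file in the question, and the variable names doc, model and species are only illustrative. It also assumes a version of xml2 that provides xml_find_first(), which newer releases offer in place of xml_find_one().

    ```r
    library(xml2)

    # Trimmed-down stand-in for the SBML file from the question (assumption:
    # only the parts needed to show the namespace behaviour are kept).
    doc <- read_xml('<sbml xmlns="http://www.sbml.org/sbml/level2/version3" level="2" version="3">
      <model name="EnzymaticReaction">
        <listOfSpecies>
          <species id="ES" name="ES"/>
          <species id="P" name="P"/>
        </listOfSpecies>
      </model>
    </sbml>')

    # An unprefixed name only matches elements in no namespace,
    # so this node set is empty.
    length(xml_find_all(doc, "//model"))

    # local-name() compares the element name without its namespace.
    model <- xml_find_first(doc, "//*[local-name() = 'model']")
    xml_attr(model, "name")

    # The same pattern works for any other element name.
    species <- xml_find_all(doc, "//*[local-name() = 'species']")
    xml_attr(species, "id")
    ```

    Note that local-name() would also match an element called model in any other namespace, which is why this remains a workaround.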


    Meanwhile, I've read about this here...

    The real solution would be to use a method to declare the namespace URI from your input document in your R code, and use a prefix in the XPath expression. I think that the right way to do it would be

    ns <- xml_ns_rename(xml_ns(test), d1 = "sbml")
    # newer xml2 releases provide xml_find_first() in place of xml_find_one()
    xml_find_first(test, "/sbml:sbml/sbml:model", ns)
    

    Renaming is not strictly necessary, but it makes the expression easier to read. xml2 names the default namespaces it finds in a document d1, d2 and so on, so without renaming the expression would be /d1:sbml/d1:model.
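
    To make the namespace-aware approach concrete, here is a hedged sketch that can be run without the original file. The inline document is a shortened stand-in for the one in the question, the prefix mathml and the variable names are my own choices, and xml_find_first() is used on the assumption that a newer xml2 release is installed.

    ```r
    library(xml2)

    # Shortened stand-in for the SBML file (assumption: one reaction is enough
    # to show both the SBML and the MathML namespace).
    doc <- read_xml('<sbml xmlns="http://www.sbml.org/sbml/level2/version3" level="2" version="3">
      <model name="EnzymaticReaction">
        <listOfReactions>
          <reaction id="vcat" reversible="false">
            <kineticLaw>
              <math xmlns="http://www.w3.org/1998/Math/MathML">
                <apply><times/><ci>cytosol</ci><ci>kcat</ci><ci>ES</ci></apply>
              </math>
              <listOfParameters>
                <parameter id="kcat" value="0.1" units="per_second"/>
              </listOfParameters>
            </kineticLaw>
          </reaction>
        </listOfReactions>
      </model>
    </sbml>')

    # Show the prefixes xml2 generated for the default namespaces.
    xml_ns(doc)

    # Give the generated prefixes readable names for use in XPath.
    ns <- xml_ns_rename(xml_ns(doc), d1 = "sbml", d2 = "mathml")

    model <- xml_find_first(doc, "/sbml:sbml/sbml:model", ns)
    xml_attr(model, "name")

    # The MathML elements sit in their own namespace, hence the second prefix.
    symbols <- xml_text(xml_find_all(doc, "//mathml:ci", ns))
    symbols

    # listOfParameters is a child of kineticLaw, so it is back in the SBML namespace.
    params <- xml_find_all(doc, "//sbml:parameter", ns)
    xml_attr(params, "id")
    ```

    Every step of a path needs its prefix: an expression such as /sbml:sbml/model would again look for a model element in no namespace and find nothing.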