xpath rdf python-3.5 rdf-xml

How to limit the scope of element extraction to the start and end tags of a particular XML element using XPath in Python?


I have an RDF/XML document and would like to find all the elements between the start and end tags of a particular element. How can I do that?

For example:

<cim:BaseVoltage rdf:ID="_0526B48408F744919E7C03672FCD0D71">       
<cim:BaseVoltage.isDC>false</cim:BaseVoltage.isDC>  
<cim:BaseVoltage.nominalVoltage>400.000000000</cim:BaseVoltage.nominalVoltage>    
</cim:BaseVoltage>

I would like to extract the values of BaseVoltage.isDC and BaseVoltage.nominalVoltage, since they sit between the start and end tags of <cim:BaseVoltage>. This is just an example; I have many more such start and end tags.

I thought of doing it with XPath, but I am not sure how.


Solution

  • Parsing the XML file with XPath turned out to be a poor fit for this problem. rdflib makes it very easy.

    from rdflib import Graph
    from rdflib.namespace import Namespace

    # Base URI against which the rdf:ID values in the file are resolved
    BASE = Namespace('http://example.org/')

    graph = Graph()
    graph.parse('rdf.xml', format='xml', publicID=BASE)

    # Slicing the graph by subject yields every (predicate, object) pair for it
    for p, o in graph[BASE['#_0526B48408F744919E7C03672FCD0D71']]:
        print(p, o)
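
If rdflib is not an option, the scoping asked about in the question can also be sketched with the standard library's ElementTree, which supports a limited XPath subset. This is only a rough sketch, not a replacement for a real RDF parser. The wrapping rdf:RDF root element, the CIM namespace URI `http://example.org/cim#`, and the helper name `children_of` are assumptions made up for illustration; use the namespace your file actually declares.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: placeholder CIM namespace URI; use the one declared in your file.
CIM_NS = "http://example.org/cim#"
NS = {"rdf": RDF_NS, "cim": CIM_NS}

# Assumption: the element from the question wrapped in an rdf:RDF root.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:cim="http://example.org/cim#">
  <cim:BaseVoltage rdf:ID="_0526B48408F744919E7C03672FCD0D71">
    <cim:BaseVoltage.isDC>false</cim:BaseVoltage.isDC>
    <cim:BaseVoltage.nominalVoltage>400.000000000</cim:BaseVoltage.nominalVoltage>
  </cim:BaseVoltage>
</rdf:RDF>"""


def children_of(root, tag, rdf_id):
    """Return {local child tag: text} for the `tag` element with the given rdf:ID."""
    for elem in root.findall(tag, NS):
        if elem.get("{%s}ID" % RDF_NS) == rdf_id:
            # Iterating an element yields only its direct children, which is
            # what limits the scope to this one start/end tag pair.
            return {
                child.tag.split("}", 1)[1]: (child.text or "").strip()
                for child in elem
            }
    return {}


root = ET.fromstring(SAMPLE)
values = children_of(root, "cim:BaseVoltage", "_0526B48408F744919E7C03672FCD0D71")
print(values)
```

Unlike the rdflib version, this treats the file as plain XML, so it only sees the nesting and knows nothing about RDF semantics such as resolving rdf:ID against a base URI.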