I have an XML file whose structure is similar to the following:
<?xml version="1.0" encoding="UTF-8"?>
<drugbank xmlns="http://www.drugbank.ca" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.drugbank.ca http://www.drugbank.ca/docs/drugbank.xsd" version="5.0" exported-on="2017-12-20">
<drug type="biotech" created="2005-06-13" updated="2017-11-06">
<drugbank-id primary="true">DB00001</drugbank-id>
<drugbank-id>BTD00024</drugbank-id>
<drugbank-id>BIOD00024</drugbank-id>
<cas-number>138068-37-8</cas-number>
<name>Lepirudin</name>
</drug>
<drug type="biotech" created="2005-06-13" updated="2017-11-06">
<drugbank-id primary="true">DB00045</drugbank-id>
<drugbank-id>BTD00054</drugbank-id>
<drugbank-id>BIOD00054</drugbank-id>
<cas-number>205923-56-4</cas-number>
<name>Lyme disease vaccine (recombinant OspA)</name>
</drug>
</drugbank>
I am trying to use the cElementTree module in Python 3. I would like to extract the name of each drug in this XML, for which I have written the following code:
import xml.etree.cElementTree as ET
tree = ET.parse('fulldatabase.xml')
drugbank = tree.getroot()
print(drugbank.tag)
for drug in drugbank:
    print(drug.find('name').text)
The error I get is AttributeError: 'NoneType' object has no attribute 'text'
I have also tried checking this, but the answer the OP wrote there did not work for me. Is there any way to get the name and cas-number fields out of each drug? I have tried some combinations, such as removing findall() from the for loop condition, but that did not work either.
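Do you need anything besides the name? If not, this will do it. You're not accounting for the XML namespace declared in the <drugbank xmlns="http://www.drugbank.ca" portion of the file. Because of that default namespace, every element's tag is stored as {http://www.drugbank.ca}name rather than name, so find('name') matches nothing and returns None. Use the qualified tag instead: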
for name in drugbank.iter('{http://www.drugbank.ca}name'):
    print(name.text)
Lepirudin
Lyme disease vaccine (recombinant OspA)
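To see why the bare lookup fails, it can help to print the tags as ElementTree actually stores them. The following is only a sketch: it assumes a trimmed inline copy of the sample XML (the SAMPLE string is made up for illustration) rather than fulldatabase.xml, and it imports xml.etree.ElementTree because the cElementTree alias was removed in Python 3.9.

```python
import xml.etree.ElementTree as ET

# Trimmed inline copy of the sample from the question (illustrative only).
SAMPLE = """<drugbank xmlns="http://www.drugbank.ca" version="5.0">
  <drug type="biotech">
    <drugbank-id primary="true">DB00001</drugbank-id>
    <cas-number>138068-37-8</cas-number>
    <name>Lepirudin</name>
  </drug>
</drugbank>"""

root = ET.fromstring(SAMPLE)
first_drug = root[0]

# Tags are stored in "{namespace-uri}local-name" form.
child_tags = [child.tag for child in first_drug]
print(root.tag)
print(child_tags)

# A bare name matches nothing; the namespace-qualified name does.
bare = first_drug.find('name')
qualified = first_drug.find('{http://www.drugbank.ca}name')
print(bare, qualified.text)
```

Here bare is None, which is exactly what triggers the AttributeError in the question once .text is accessed on it.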
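Here's another way to get the elements you need. It looks only at the direct children of each drug, whereas iter() walks every descendant:
# getchildren() was removed in Python 3.9; iterate over the element directly
for child in drugbank:
    print({'cas-number': child.find('{http://www.drugbank.ca}cas-number').text, 'name': child.find('{http://www.drugbank.ca}name').text})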
{'cas-number': '138068-37-8', 'name': 'Lepirudin'}
{'cas-number': '205923-56-4', 'name': 'Lyme disease vaccine (recombinant OspA)'}
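If repeating the full URI gets noisy, find(), findall() and findtext() also accept a prefix-to-URI mapping. Below is a rough sketch under the same assumptions as the sample XML. The prefix db, the NS dict, the SAMPLE string and the extract_drugs helper are all names invented for this example. findtext() is used so that a drug with a missing field yields None instead of raising AttributeError.

```python
import xml.etree.ElementTree as ET

# 'db' is an arbitrary prefix chosen for this example; only the URI matters.
NS = {'db': 'http://www.drugbank.ca'}

# Trimmed inline copy of the sample from the question (illustrative only).
SAMPLE = """<drugbank xmlns="http://www.drugbank.ca" version="5.0">
  <drug type="biotech">
    <cas-number>138068-37-8</cas-number>
    <name>Lepirudin</name>
  </drug>
  <drug type="biotech">
    <cas-number>205923-56-4</cas-number>
    <name>Lyme disease vaccine (recombinant OspA)</name>
  </drug>
</drugbank>"""


def extract_drugs(root):
    """Return one dict per direct <drug> child with its name and cas-number."""
    rows = []
    for drug in root.findall('db:drug', NS):
        rows.append({
            # findtext() returns the default (None) when the element is absent.
            'name': drug.findtext('db:name', default=None, namespaces=NS),
            'cas-number': drug.findtext('db:cas-number', default=None, namespaces=NS),
        })
    return rows


drugs = extract_drugs(ET.fromstring(SAMPLE))
for row in drugs:
    print(row)
```

For the real file, ET.fromstring(SAMPLE) would be swapped for ET.parse('fulldatabase.xml').getroot().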