I'm new in Python and I have a question. I'm trying to parse this xml (this XML has several information, this is the first data what I need to read):
<![CDATA[<?xml version="1.0" encoding="UTF-8"?><UDSObjectList>
<UDSObject>
<Handle>cr:908715</Handle>
<Attributes>
<Attribute DataType="2002">
<AttrName>ref_num</AttrName>
<AttrValue>497131</AttrValue>
</Attribute>
<Attribute DataType="2002">
<AttrName>support_lev.sym</AttrName>
<AttrValue/>
</Attribute>
<Attribute DataType="2004">
<AttrName>open_date</AttrName>
<AttrValue>1516290907</AttrValue>
</Attribute>
<Attribute DataType="58814636">
<AttrName>agt.id</AttrName>
<AttrValue/>
</Attribute>
<Attribute DataType="2005">
<AttrName>priority</AttrName>
<AttrValue>3</AttrValue>
</Attribute>
<Attribute DataType="2009">
<AttrName>tenant.id</AttrName>
<AttrValue>F3CA8B5A2A456742B21EF8F3B5538623</AttrValue>
</Attribute>
<Attribute DataType="2002">
<AttrName>tenant.name</AttrName>
<AttrValue>Ripley</AttrValue>
</Attribute>
<Attribute DataType="2005">
<AttrName>log_agent</AttrName>
<AttrValue>088966043F4D2944AA90067C52DA454F</AttrValue>
</Attribute>
<Attribute DataType="58826268">
<AttrName>request_by.first_name</AttrName>
<AttrValue/>
</Attribute>
<Attribute DataType="58826268">
<AttrName>request_by.first_name</AttrName>
<AttrValue/>
</Attribute>
<Attribute DataType="2002">
<AttrName>customer.first_name</AttrName>
<AttrValue>Juan Guillermo</AttrValue>
</Attribute>
<Attribute DataType="2002">
<AttrName>customer.last_name</AttrName>
<AttrValue>Mendoza Montero</AttrValue>
</Attribute>
<Attribute DataType="2009">
<AttrName>customer.id</AttrName>
<AttrValue>8C020EBAD32035419D7654CDE510D312</AttrValue>
</Attribute>
<Attribute DataType="2001">
<AttrName>category.id</AttrName>
<AttrValue>1121021012</AttrValue>
</Attribute>
<Attribute DataType="2002">
<AttrName>category.sym</AttrName>
<AttrValue>Ripley.Sistemas Financieros.Terminal Financiero.Mensaje de
Error</AttrValue>
</Attribute>
<Attribute DataType="2002">
<AttrName>status.sym</AttrName>
<AttrValue>Suspended</AttrValue>
</Attribute>
<Attribute DataType="2009">
<AttrName>group.id</AttrName>
<AttrValue>099621F7BD77C545B65FB65BFE466550</AttrValue>
</Attribute>
<Attribute DataType="2002">
<AttrName>group.last_name</AttrName>
<AttrValue>EUS_Zona V Region</AttrValue>
</Attribute>
<Attribute DataType="2001">
<AttrName>zreporting_met.id</AttrName>
<AttrValue>7300</AttrValue>
</Attribute>
<Attribute DataType="2002">
<AttrName>zreporting_met.sym</AttrName>
<AttrValue>E-Mail</AttrValue>
</Attribute>
<Attribute DataType="2002">
<AttrName>assignee.combo_name</AttrName>
<AttrValue/>
</Attribute>
<Attribute DataType="2004">
<AttrName>open_date</AttrName>
<AttrValue>1516290907</AttrValue>
</Attribute>
<Attribute DataType="2004">
<AttrName>close_date</AttrName>
<AttrValue/>
</Attribute>
<Attribute DataType="2002">
<AttrName>description</AttrName>
<AttrValue>Asunto :Valaparaiso / Terminal Financiero Error
Nombre Completo :JUAN MENDOZA MONTERO
Ubicación :CCSS VALPARAISO Plaza victoria 1646, VALPARAISO
País :Chile
Telefono :ANEXO 2541
Correo :jmendozam@ripley.cl
Descripción :Error Terminal Financiero
Descartes :N/A</AttrValue>
</Attribute>
<Attribute DataType="2002">
<AttrName>summary</AttrName>
<AttrValue>Santiago / Modificación </AttrValue>
</Attribute>
</Attributes>
</UDSObject>
but when I read the file with this method:
from zeep import Client
import xml.dom.minidom
from xml.dom.minidom import Node
def select():
resultado = []
sid = _client.service.login("User","password")
objectType = 'cr'
whereClause = "group.last_name LIKE 'EUS_ZONA%' AND open_date > 1517454000
AND open_date <
1519786800"
maxRows = -1
attributes = ["ref_num"
,"agt.id"
,"priority"
,"pcat.id"
,"tenant.id"
,"tenant.name"
,"log_agent"
,"request_by.first_name"
,"request_by.last_name"
,"customer.first_name"
,"customer.last_name"
,"customer.id"
,"category.id"
,"category.sym"
,"status.sym"
,"group.id"
,"group.last_name"
,"zreporting_met.id"
,"zreporting_met.sym"
,"assignee.combo_name"
,"open_date"
,"close_date"
,"description"
,"summary"]
minim = _client.service.doSelect(sid=sid, objectType=objectType,
whereClause=whereClause, maxRows= maxRows, attributes= attributes)
dom = xml.dom.minidom.parseString(minim)
nodeList = dom.getElementsByTagName('AttrValue')
for j in range(len(nodeList)):
resultado.append(dom.getElementsByTagName('AttrValue')[j].firstChild.wholeText)
print(resultado[j])
logout = _client.service.logout(sid)
This only print the first AttrValue (ref_num value), what I need to do is add every field of the XML file in resultado array, I need help to print every field from the XML file, someone can help me to that?
Please read and follow How to create a Minimal, Complete, and Verifiable example. You should remove all the server stuff and reduce the size of your sample data.
This snippet follows your code and gets all attribute
elements and then iterates those:
import xml.dom.minidom
from xml.dom.minidom import Node
minim = """<?xml version="1.0" encoding="UTF-8"?>
<udsobjectlist>
<udsobject>
<handle>cr:908715</handle>
<attributes>
<attribute datatype="2002">
<attrname>ref_num</attrname>
<attrvalue>497131</attrvalue>
</attribute>
<attribute datatype="2002">
<attrname>support_lev.sym</attrname>
<attrvalue/>
</attribute>
<attribute datatype="2004">
<attrname>open_date</attrname>
<attrvalue>1516290907</attrvalue>
</attribute>
</attributes>
</udsobject>
</udsobjectlist>
"""
dom = xml.dom.minidom.parseString(minim)
nodeList = dom.getElementsByTagName('attribute')
resultado = []
attributes = ["attrname", "attrvalue"]
for node in nodeList:
a = []
for attribute in attributes:
try:
a.append( node.getElementsByTagName(attribute)[0].firstChild.wholeText)
except AttributeError:
a.append("")
resultado.append(a)
print(resultado)
prints
[['ref_num', '497131'], ['support_lev.sym', ''], ['open_date', '1516290907']]
Even closer to your code:
nodeList = dom.getElementsByTagName('attrvalue')
for node in nodeList:
try:
v = node.firstChild.wholeText
resultado.append(v)
print(v)
except:
pass
print(resultado)
prints
497131
1516290907
['497131', '1516290907']
As suggested in the comments, with ET (although you probably should not access elements by index, but this might get you started):
import xml.etree.ElementTree as ET
root = ET.fromstring(minim)
for child in root[0][1]:
try:
print(child[0].text)
print(child[1].text)
except:
pass
prints
ref_num
497131
support_lev.sym
None
open_date
1516290907