I have an XML file that lists speakers:
<speakerlist>
  <speaker>
    <title>Dr.</title>
    <firstname>Bernd</firstname>
    <lastname>Baumann</lastname>
  </speaker>
  <speaker id="11003218">
    <firstname>Karsten</firstname>
    <lastname>Schneider</lastname>
    <info>(Erfurt)</info>
  </speaker>
  ...
</speakerlist>
Some of the speaker fields are always given (firstname, lastname) while others are optional (title, info). I want to extract the names with the additional info in a straightforward way.
Just the name is easy, using BeautifulSoup:
[speaker.find("firstname").text + " " + speaker.find("lastname").text for speaker in speakerlist.find_all("speaker")]
But how can I prepend the title if it exists? I tried
[
speaker.find("title").text + " " + speaker.find("firstname").text + " " + speaker.find("lastname").text
if speaker.find("title").text is not None
else speaker.find("firstname").text + " " + speaker.find("lastname").text
for speaker in speakerlist.find_all("speaker")
]
but this throws
'NoneType' object has no attribute 'text'
when the title element does not exist. I understand why this happens, but I don't see a workaround.
Is there a nice, concise one-liner to extract the information I want?
I see no good reason to try to squeeze this into a "one-liner" list comprehension.
def format_speaker_record(speaker):
    # find() returns None when the optional <title> element is missing
    title_tag = speaker.find("title")
    title = title_tag.text if title_tag is not None else None
    firstname = speaker.find("firstname").text
    lastname = speaker.find("lastname").text
    if title:
        return f"{title} {firstname} {lastname}"
    return f"{firstname} {lastname}"


speakers = [format_speaker_record(speaker) for speaker in speakerlist.find_all("speaker")]
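If you also want the optional info appended, one possible generalization is to loop over the field names and keep whichever ones are present. The following is only a sketch: the helper name format_speaker and the FIELDS tuple are my own invention, and the sample data is a well-formed version of the XML from the question. To keep it runnable without extra dependencies it parses with the standard-library xml.etree.ElementTree rather than BeautifulSoup, but the helper only relies on .find() returning None for a missing child, which BeautifulSoup tags do as well.

```python
import xml.etree.ElementTree as ET

# Sample data modelled on the question (assumed well-formed).
XML = """
<speakerlist>
  <speaker>
    <title>Dr.</title>
    <firstname>Bernd</firstname>
    <lastname>Baumann</lastname>
  </speaker>
  <speaker id="11003218">
    <firstname>Karsten</firstname>
    <lastname>Schneider</lastname>
    <info>(Erfurt)</info>
  </speaker>
</speakerlist>
"""

# Hypothetical field order: optional title first, optional info last.
FIELDS = ("title", "firstname", "lastname", "info")


def format_speaker(speaker, fields=FIELDS):
    """Join the text of whichever of the given child elements exist."""
    tags = (speaker.find(name) for name in fields)
    # Skip missing elements (None) and elements without any text.
    return " ".join(tag.text.strip() for tag in tags if tag is not None and tag.text)


speakerlist = ET.fromstring(XML)
speakers = [format_speaker(s) for s in speakerlist.findall("speaker")]
print(speakers)  # ['Dr. Bernd Baumann', 'Karsten Schneider (Erfurt)']
```

With BeautifulSoup you would presumably keep your existing speakerlist.find_all("speaker") loop and pass each tag to the same helper. The explicit "is not None" check matters here because an ElementTree element without children is falsy, so a plain truthiness test would wrongly skip it.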