So I have this DOM-like tree that I'm trying to convert to markdown. For example, it can look like this:
[
    {
        'type': 'header',
        'attr': {
            'size': 2
        },
        'children': [
            'A header, ',
            {
                'type': 'link',
                'attr': {
                    'url': 'https://www.google.com'
                },
                'children': 'a link inside a header'
            }
        ]
    },
    'some more text'
]
and the output I want is
## A header, [a link inside a header](https://www.google.com)
some more text
I've tried
def genMd(tree):
    md_string = ''
    for element in tree:
        if type(element) == str:
            md_string += element
        elif type(element) == dict:
            if element['type'] == 'header':
                md_string += '{} {}\n'.format('#' * element['attr']['size'], genMd(element['children']))
            elif element['type'] == 'link':
                md_string += '[{}]({})'.format(genMd(element['children']), element['attr']['url'])
            # I would add more if statements here for the other cases
    return md_string
which works, but it seems very inefficient and I would end up having tons of if statements. I've also tried this
def genMd(tree):
    MD_TABLE = {
        'header': '\'{} {}\\n\'.format(\'#\' * element[\'attr\'][\'size\'], genMd(element[\'children\']))',
        'link': '\'[{}]({})\'.format(genMd(element[\'children\']), element[\'attr\'][\'url\'])'
        # More entries here for the other cases
    }
    md_string = ''
    for element in tree:
        if type(element) == str:
            md_string += element
        elif type(element) == dict:
            md_string += eval(MD_TABLE[element['type']])
    return md_string
and it also works but it still feels wrong.
TL;DR: using if statements just feels wrong, is there a better way to do it?
Another approach could consist of using a generator function to traverse the DOM tree while keeping separate functions to handle the specific formatting of various types:
def markdown(d):
    # One small formatter per node type; each yields the rendered string.
    def m_header(a):
        yield f"{'#'*a['attr']['size']} " + ''.join(markdown(a['children']))
    def m_link(a):
        yield f'[{"".join(markdown(a["children"]))}]({a["attr"]["url"]})'
    # Dispatch table: node type -> formatter
    types = {'header': m_header, 'link': m_link}
    # Accept either a single node/string or a list of them
    for i in ([d] if not isinstance(d, list) else d):
        if not isinstance(i, dict):
            yield i
        else:
            yield from types[i['type']](i)

dom = [{'type': 'header', 'attr': {'size': 2}, 'children': ['A header, ', {'type': 'link', 'attr': {'url': 'https://www.google.com'}, 'children': 'a link inside a header'}]}, 'some more text']
print('\n'.join(markdown(dom)))
Output:
## A header, [a link inside a header](https://www.google.com)
some more text
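A couple observations:
- By using a generator with str.join, you don't need to continuously concatenate strings via +=, resulting in cleaner code and increased efficiency.
- Mapping each type to its own function in a dictionary avoids the need for eval.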
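If the list of node types keeps growing, one way to keep the dispatch table tidy is to register each formatter with a small decorator. Below is a minimal sketch of that idea, not the only way to do it. The names `HANDLERS`, `handler`, and `render` are hypothetical, and the `bold` node type is an assumed extra case that does not appear in the original tree.

```python
# Registry mapping node type -> formatter function (hypothetical names).
HANDLERS = {}

def handler(node_type):
    """Register the decorated function as the formatter for node_type."""
    def register(func):
        HANDLERS[node_type] = func
        return func
    return register

def render(node):
    """Render a string, a node dict, or a list of either to markdown."""
    if isinstance(node, str):
        return node
    if isinstance(node, list):
        # str.join instead of repeated += concatenation
        return ''.join(render(child) for child in node)
    return HANDLERS[node['type']](node)

@handler('header')
def render_header(node):
    return '{} {}\n'.format('#' * node['attr']['size'], render(node['children']))

@handler('link')
def render_link(node):
    return '[{}]({})'.format(render(node['children']), node['attr']['url'])

@handler('bold')  # assumed extra node type, not part of the original example
def render_bold(node):
    return '**{}**'.format(render(node['children']))

dom = [
    {'type': 'header', 'attr': {'size': 2}, 'children': [
        'A header, ',
        {'type': 'link', 'attr': {'url': 'https://www.google.com'},
         'children': 'a link inside a header'},
    ]},
    'some more text',
]
print(render(dom))
```

Adding a new node type then only means writing one more decorated function; `render` itself never changes, and an unknown type fails loudly with a `KeyError`.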