
How to convert JSON object to markdown using pypandoc without writing to file?


I am trying to take a JSON object returned by an API and convert it to Markdown so that it is easier to read. I am attempting to use pypandoc for this, and I am having a very hard time figuring out how to get it to work.

I would expect the following to work based on the pandoc documentation, but the pypandoc docs and examples are scarce.

from ipwhois import IPWhois
import pypandoc

obj = IPWhois('74.125.225.229')
results = obj.lookup_rdap(depth=1)
print(pypandoc.convert_text(results, 'json', 'md'))



---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-6-6a5e630f5495> in <module>()
      4 obj = IPWhois('74.125.225.229')
      5 results = obj.lookup_rdap(depth=1)
----> 6 print(pypandoc.convert_text(results, 'json', 'md'))

C:\ProgramData\Anaconda3\lib\site-packages\pypandoc\__init__.py in convert_text(source, to, format, extra_args, encoding, outputfile, filters)
    101     source = _as_unicode(source, encoding)
    102     return _convert_input(source, format, 'string', to, extra_args=extra_args,
--> 103                           outputfile=outputfile, filters=filters)
    104 
    105 

C:\ProgramData\Anaconda3\lib\site-packages\pypandoc\__init__.py in _convert_input(source, format, input_type, to, extra_args, outputfile, filters)
    303 
    304     try:
--> 305         source = cast_bytes(source, encoding='utf-8')
    306     except (UnicodeDecodeError, UnicodeEncodeError):
    307         # assume that it is already a utf-8 encoded string

C:\ProgramData\Anaconda3\lib\site-packages\pypandoc\py3compat.py in cast_bytes(s, encoding)
     37     # bytes == str on py2.7 -> always encode on py2
     38     if not isinstance(s, bytes):
---> 39         return _encode(s, encoding)
     40     return s
     41 

C:\ProgramData\Anaconda3\lib\site-packages\pypandoc\py3compat.py in _encode(u, encoding)
     25 def _encode(u, encoding=None):
     26     encoding = encoding or _DEFAULT_ENCODING
---> 27     return u.encode(encoding)
     28 
     29 

AttributeError: 'dict' object has no attribute 'encode'

If anyone knows a better way to convert this without writing it to a file, I would appreciate it if you shared it.

Version that writes to and reads from a file:

from ipwhois import IPWhois
from pprint import pprint
import json, pypandoc

obj = IPWhois('74.125.225.229')
results = obj.lookup_rdap(depth=1)
pprint(results)
with open('data.json', 'w') as outfile:
    json.dump(results, outfile)
output = pypandoc.convert_file('data.json', to='json', format='md', outputfile='data.md')

Solution

  • Comment: I should be able to do this in memory

    Please show your code that does it to/from a file.
    It should be possible to do it in memory using a StringIO object.

    results is of type dict; convert it to a str with json.dumps(...), which should give the same input as reading from the file:

    print(pypandoc.convert_text(json.dumps(results), 'json', 'md'))
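
To illustrate why json.dumps removes the AttributeError, here is a minimal sketch that uses only the standard library. The `results` dict is a made-up stand-in for what lookup_rdap() returns, not real RDAP data. Note also that, going by the signature shown in the traceback, `convert_text(source, to, format, ...)`, the second positional argument is the output format and the third is the input format.

```python
import json

# Hypothetical stand-in for the dict returned by lookup_rdap(); values are made up.
results = {"asn": "15169", "network": {"name": "EXAMPLE-NET"}}

# A dict has no .encode() method, which is what the traceback trips over.
print(hasattr(results, "encode"))

# json.dumps() serializes the dict to a str, and a str can be encoded to bytes.
payload = json.dumps(results)
print(type(payload).__name__)
encoded = payload.encode("utf-8")
```

This only gets past the encoding step; it does not by itself make the output more readable.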
    

    Question: I would expect the following to work based on the documentation from pandoc
    ... convert that to ... in a more readable format.

    You have to convert it yourself, either to Markdown or to HTML.
    It takes the same effort as the following example:

    Convert JSON to PDF with Python and xtopdf

    This recipe shows the basic steps needed to convert JSON input to PDF output using Python and xtopdf,
    a PDF creation toolkit. xtopdf is itself written in Python and uses the ReportLab toolkit internally.
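
As one possible way to do the conversion yourself, the sketch below renders a nested dict as a nested Markdown bullet list in memory, with no file involved. The function names (`to_markdown_lines`, `dict_to_markdown`) and the sample data are hypothetical and chosen only for illustration; the layout (bold keys, four-space nesting) is just one choice among many.

```python
def to_markdown_lines(value, depth=0):
    """Return Markdown bullet lines for a nested dict/list structure."""
    pad = "    " * depth  # four spaces per nesting level
    lines = []
    if isinstance(value, dict):
        for key, item in value.items():
            if isinstance(item, (dict, list)):
                # Container value: emit the key, then recurse one level deeper.
                lines.append(f"{pad}- **{key}**:")
                lines.extend(to_markdown_lines(item, depth + 1))
            else:
                lines.append(f"{pad}- **{key}**: {item}")
    elif isinstance(value, list):
        for item in value:
            if isinstance(item, (dict, list)):
                # Nested container inside a list: empty bullet, then its children.
                lines.append(f"{pad}-")
                lines.extend(to_markdown_lines(item, depth + 1))
            else:
                lines.append(f"{pad}- {item}")
    else:
        lines.append(f"{pad}- {value}")
    return lines


def dict_to_markdown(data):
    """Join the bullet lines into a single Markdown string."""
    return "\n".join(to_markdown_lines(data))


# Made-up sample shaped loosely like an API response.
sample = {"asn": "15169", "network": {"name": "EXAMPLE-NET", "cidr": ["192.0.2.0/24"]}}
markdown = dict_to_markdown(sample)
print(markdown)
```

The resulting string is ordinary Markdown, so it could then be handed to pypandoc.convert_text with 'md' as the input format if HTML or another output is wanted.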