json python-3.x pandas tqdm

tqdm progress bar stuck when used with JSON strings


I have a list of JSON strings, and I'm converting them to a list of dicts.

I do this to combine them into one final JSON string, which is later converted to a pandas DataFrame:

import json
import pandas

s1 = '{ "id": 11, "label": "REF", "claim": "Lorelai Gilmore", "ce": [[[1,2, "Gilmore", 3]]]}'
s2 = '{ "id": 0, "label": "REF", "claim": "named Robert s.", "ce": [[[1,2, "Lorelai", 3]]]}'
s = [s1, s2]

combine = [json.loads(item) for item in s]

r = json.dumps(combine, indent=2)
s = pandas.read_json(r)
print(s)

The list of JSON strings that I have is very large, so I tried to use a tqdm progress bar to monitor the progress:

combine = tqdm([json.loads(item) for item in s])

but I got this error:

  0%|          | 0/2 [00:00<?, ?it/s]Traceback (most recent call last):
  File "D:/OneDrive/PhD/fever_challenge/test.py", line 11, in <module>
    r = json.dumps(combine, indent=2)
  File "C:\Python35\lib\json\__init__.py", line 237, in dumps
    **kw).encode(obj)
  File "C:\Python35\lib\json\encoder.py", line 200, in encode
    chunks = list(chunks)
  File "C:\Python35\lib\json\encoder.py", line 436, in _iterencode
    o = _default(o)
  File "C:\Python35\lib\json\encoder.py", line 179, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError:   0%|          | 0/2 [00:00<?, ?it/s] is not JSON serializable

While tracing the error, I removed the last three lines of my code and saw that the progress bar appears to be stuck. This is all that was printed:

  0%|          | 0/2 [00:00<?, ?it/s]

What is the problem in my code?


Solution

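  • tqdm(...) returns a progress-bar wrapper around the iterable, not a list, and the bar only advances while something iterates over it. That is why json.dumps cannot serialize combine and why the bar stays at 0%. Iterate over the tqdm object and collect the items instead: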

    combine = []
    for i in tqdm([json.loads(item) for item in s]):
        combine.append(i)
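
A possible variation, offered as a minimal sketch rather than the definitive fix: in the snippet above, the inner list comprehension parses every string before tqdm sees the result, so the bar only measures the appends. Wrapping the raw strings instead should let the bar advance once per json.loads call. The name `strings` is introduced here for illustration (the question calls this list `s`), and the sample data is copied from the question.

```python
import json

from tqdm import tqdm

s1 = '{ "id": 11, "label": "REF", "claim": "Lorelai Gilmore", "ce": [[[1,2, "Gilmore", 3]]]}'
s2 = '{ "id": 0, "label": "REF", "claim": "named Robert s.", "ce": [[[1,2, "Lorelai", 3]]]}'
strings = [s1, s2]  # hypothetical name; the question uses `s`

# Wrap the input list, not the parsed result: the comprehension iterates
# over the tqdm object, so the bar advances as each string is parsed,
# and `combine` ends up as a plain list that json.dumps can serialize.
combine = [json.loads(item) for item in tqdm(strings)]

r = json.dumps(combine, indent=2)
```

From here, `r` can be handed to pandas as in the question. Because `strings` is a list, tqdm can read its length and show a total; for a generator with no length, passing `total=` to tqdm would be needed to get a percentage.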