ast.literal_eval not working as part of list comprehension (when reading a file)


I am trying to parse a file that contains pairs of lines separated by blank lines. Each line represents a list whose items are integers or nested lists. Example data from the file:

[[[6,10],[4,3,[4]]]]
[[4,3,[[4,9,9,7]]]]

[[6,[[3,10],[],[],2,10],[[6,8,4,2]]],[]]
[[6,[],[2,[6,2],5]]]

I am trying to read the file into a list of tuples of data-structures (nested lists) with the following statement:

with open("filename","r") as fp:
    pairs = [tuple(ast.literal_eval(l.strip()) for l in lines.split("\n")) for lines in fp.read().split("\n\n")]

This failed with the stack trace below, which led me to believe that the data was corrupt somewhere (unmatched brackets or something similar):

Traceback (most recent call last):
  File "program.py", line 5, in <module>
    pairs = [tuple(ast.literal_eval(l.strip()) for l in lines.split("\n")) for lines in fp.read().split("\n\n")]
  File "program.py", line 5, in <listcomp> 
    pairs = [tuple(ast.literal_eval(l.strip()) for l in lines.split("\n")) for lines in fp.read().split("\n\n")]
  File "program.py", line 5, in <genexpr>  
    pairs = [tuple(ast.literal_eval(l.strip()) for l in lines.split("\n")) for lines in fp.read().split("\n\n")]
  File "C:\Python39\lib\ast.py", line 62, in literal_eval
    node_or_string = parse(node_or_string, mode='eval')
  File "C:\Python39\lib\ast.py", line 50, in parse
    return compile(source, filename, mode, flags,
  File "<unknown>", line 0

SyntaxError: unexpected EOF while parsing

To narrow it down, I split the program into separate steps, and the problem was no longer reproducible. The code below, which first reads the file into a list of tuples of strings and then evaluates the strings with ast.literal_eval, works fine. The all-at-once version above still fails with the same error.

# This works:
with open("filename","r") as fp:
    stringpairs = [tuple(l.strip() for l in lines.split("\n")) for lines in fp.read().split("\n\n")]
pairs = [tuple(ast.literal_eval(pair[i]) for i in range(2)) for pair in stringpairs]

# This still doesn't work:
with open("filename","r") as fp:
    pairs = [tuple(ast.literal_eval(l.strip()) for l in lines.split("\n")) for lines in fp.read().split("\n\n")]


Solution

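  • The problem is that fp.read().split("\n\n") leaves the file's final newline at the end of the last pair of lines.

    Then when you do lines.split('\n') you get 3 strings for the last group, not just 2; the last one is empty, and ast.literal_eval('') raises the SyntaxError shown in the traceback.

    The two-step version only works because range(2) evaluates just the first two strings of each tuple, so the empty third string is never passed to ast.literal_eval.

    So strip this newline off before calling lines.split('\n').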

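    import ast
    with open("filename", "r") as fp:
        # lines.strip() drops the trailing newline left on the last group
        pairs = [tuple(ast.literal_eval(l.strip())
                       for l in lines.strip().split("\n"))
                 for lines in fp.read().split("\n\n")]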
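
    To see the effect without the original file, here is a minimal sketch that reproduces it on an in-memory string. The sample text and the helper name parse_pairs are made up for illustration; the assumption is only that the file ends with a single trailing newline, as most text files do. The helper uses splitlines() and skips blank lines, which is one possible alternative to calling strip() on each group, not the only way to do it.

```python
import ast

# Hypothetical stand-in for fp.read(); note the single trailing newline.
text = "[[1],[2,3]]\n[[4]]\n\n[5,[6]]\n[]\n"

groups = text.split("\n\n")
# The last group still carries the file's final newline ...
last_group = groups[-1]
# ... so splitting it on "\n" yields a third, empty string.
last_lines = last_group.split("\n")
print(repr(last_group))
print(last_lines)


def parse_pairs(text):
    """Parse blank-line-separated groups of literals into a list of tuples.

    Blank lines and empty groups are skipped, so a trailing newline
    (or several) at the end of the input does no harm.
    """
    return [
        tuple(ast.literal_eval(line.strip())
              for line in group.splitlines() if line.strip())
        for group in text.split("\n\n")
        if group.strip()
    ]


pairs = parse_pairs(text)
print(pairs)
```

    With a real file you would call parse_pairs(fp.read()) inside the with block. Filtering out blank lines also copes with extra blank lines at the end of the file, which a single strip() per group would handle as well.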