I'm reading the output of a cat command through a pipe on Linux, using subprocess with stdout=subprocess.PIPE.
Some lines have a bad EOL. The file is huge, so I want to skip such lines and move on to the next one. How can I do this in Python?
PS: I always get:
SyntaxError: EOL while scanning string literal
It seems some socket stopped while writing to that file, because I see really long runs of spaces at the end of the file. I don't want to fix the file, I just want to skip the bad lines.
Here is my code:
import sys, os
import subprocess
import traceback
import re
import ast
try:
    cat = subprocess.Popen(["hadoop", "dfs", "-cat", "PATH TO FILE"], stdout=subprocess.PIPE)
    for data in cat.stdout:
        data = re.sub(' +', ' ', data)
        msg = ast.literal_eval(data)
        if msg['some_string'] == 'some_string':
            print msg['status']
        else:
            continue
except:
    print traceback.format_exc()
    pass
exit()
So the output before the program exits is many empty spaces and ...
^
SyntaxError: EOL while scanning string literal
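Here, try this. Wrap the ast.literal_eval call in its own try/except inside the loop, so that a SyntaxError skips only the bad line instead of ending the whole loop: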
import sys, os
import subprocess
import traceback
import re
import ast
try:
    cat = subprocess.Popen(["hadoop", "dfs", "-cat", "PATH TO FILE"], stdout=subprocess.PIPE)
    for data in cat.stdout:
        # Collapse runs of spaces into a single space
        data = re.sub(' +', ' ', data)
        try:
            # Parsing happens inside the inner try, so a bad line cannot abort the loop
            msg = ast.literal_eval(data)
            if msg['some_string'] == 'some_string':
                print msg['status']
            else:
                continue
        except SyntaxError:
            continue  # skip this line
except:
    # Anything else (e.g. the hadoop command failing to start) is still reported
    print traceback.format_exc()
    pass
exit()
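If you want to try the skipping logic without a Hadoop cluster, here is a minimal sketch of the same idea as a standalone generator. It is written for Python 3 (the code above is Python 2), and the names iter_valid_records and sample are made up for illustration. It also catches ValueError, which ast.literal_eval raises when a line parses but is not a plain literal; that goes beyond what the question asked for, so drop it if you would rather see those errors.

```python
import ast


def iter_valid_records(lines):
    """Yield one dict per parseable line and skip everything else.

    `lines` is any iterable of bytes or str, e.g. a pipe's stdout.
    (Hypothetical helper name, not part of any library.)
    """
    for raw in lines:
        # On Python 3 a subprocess pipe yields bytes; decode leniently
        # so that a damaged byte sequence does not stop the loop.
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8", errors="replace")
        text = raw.strip()
        if not text:
            continue  # blank or whitespace-only line
        try:
            record = ast.literal_eval(text)
        except (SyntaxError, ValueError):
            continue  # truncated or malformed line: skip it
        if isinstance(record, dict):
            yield record


# Made-up sample data standing in for the pipe output.
sample = [
    b"{'some_string': 'some_string', 'status': 'ok'}\n",
    b"{'some_string': 'some_string', 'status': 'trunc        \n",  # cut off mid-string
    b"{'some_string': 'other', 'status': 'ignored'}\n",
    b"      \n",
]

statuses = [
    rec['status']
    for rec in iter_valid_records(sample)
    if rec.get('some_string') == 'some_string'
]
```

With the pipe from the question you would pass cat.stdout as the lines argument. Since the helper is a generator, the file is still processed one line at a time and never loaded into memory as a whole.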
Hope it helps!