I have Python code with this kind of structure:
def main():
    ''' comment '''
    if True:
        print "do"

    print "done"
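This code is not compatible with interactive mode (for example, if I copy/paste it into an interactive session): the interpreter treats an empty line as the end of a block. For this to work, blank lines inside a block would need to be indented like the surrounding code: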
def main():
    ''' comment '''
    if True:
        print "do"
    
    print "done"
otherwise interactive mode fails with indentation errors.
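To illustrate the failure without pasting into a real session, here is a minimal sketch using the standard `code.InteractiveConsole`. It is written for Python 3, and the sample lines are made up for illustration. `push()` returns True while the console is still waiting for more input, so the empty line is where the block gets closed; the following indented line is then compiled on its own.

```python
import code

console = code.InteractiveConsole()

# push() returns True while the console is still waiting for more input.
states = [
    console.push("def main():"),   # block opened
    console.push("    x = 1"),     # still inside the block
    console.push(""),              # empty line: the block is closed and run
]

# The next indented line is now compiled on its own, outside any block.
try:
    compile("    y = 2\n", "<stdin>", "single")
    error = None
except IndentationError as exc:
    error = exc
```

Here `states` ends with False, and `error` holds the IndentationError that the stray indented line triggers.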
Do you know a simple way to transform the code with the generate_tokens / untokenize chain? I am a bit lost in the NL / NEWLINE / INDENT / DEDENT semantics.
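For what it's worth, the four token types can be inspected directly. The sketch below is Python 3 (so `io.StringIO` rather than `StringIO.StringIO`); the sample source and the helper name `token_names` are assumptions made up for illustration. Roughly: NEWLINE ends a logical line of code, NL marks a line break that does not end a statement (such as a blank line), and INDENT / DEDENT are emitted just before the first token of a line whose indentation level changed.

```python
import io
import tokenize

SOURCE = (
    "def main():\n"
    "    if True:\n"
    "        x = 1\n"
    "\n"
    "    y = 2\n"
)


def token_names(source):
    """Return the token type names produced for `source`, in order."""
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type]
            for tok in tokenize.generate_tokens(readline)]


names = token_names(SOURCE)
for name in names:
    print(name)
```

Note that the NL for the blank line comes out before the DEDENT: the tokenizer only learns the new indentation level when it reaches the next line of code, so the indentation for a blank line cannot be decided at the moment its NL token is seen.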
I found this script, "Script to remove Python comments/docstrings". It looks like a perfect fit for my problem, but I cannot get it to produce clean output on complex code.
The best I could come up with (it resolved my issue):
import StringIO
import tokenize

def _python_interactive_indent(self, code):
    prev_toktype = tokenize.INDENT
    first_line = None
    last_lineno = -1
    last_col = 0
    output = ''
    tokgen = tokenize.generate_tokens(StringIO.StringIO(code).readline)
    indent = 0        # current nesting level
    hasNL = False     # a blank line is pending
    prefixed = False  # indentation already written for the current token
    for toktype, ttext, (slineno, scol), (elineno, ecol), ltext in tokgen:
        done = False
        # Track the nesting level.
        if toktype == tokenize.INDENT:
            indent = indent + 1
        if toktype == tokenize.DEDENT:
            indent = indent - 1
        if slineno > last_lineno:
            last_col = 0
        # Hold back blank lines until the next real token is known.
        if not done and toktype == tokenize.NL:
            hasNL = True
            done = True
        # Comments and docstrings are dropped.
        if not done and toktype == tokenize.COMMENT:
            done = True
        if not done and toktype == tokenize.STRING and prev_toktype == tokenize.INDENT:
            done = True
        # Emit the pending blank line, indented to the current level.
        # As written this is one space per level; the unit has to match
        # the indentation width of the source.
        if not done and hasNL and toktype != tokenize.DEDENT and toktype != tokenize.INDENT:
            hasNL = False
            output = output + (" " * indent) + '\n'
            output += " " * indent
            prefixed = True
        if not done:
            # Restore the spacing between tokens on the same line.
            if not prefixed and scol > last_col:
                output += (" " * (scol - last_col))
            output += (ttext)
        prefixed = False
        prev_toktype = toktype
        last_col = ecol
        last_lineno = elineno
    return output
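For comparison, here is a much simpler line-based sketch of the same idea. It is not the tokenize approach above, just an alternative under stated assumptions: it is written for Python 3, the name `pad_blank_lines` and the sample are made up for illustration, and it gives each blank line the indentation of the next non-blank line. Since it does not tokenize, it would also pad blank lines inside triple-quoted strings, and it keeps comments and docstrings, unlike the function above.

```python
def pad_blank_lines(source):
    """Give each blank line the indentation of the next non-blank line."""
    lines = source.split("\n")
    padded = []
    next_indent = ""
    # Walk backwards so the indentation of the following code line is known.
    for line in reversed(lines):
        if line.strip():
            next_indent = line[: len(line) - len(line.lstrip())]
            padded.append(line)
        else:
            padded.append(next_indent)
    return "\n".join(reversed(padded))


SAMPLE = (
    "def main():\n"
    "    if True:\n"
    "        print('do')\n"
    "\n"
    "    print('done')\n"
)
RESULT = pad_blank_lines(SAMPLE)
```

A blank line that is followed by top-level code stays empty, which is what interactive mode needs to close the preceding block.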