Replace an if block with a FunctionDef while keeping the rest of the code untouched


I need to replace an if-else block with a function that encloses the same if-else block, followed by a call to that function. For example:

The following is the code with the if-else condition:

x = 5 # comments to be retained

# above new line as well
if foo() == 'bar':
   y = 10
   print('foo is bar')
else:
   print('foo is not bar')
z =100

I want to change it to this:

x = 5 # comments to be retained

# above new line as well
def if_encapsulated():
   if foo() == 'bar':
      y = 10
      print('foo is bar')
   else:
      print('foo is not bar')
if_encapsulated()
z =100

I am using ast to parse the code and an ast.NodeTransformer to replace the If node with an ast.FunctionDef, but when I regenerate the source with ast.unparse or astor.code_gen.to_source, the original indentation and blank lines are not preserved and the comments are removed. I want to replace only the if block and keep the rest of the code exactly as it is.

Here is the code for parsing and replacing if:

import ast
import astor
class IfReplacer(ast.NodeTransformer):
    def visit_If(self, if_node):
        function_node = ast.FunctionDef(name = "if_encapsulated", args=ast.arguments(posonlyargs=[], args=[], kwonlyargs=[], defaults=[]), body=[if_node], decorator_list=[], returns = None)
        return function_node

code = """

x = 5 # comments to be retained

# above new line as well
if foo() == 'bar':
   y = 10
   print('foo is bar')
else:
   print('foo is not bar')
z =100
"""
tree = ast.parse(code)
IfReplacer().visit(tree)
print(astor.code_gen.to_source(tree))
# ast.unparse(tree) fails on the new FunctionDef with an AttributeError: no attribute 'lineno'

Outputs:

x = 5


def if_encapsulated():
    if foo() == 'bar':
        y = 10
        print('foo is bar')
    else:
        print('foo is not bar')


z = 100

I have tried the asttokens library, which seems designed to solve this kind of issue, but I ran into another problem when using it. I also explored astor further and faced some issues there as well. Please help.


Solution

  • The ast module does not preserve comments or the original formatting. Instead, you can use a library such as libcst, which parses the code into a concrete syntax tree that retains comments, whitespace, and other formatting details:

    import libcst

    def make_updates(node):
        # Only nodes with a statement body (here, the module) are rewritten
        if not hasattr(node, 'body'):
            return node
        body = []
        for stmt in node.body:
            # Wrap only if statements that have an else/elif branch
            if isinstance(stmt, libcst.If) and stmt.orelse is not None:
                # The def takes over the comments and blank lines above the if
                body.append(libcst.FunctionDef(
                    name=libcst.Name(value='if_encapsulated'),
                    params=libcst.Parameters(params=[]),
                    body=libcst.IndentedBlock(body=[stmt.with_changes(leading_lines=[])]),
                    leading_lines=stmt.leading_lines))
                # Call the new function at the position where the if used to run
                body.append(libcst.SimpleStatementLine(body=[
                    libcst.Expr(value=libcst.Call(
                        func=libcst.Name(value='if_encapsulated')))]))
            else:
                body.append(stmt)
        # with_changes returns a copy of the node with the new body
        return node.with_changes(body=body)
    
    code = """
    x = 5 # comments to be retained
    
    # above new line as well
    if foo() == 'bar':
       y = 10
       print('foo is bar')
    else:
       print('foo is not bar')
    z =100
    """
    
    lib_ast = libcst.parse_module(code)
    updated_ast = make_updates(lib_ast)
    print(updated_ast.code)
    

    Output:

    x = 5 # comments to be retained
    
    # above new line as well
    def if_encapsulated():
       if foo() == 'bar':
          y = 10
          print('foo is bar')
       else:
          print('foo is not bar')
    if_encapsulated()
    z =100
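
If adding a dependency is not an option, one standard-library-only alternative is to use ast purely to locate the if statement and then splice the original source text, so everything outside the block is copied through verbatim. This is a minimal sketch, not the method used in the answer above: the helper name encapsulate_ifs and its parameters are hypothetical, it assumes Python 3.8+ (for end_lineno), it handles only top-level if statements that have an else or elif branch, it reuses a single function name for every match, and it would also shift the contents of any multi-line string inside the block.

```python
import ast
import textwrap


def encapsulate_ifs(source, func_name="if_encapsulated", indent="   "):
    """Wrap each top-level if/else in a function by editing the text only.

    Hypothetical helper: ast is used just to find line ranges; the rest of
    the source is never regenerated, so comments and spacing survive.
    """
    lines = source.splitlines(keepends=True)
    tree = ast.parse(source)
    # Go bottom-up so earlier line numbers stay valid after each splice
    for node in reversed(tree.body):
        if isinstance(node, ast.If) and node.orelse:
            start, end = node.lineno - 1, node.end_lineno  # slice bounds
            # Indent the original block text; blank lines are left alone
            block = textwrap.indent("".join(lines[start:end]), indent)
            if not block.endswith("\n"):
                block += "\n"
            lines[start:end] = [
                f"def {func_name}():\n",
                block,
                f"{func_name}()\n",
            ]
    return "".join(lines)


SAMPLE = """\
x = 5 # comments to be retained

# above new line as well
if foo() == 'bar':
   y = 10
   print('foo is bar')
else:
   print('foo is not bar')
z =100
"""

print(encapsulate_ifs(SAMPLE))
```

The indent argument defaults to three spaces only to match the sample; in practice it should be set to whatever indentation the target file uses.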