
Create a hash of a method in python source


I am facing a problem where we need to keep track of certain Python methods in third-party code to see whether they have changed.

We can't hash the entire file, because there may be all sorts of unrelated changes.

I have no problem writing a process which, when called, is given a file path, a class name, and a method name.

I need some pointers on how to read just that method out (obviously the line numbers can't be relied upon) so that I can create a hash and store it.

I cannot seem to find any way to "locate method X in class Y in a Python .py file".

Note that the source being scanned may not even be on the Python path, so I cannot find or analyse the classes from within a running interpreter. I need a function which can analyse the source without importing it (it is a library of files which I have not even added to the path).


Solution

  • Thanks everyone, I explored your options and had success, generally.

    In the end, though, because we are executing in a heavily modified environment (the Odoo framework), I ran into issues such as a modified loader, which complained about the way the module was specced, and so on.

    I had anticipated these problems, which is why I was looking for a solution which did not load the file as a module.

    This is what I ended up with...

    import ast
    import hashlib
    
    def create_hash(self, fname, class_name, method_name):
        source_item = False
        next_item = False
        with open(fname) as f:
            tree = ast.parse(f.read(), filename=fname)
            for item in tree.body:
                if source_item:
                    next_item = item
                    break
                if item.__class__.__name__ == 'ClassDef' and item.name == class_name:
                    for subitem in item.body:
                        if source_item:
                            next_item = subitem
                            break
                        if subitem.__class__.__name__ == 'FunctionDef' and subitem.name == method_name:
                            source_item = subitem
                    if next_item:
                        break
    
        assert source_item, 'Unable to find method %s on %s' % (method_name, class_name)
        from_line = min(
            [source_item.lineno]
            + (hasattr(source_item, 'decorator_list') and [d.lineno for d in source_item.decorator_list] or [])
        )
        to_line = next_item and min(
            [next_item.lineno]
            + (hasattr(next_item, 'decorator_list') and [d.lineno for d in next_item.decorator_list] or [])
        ) - 1 or False
    
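        # Re-read the file as a list of lines so the method can be sliced out.
        with open(fname) as f:
            lines = f.readlines()
        # AST line numbers are 1-based, hence the "- 1" on the start index.
        if to_line:
            code = lines[from_line - 1:to_line]
        else:
            code = lines[from_line - 1:]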
    
        hash_object = hashlib.sha256(bytes(''.join(code), 'utf-8'))
        hexdigest = hash_object.hexdigest()
        return hexdigest
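
    For comparison, here is a shorter sketch of the same idea. It is not the code I run in production; it assumes Python 3.8 or newer, where AST nodes carry an end_lineno attribute, so there is no need to look for the next item to find where the method stops. The name hash_method_source is made up for this illustration.

```python
import ast
import hashlib


def hash_method_source(path, class_name, method_name):
    """Hash the source lines of one method without importing the file."""
    with open(path, encoding="utf-8") as f:
        lines = f.readlines()
    tree = ast.parse("".join(lines), filename=path)

    for cls in tree.body:
        if isinstance(cls, ast.ClassDef) and cls.name == class_name:
            for node in cls.body:
                is_func = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
                if is_func and node.name == method_name:
                    # Start at the first decorator if there is one, so the
                    # result does not depend on which line node.lineno reports.
                    start = min([node.lineno] + [d.lineno for d in node.decorator_list])
                    # end_lineno is only available on Python 3.8 or newer.
                    code = lines[start - 1:node.end_lineno]
                    return hashlib.sha256("".join(code).encode("utf-8")).hexdigest()
    raise LookupError("Unable to find method %s on %s" % (method_name, class_name))
```

    Unlike the version above, this stops at the last line of the method, so blank lines or comments between this method and the next one do not affect the hash.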
    

    Edit:

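    It seems that the function's "lineno" differs between Python versions of the ast module. In older versions it was the line of the first decorator (the min of the def and the decorators), while from Python 3.8 onwards it is the line of the def itself.

    So I have changed the code above to allow for both behaviours, by taking the min of the def line and the decorator lines.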
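
    If the line-number differences are a concern, one alternative (a different technique from the one I used, shown only as a sketch) is to hash the parsed node instead of the raw source lines. ast.dump() leaves out line and column attributes by default, so the hash no longer depends on where the method sits in the file, and comments or whitespace changes are ignored. The name hash_method_ast is made up for this illustration, and I would only compare hashes produced by the same Python version, since the dump format can change between releases.

```python
import ast
import hashlib


def hash_method_ast(source, class_name, method_name):
    """Hash the parsed structure of one method rather than its raw text."""
    tree = ast.parse(source)
    for cls in tree.body:
        if isinstance(cls, ast.ClassDef) and cls.name == class_name:
            for node in cls.body:
                is_func = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
                if is_func and node.name == method_name:
                    # The dump covers decorators, arguments and body, but no
                    # line or column numbers (include_attributes is False).
                    dumped = ast.dump(node)
                    return hashlib.sha256(dumped.encode("utf-8")).hexdigest()
    raise LookupError("Unable to find method %s on %s" % (method_name, class_name))
```

    The trade-off is that a comment-only change inside the method is not detected, which may or may not be what you want when tracking third-party code.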