
How to extract (c)hunk headers only from diff?


The output of diff looks like this:

    @@ -74,6 +73,7 @@ 
    <dependency> <groupId>com.jolbox</groupId> 
    <artifactId>bonecp-test
    - commons</artifactId> 
    + <classifier>${project.classifier}</classifier>     
    <scope>test</scope>

I want a way to get only the numbers from the headers, here 74,6 and 73,7. Any idea how I can achieve that? I am using Python.


Solution

  • You could use a regex to extract what you want:

    ([-\+]?\d+,[-\+]?\d+)
    

    https://regex101.com/r/1DPH5e/2

    If you want to make it more specific in case your diff includes similar patterns:

    @@\s([-\+]?\d+,[-\+]?\d+)\s([-\+]?\d+,[-\+]?\d+)
    

    https://regex101.com/r/3G42Lz/2

    You can then extract the groups in Python with the re module.
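
As a rough sketch of that last step, the second (more specific) pattern could be applied with `re.findall`. The names `HUNK_RE` and `hunk_ranges` are made up for illustration, and stripping the leading sign is an assumption about the output the question is after:

```python
import re

# The more specific pattern from the answer: "@@", then two "start,count" pairs.
HUNK_RE = re.compile(r"@@\s([-\+]?\d+,[-\+]?\d+)\s([-\+]?\d+,[-\+]?\d+)")

def hunk_ranges(diff_text):
    """Return (old_range, new_range) string pairs, one per hunk header found."""
    return [
        # Drop the leading "-" / "+" so only "74,6" style values remain.
        (old.lstrip("-+"), new.lstrip("-+"))
        for old, new in HUNK_RE.findall(diff_text)
    ]

sample = """@@ -74,6 +73,7 @@
 <dependency> <groupId>com.jolbox</groupId>
"""
print(hunk_ranges(sample))  # [('74,6', '73,7')]
```

Note that this pattern requires a comma in both ranges, so a header that omits the line count (for example `@@ -1 +1 @@`, which unified diffs emit for one-line hunks) would not match; making the `,\d+` part optional is one way to handle that if it matters for your input.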