
Is it possible to diff PowerPoint files version-controlled with git?

I have some PowerPoint documents that I keep version-controlled with git. I want to see what the differences are between versions of a file. Text is most important; images and formatting not so much (at least not at this point).


  • I wrote this for use with git on the command-line (requires Python and the python-pptx library):

    Setup -- Add these lines to the following files:
    --- .gitattributes
    *.pptx diff=pptx
    --- .gitconfig (or repo\.git\config, or your_user_home\.gitconfig); change the path to point to your local copy of the script
    [diff "pptx"]
        binary = true
        textconv = python C:/Python27/Scripts/
    Usage:
    git diff your_powerpoint.pptx
    Thanks to the python-pptx docs and this snippet:
    # Written for Python 3.
    import sys
    from pptx import Presentation
    # Map left and right-hand quotes from Unicode to ASCII.
    PUNCTUATION = {0x2018: 0x27, 0x2019: 0x27, 0x201C: 0x22, 0x201D: 0x22}
    if __name__ == '__main__':
        if len(sys.argv) != 2:
            # sys.exit with a string prints it to stderr and exits with status 1.
            sys.exit("Usage: git-pptx-textconv file.pptx")
        path_to_presentation = sys.argv[1]
        prs = Presentation(path_to_presentation)
        for slide in prs.slides:
            for shape in slide.shapes:
                # Skip shapes that cannot contain text (pictures, connectors, etc.).
                if not shape.has_text_frame:
                    continue
                for paragraph in shape.text_frame.paragraphs:
                    par_text = ''
                    for run in paragraph.runs:
                        s = run.text
                        # Escape backslashes, then flatten line breaks and tabs
                        # so that each paragraph prints as a single line.
                        s = s.replace("\\", "\\\\")
                        s = s.replace("\n", " ")
                        s = s.replace("\r", " ")
                        s = s.replace("\t", " ")
                        # Convert left and right-hand quotes from Unicode to ASCII.
                        s = s.translate(PUNCTUATION)
                        if s:
                            par_text += s
                    # Write UTF-8 bytes so non-ASCII text does not depend on the console encoding.
                    sys.stdout.buffer.write((par_text + "\n").encode("utf-8"))
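
The per-run cleanup in the script can also be pulled out into a standalone function, which makes it easy to check without a .pptx file at hand. This is only a minimal sketch, assuming Python 3. The names `normalize_run_text` and `QUOTES` are hypothetical and not part of the original script, and the function covers only the whitespace and quote steps.

```python
# Hypothetical helper (not part of the original script): normalizes the text
# of a single run so that each paragraph prints as one diff-friendly line.

# Curly single and double quotes mapped to their ASCII equivalents.
QUOTES = {0x2018: 0x27, 0x2019: 0x27, 0x201C: 0x22, 0x201D: 0x22}


def normalize_run_text(s):
    """Replace newlines, carriage returns and tabs with spaces,
    then convert curly quotes to ASCII quotes."""
    for ch in ("\n", "\r", "\t"):
        s = s.replace(ch, " ")
    return s.translate(QUOTES)


print(normalize_run_text("\u201cHello\u201d\tworld\nit\u2019s"))
# prints: "Hello" world it's
```

Keeping this step separate from the python-pptx loop means the text rules can be adjusted, for example to map more punctuation, without touching the code that walks slides and shapes.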