
Putting a regex that makes git diff split words at punctuation into a .gitconfig file


current setup

My .gitconfig currently includes this alias:

[alias]
    wdiff = diff --color-words --histogram

to let me write git wdiff and get word-by-word rather than line-by-line diff output. I use this for writing scholarly prose in LaTeX.

goal

This method divides words only at white space. I would like to divide at punctuation marks as well, so that, for example, changing "last word of sentence." to "last word of sentence.\footnote{New footnote.}" produces diff output that looks something like this:

last word of sentence.{+\footnote{New footnote.}+}

rather than the current output:

last word of [-sentence.-]{+sentence.\footnote{New footnote.}+}

(where [-...-] marks a deletion and {+...+} marks an addition).
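To make the difference concrete, here is a rough Python sketch of the two ways of splitting the example sentence. The pattern is a simplified stand-in for the word regex, not the one git receives (Python's re module has no [:space:] class), and the names PUNCT, WORD_RE, and tokens are made up for this illustration.

```python
import re

# Simplified stand-in for the word regex: either a run of characters that are
# neither white space nor punctuation, or a single punctuation character.
PUNCT = r".,;:?!{}()\[\]\\"
WORD_RE = re.compile(rf"[^\s{PUNCT}]+|[{PUNCT}]")


def tokens(text: str) -> list[str]:
    """Split text into the 'words' a punctuation-aware word diff would compare."""
    return WORD_RE.findall(text)


old = "last word of sentence."
new = r"last word of sentence.\footnote{New footnote.}"

# Splitting at white space only: the last word differs between the two
# versions, so it is reported as deleted and re-added.
print(old.split()[-1], "vs", new.split()[3])

# Splitting at punctuation too: every old token is unchanged, and the new
# text only appends tokens after the final period.
print(tokens(old))
print(tokens(new)[len(tokens(old)):])
```

With punctuation treated as its own word, the old sentence is a prefix of the new one, which is why only the footnote would show up as an addition.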

attempted solution

I found this other question that begins with a regex that does exactly what I want in the command line, but I haven't figured out how to put this in my .gitconfig file without producing the error message fatal: bad config line 12 in file /Users/alex/.gitconfig. This is what I put in my .gitconfig file:

[alias]
    wdiff = diff --color-words='[^][<>()\{},.;:?/|\\=+*&^%$#@!~`"'\''[:space:]]+|[][<>(){},.;:?/|\\=+*&^%$#@!~`"'\'']' --histogram

The problem seems to be the semicolon.

A different question that deals with a similar problem in .gitconfig suggested putting double-quotes around an entire alias. But when I do that in my case, I get the same error message. I think this is because the regex also includes double-quotes.

question

How can I put the regex into my .gitconfig file such that it can be properly parsed?


Solution

  • I was confused as well until I found this page of documentation. The part you are interested in is:

    A line that defines a value can be continued to the next line by ending it with a \; the backslash and the end-of-line are stripped. Leading whitespaces after name =, the remainder of the line after the first comment character # or ;, and trailing whitespaces of the line are discarded unless they are enclosed in double quotes. Internal whitespaces within the value are retained verbatim.

    Inside double quotes, double quote " and backslash \ characters must be escaped: use \" for " and \\ for \.

    The following escape sequences (beside \" and \\) are recognized: \n for newline character (NL), \t for horizontal tabulation (HT, TAB) and \b for backspace (BS). Other char escape sequences (including octal escape sequences) are invalid.

    So, here is the correct alias in .git/config (the same line works in ~/.gitconfig):

    wdiff = "diff --color-words='[^][<>()\\{},.;:?/|\\\\=+*&^%$#@!~`\"'\\''[:space:]]+|[][<>(){},.;:?/|\\\\=+*&^%$#@!~`\"'\\'']' --histogram"
    

    In this case you just need to enclose everything in double quotes, so that the ; is no longer read as the start of a comment, and escape both " and backslashes.
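The escaping can also be done mechanically instead of by hand. Below is a minimal sketch in Python that applies only the two rules quoted above (\\ for \ and \" for ") and wraps the result in double quotes. The helper name quote_for_gitconfig is hypothetical, and the sketch assumes the value contains no newlines or tabs, which would need the \n and \t escapes.

```python
def quote_for_gitconfig(value: str) -> str:
    """Wrap a value in double quotes for a git config file.

    Hypothetical helper: handles only backslashes and double quotes.
    """
    # Escape backslashes first, so the backslashes added for the quotes
    # in the next step are not doubled again.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


# The alias exactly as it works on the command line (raw string, so Python
# leaves the backslashes alone).
alias = r"""diff --color-words='[^][<>()\{},.;:?/|\\=+*&^%$#@!~`"'\''[:space:]]+|[][<>(){},.;:?/|\\=+*&^%$#@!~`"'\'']' --histogram"""

# Prints a line that can be pasted under [alias] in the config file.
print("    wdiff = " + quote_for_gitconfig(alias))
```

Running this should reproduce the wdiff line shown above: every backslash is doubled and every " gains a leading backslash.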