language-agnostic, awk, code-golf

Automatically sum numeric columns and print total


Given the output of git ... --stat:

 3 files changed, 72 insertions(+), 21 deletions(-)
 3 files changed, 27 insertions(+), 4 deletions(-)
 4 files changed, 164 insertions(+), 0 deletions(-)
 9 files changed, 395 insertions(+), 0 deletions(-)
 1 files changed, 3 insertions(+), 2 deletions(-)
 1 files changed, 1 insertions(+), 1 deletions(-)
 2 files changed, 57 insertions(+), 0 deletions(-)
 10 files changed, 189 insertions(+), 230 deletions(-)
 3 files changed, 111 insertions(+), 0 deletions(-)
 8 files changed, 61 insertions(+), 80 deletions(-)

I wanted to sum the numeric columns while preserving the format of the line. For the sake of generality, I wrote this awk script, which automatically sums any numeric columns and prints a summary line:

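{
    for (i = 1; i <= NF; ++i) {
        # A field counts as numeric if it converts to a nonzero number;
        # skipping zeros is harmless because they add nothing to the sum.
        if ($i + 0 != 0) {
            numeric[i] = 1;
            total[i] += $i;
        }
    }
}
END {
    # Re-use the non-numeric columns of the last line; this relies on
    # awk keeping the last record's fields and NF available in END.
    for (i = 1; i <= NF; ++i) {
        if (numeric[i])
            $i = total[i]
    }
    print
}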

Yielding:

 44 files changed, 1080 insertions(+), 338 deletions(-)

Awk has several features that simplify the problem: automatic string-to-number conversion, arrays that are all associative, and the ability to overwrite the automatically split fields and then print the rebuilt line.
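
For comparison, here is a rough Python sketch of the same column-wise approach, with the steps awk performs implicitly written out. The names `sum_numeric_columns` and `SAMPLE` are made up for this illustration, and the numeric test is stricter than awk's: only fields that `int()` accepts are summed, so a field like `3abc` is left alone.

```python
SAMPLE = """\
 3 files changed, 72 insertions(+), 21 deletions(-)
 3 files changed, 27 insertions(+), 4 deletions(-)
 4 files changed, 164 insertions(+), 0 deletions(-)
 9 files changed, 395 insertions(+), 0 deletions(-)
 1 files changed, 3 insertions(+), 2 deletions(-)
 1 files changed, 1 insertions(+), 1 deletions(-)
 2 files changed, 57 insertions(+), 0 deletions(-)
 10 files changed, 189 insertions(+), 230 deletions(-)
 3 files changed, 111 insertions(+), 0 deletions(-)
 8 files changed, 61 insertions(+), 80 deletions(-)
"""


def sum_numeric_columns(lines):
    """Sum every integer column; keep the other columns of the last line."""
    totals = {}        # column index -> running sum (awk's total[i])
    last_fields = []   # fields of the last non-empty line (awk's $1..$NF in END)
    for line in lines:
        fields = line.split()          # awk splits on whitespace automatically
        if not fields:
            continue
        last_fields = fields
        for i, field in enumerate(fields):
            try:
                value = int(field)     # awk converts strings to numbers implicitly
            except ValueError:
                continue               # non-numeric column: leave it alone
            totals[i] = totals.get(i, 0) + value
    # Overwrite the numeric columns of the last line and re-join with single
    # spaces, which is what awk does when a field is assigned and $0 is rebuilt.
    return " ".join(str(totals.get(i, field)) for i, field in enumerate(last_fields))


if __name__ == "__main__":
    print(sum_numeric_columns(SAMPLE.splitlines()))
```

The explicit `split`, `int` and `join` calls correspond to what awk does implicitly, which is largely why the awk version is shorter.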

Is there a better language for this hack?


Solution

  • Perl - 47 char

    Inspired by ChristopheD's awk solution. Used with the -an command-line switch. 43 chars + 4 chars for the command-line switch:

    $i-=@a=map{($b[$i++]+=$_)||$_}@F}{print"@a"
    

    I can get it to 45 (41 + 4 for the -ap switch) with a little bit of cheating, where Ctrl-M stands for a single literal carriage-return character:

    $i=0;$_="Ctrl-M@{[map{($b[$i++]+=$_)||$_}@F]}"

    Older, hash-based 66 char solution:

    @a=(),s#(\d+)(\D+)#$b{$a[@a]=$2}+=$1#gefor<>;print map$b{$_}.$_,@a
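
The idea behind the hash-based version is to pair each run of digits with the non-digit text that follows it and to total the numbers per label. The following is a loose, ungolfed Python sketch of that idea, not a translation of the Perl; the names `PAIR` and `sum_by_label` are invented for the illustration.

```python
import re
from collections import defaultdict

# A run of digits followed by the run of non-digits that labels it.
PAIR = re.compile(r"(\d+)(\D+)")


def sum_by_label(lines):
    """Total each number under the text that follows it; print in last-line order."""
    totals = defaultdict(int)   # label text -> running sum (the Perl hash)
    labels = []                 # labels of the most recent matching line, in order
    for line in lines:
        pairs = PAIR.findall(line)
        if not pairs:
            continue            # assumption: lines without numbers are ignored
        labels = [label for _, label in pairs]
        for number, label in pairs:
            totals[label] += int(number)
    # Emit each total immediately followed by its label.
    return "".join(f"{totals[label]}{label}" for label in labels)


if __name__ == "__main__":
    demo = [
        " 3 files changed, 72 insertions(+), 21 deletions(-)",
        " 1 files changed, 3 insertions(+), 2 deletions(-)",
    ]
    print(sum_by_label(demo))
```

Because totals are keyed by the label text rather than by column position, lines whose labels differ (for example a singular "file changed") would be totalled separately.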