
Is there a standard way to diff du outputs to detect where disk space usage has grown the most?


I work with a small team of developers, and we share a Unix file system to store somewhat large datasets. The file system has a fairly restrictive quota, so about once a month we have to figure out where our free space has gone and see what we can recover.

Obviously we use du a fair amount, but this is still a tedious process. I had the thought that we could keep last month's du output around and compare it to this month's to see where we've had the most growth. My guess is that this plan isn't very original.

With this in mind I am asking if there are any scripts out there that already do this.

Thanks.


Solution

  • I don't know whether there is a standard way, but I needed this some time ago and wrote a small Perl script to handle it. Here is the relevant part of my code:

    #!/usr/bin/perl
    use strict;
    use warnings;
    
    my $FileName = "du-previous";   # snapshot saved by the previous run
    my $Location = ".";             # directory to scan (assumed to be the current directory; adjust as needed)
    my %Sizes;                      # path => size, later path => size difference
    
    # Current +++++++++++++++++++++++++++++
    my $Current = `du "$Location"`;
    die "du produced no output for $Location\n" unless defined $Current && length $Current;
    for (split /\n/, $Current) {
        if (/^([0-9]+)[ \t]+(.*)$/) {
            $Sizes{$2} = $1;
        }
    }
    
    # Previous ++++++++++++++++++++++++++++
    # On the first run there is no snapshot yet, so every path is reported as growth.
    if (open(my $Previous, '<', $FileName)) {
        while (<$Previous>) {
            chomp;
            if (/^([0-9]+)[ \t]+(.*)$/) {
                # Paths that no longer exist show up as negative differences.
                $Sizes{$2} = ($Sizes{$2} // 0) - $1;
            }
        }
        close($Previous);
    }
    
    # Show result +++++++++++++++++++++++++
    # Largest growth first; unchanged paths are skipped.
    for my $key (sort { $Sizes{$b} <=> $Sizes{$a} } keys %Sizes) {
        next if $Sizes{$key} == 0;
        printf("%-10d %s\n", $Sizes{$key}, $key);
    }
    
    # Save current ++++++++++++++++++++++++
    open(my $Out, '>', $FileName) or die "Cannot write $FileName: $!\n";
    print $Out $Current;
    close($Out);
    

    The code is not very error-tolerant, so you may want to adjust it.

    Basically, the script gets the current disk usage, compares each size with the value from the last run (saved in 'du-previous'), prints the differences, and saves the current usage for the next run.

    If you like it, take it.

    Hope this helps.
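
To try the comparison step on its own, here is a minimal sketch that works on two du snapshots held in strings instead of calling du. The function names parse_du and du_delta and the sample paths are made up for this example; they are not part of the script above. It assumes Perl 5.10 or later for the // operator.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse du-style text ("<size><whitespace><path>" per line) into path => size.
sub parse_du {
    my ($text) = @_;
    my %size;
    for my $line (split /\n/, $text) {
        $size{$2} = $1 if $line =~ /^([0-9]+)[ \t]+(.*)$/;
    }
    return %size;
}

# Return path => (current - previous) for every path whose size changed.
# A path missing from one snapshot is treated as size 0 in that snapshot.
sub du_delta {
    my ($previous_text, $current_text) = @_;
    my %previous = parse_du($previous_text);
    my %current  = parse_du($current_text);
    my %delta;
    for my $path (keys %previous, keys %current) {
        my $diff = ($current{$path} // 0) - ($previous{$path} // 0);
        $delta{$path} = $diff if $diff != 0;
    }
    return %delta;
}

# Hypothetical sample snapshots (sizes in du blocks).
my $previous = "100\t./data/raw\n50\t./data/tmp\n150\t./data\n";
my $current  = "400\t./data/raw\n50\t./data/cache\n450\t./data\n";

# Print the biggest growth first; ties are broken by path name.
my %delta = du_delta($previous, $current);
for my $path (sort { $delta{$b} <=> $delta{$a} || $a cmp $b } keys %delta) {
    printf("%+d\t%s\n", $delta{$path}, $path);
}
```

Directories that were removed since the previous snapshot come out with a negative difference, and new ones with their full size, so both ends of the sorted list are worth a look.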