
How to get the basename in -exec of find?


I cannot get the following piece of script (which is part of a larger backup script) to work correctly:

BACKUPDIR=/BACKUP/db01/physical/incremental # Backups base directory
FULLBACKUPDIR=$BACKUPDIR/full # Full backups directory
INCRBACKUPDIR=$BACKUPDIR/incr # Incremental backups directory
KEEP=5 # Number of full backups (and its incrementals) to keep
...
FIRST_DELETE=`expr $KEEP + 1` # add one to the number of backups to keep, this will be the first deleted
FILE0=`ls -ltr $FULLBACKUPDIR | awk '{print $9}' | tail -$FIRST_DELETE | head -1` # search for the first backup to be deleted
...
find $FULLBACKUPDIR -maxdepth 1 -type d ! -newer $FULLBACKUPDIR/$FILE0 -execdir echo "removing: "$FULLBACKUPDIR/$(basename {}) \; -execdir bash -c 'rm -rf $FULLBACKUPDIR/$(basename {})' \; -execdir echo "removing: "$INCRBACKUPDIR/$(basename {}) \; -execdir bash -c 'rm -rf $INCRBACKUPDIR/$(basename {})' \;

The find itself works correctly; on its own it outputs something like this:

/BACKUPS/db01/physical/incremental/full/2013-08-12_17-51-28
/BACKUPS/db01/physical/incremental/full/2013-08-12_17-51-28
/BACKUPS/db01/physical/incremental/full/2013-08-12_17-25-07

What I want is the -exec to echo a line showing what is being removed and then remove the folder from both directories.

I've tried various ways to get just the basename but nothing seems to be working. I get this:

removing: /BACKUPS/mysql/physical/incremental/full/"/BACKUPS/mysql/physical/incremental/full/2013-08-12_17-51-28"
removing: /BACKUPS/mysql/physical/incremental/incr/"/BACKUPS/mysql/physical/incremental/full/2013-08-12_17-51-28"
removing: /BACKUPS/mysql/physical/incremental/full/"/BACKUPS/mysql/physical/incremental/full/2013-08-12_17-25-07"

And of course the folders aren't deleted because those paths don't exist; rm just fails silently because of the -f option. If I remove the -f, I get a 'cannot be found' error on each rm.

How do I accomplish this? Because backups and parts of backups may be stored across different storage systems, I really need to be able to get just the folder name for use in any known path.


Solution

  • A lot is broken here:

    • All-caps variable names are by convention used for environment variables; scripts should use lowercase names for their own variables.
    • Legacy backticks are used instead of $().
    • The output of ls is parsed (!)
    • The output of ls -l is parsed (!!!)
    • Variables known to contain paths are expanded without full quoting.
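
    As for why the echo parts of the original command print the full path after the prefix: $(basename {}) is expanded by the calling shell before find ever runs. basename receives the literal string {} and returns it unchanged, and find later substitutes the whole path for it. A minimal sketch of that effect, assuming GNU find (which substitutes {} inside a longer argument); the scratch directory and /some/prefix are made up for illustration:

    ```shell
    #!/usr/bin/env bash
    # Scratch directory so nothing real is touched (illustrative only).
    tmp=$(mktemp -d)
    mkdir "$tmp/2013-08-12_17-51-28"

    # The calling shell runs this first; basename of the literal "{}" is "{}".
    expanded=$(basename {})
    echo "what find actually receives: $expanded"

    # find therefore sees ".../prefix/{}" and replaces {} with the whole path.
    wrong=$(find "$tmp" -mindepth 1 -maxdepth 1 -type d \
        -exec echo "removing: /some/prefix/$(basename {})" \;)
    echo "$wrong"

    rm -rf -- "$tmp"
    ```

    The basename call has to happen inside a shell that find starts for each path, which is what the bash -c form does.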

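    The minimum needed to fix this is to invoke bash from -exec properly, passing the found path as a positional parameter so that basename runs once per path inside the child shell, e.g.

    -execdir bash -c 'filepath="$1" ; base=$(basename "$filepath") ; echo "use $filepath and $base here"' -- {} \;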
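
    The child shell does not see unexported variables from the calling script, so the target directories have to be passed in as well. A hedged sketch of one way to do that, using extra positional parameters; the scratch directory, /tmp/incr and the "would remove" message are stand-ins chosen for this example, and nothing is deleted:

    ```shell
    #!/usr/bin/env bash
    full_backup=$(mktemp -d)        # stand-in for the full backups directory
    incremental_backup=/tmp/incr    # hypothetical path, only printed here
    mkdir "$full_backup/2013-08-12_17-51-28" "$full_backup/with space"

    # Words after the script text become $0, $1, $2, ... inside bash -c.
    # "_" fills $0, the two directories are $1 and $2, the found path is $3.
    out=$(find "$full_backup" -mindepth 1 -maxdepth 1 -type d -exec bash -c '
        full=$1 incr=$2
        base=$(basename "$3")
        echo "would remove: $full/$base and $incr/$base"
    ' _ "$full_backup" "$incremental_backup" {} \;)
    echo "$out"

    rm -rf -- "$full_backup"
    ```

    Because the path arrives as "$3" rather than being pasted into the script text, names containing spaces or quotes are handled without extra escaping.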
    

    But how about this instead:

    #!/usr/bin/env bash

    backup_base=/BACKUP/db01/physical/incremental
    full_backup="$backup_base"/full
    incremental_backup="$backup_base"/incr
    keep=5    # number of newest full backups to keep
    rm=echo   # dry run: prints the rm arguments instead of deleting

    n=0
    # Each NUL-terminated record is "<mtime> <path>", sorted newest first.
    # -mindepth 1 keeps the full backup directory itself out of the list.
    while IFS= read -r -d '' line ; do
        file="${line#* }"    # strip the leading timestamp
        if [[ $n -lt $keep ]] ; then
            n=$((n + 1))
            continue
        fi
        base=$(basename "$file")
        echo "removing: $full_backup/$base"
        "$rm" -rf -- "$full_backup"/"$base"
        echo "removing: $incremental_backup/$base"
        "$rm" -rf -- "$incremental_backup"/"$base"
    done < <(find "$full_backup" -mindepth 1 -maxdepth 1 -printf '%T@ %p\0' 2>/dev/null | sort -z -r -n)
    

    This iterates over the files and directories immediately under the full backup directory, newest first, and skips the 5 newest. For every remaining entry, it deletes the entry of the same name from both the full and the incremental directories.

    This is an essentially safe version, except of course for timing issues: the directory contents can change between the find and the rm.

    I have defined rm as being echo to avoid accidental deletes; swap it back to rm for actual deletion once you're sure it's correct.
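
    To check the skip-the-newest logic without going near real backups, something like the following can be run in a scratch directory. It is a sketch under assumptions: list_expired is a helper name made up for this example, the backup-01 to backup-04 directories and their dates are invented, and GNU find and sort are required for -printf and -z.

    ```shell
    #!/usr/bin/env bash
    # Print the base names of all but the $2 newest entries directly under $1.
    list_expired() {
        local dir=$1 keep=$2 n=0 line
        while IFS= read -r -d '' line ; do
            if (( n < keep )) ; then
                n=$((n + 1))
                continue
            fi
            basename "${line#* }"    # drop the timestamp, then the directory part
        done < <(find "$dir" -mindepth 1 -maxdepth 1 -printf '%T@ %p\0' | sort -z -r -n)
    }

    sandbox=$(mktemp -d)
    for day in 01 02 03 04 ; do
        mkdir "$sandbox/backup-$day"
        # Give each directory a distinct, known modification time.
        touch -t "201308${day}1200" "$sandbox/backup-$day"
    done

    expired=$(list_expired "$sandbox" 2)
    echo "$expired"

    rm -rf -- "$sandbox"
    ```

    With keep set to 2, only the two oldest directories should be listed, newest of those first.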