
bash: rename files dropping a specific delimited part of the filename


I've been trying to find an efficient way to rename lots of files in bash on Linux by removing a specific component of the filename. Filenames look like:

DATA_X3.A2022086.40e50s.231.2022087023101.csv

I want to remove the 2nd to last element entirely, resulting in:

DATA_X3.A2022086.40e50s.231.csv

I've seen suggestions to use perl-rename, which might handle this (I'm not sure), but this system does not have perl-rename available. It has GNU bash 4.2 and rename from util-linux 2.23.


Solution

  • I like extended globbing and parameter parsing for things like this.

    $: shopt -s extglob
    $: n=DATA_X3.A2022086.40e50s.231.2022087023101.csv
    $: echo ${n/.+([0-9]).csv/.csv}
    DATA_X3.A2022086.40e50s.231.csv
    

    So ...

    for f in *.csv; do mv "$f" "${f/.+([0-9]).csv/.csv}"; done
    

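    This assumes all the files are in the current directory and that it holds no other similarly named CSV files you want left alone. Run it only once: on an already-renamed file such as DATA_X3.A2022086.40e50s.231.csv, a second pass would also strip the .231 field.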
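A slightly more defensive version of that loop might look like the sketch below. The function names `strip_stamp` and `rename_all` are hypothetical, not part of the original answer; the sketch anchors the pattern to the end of the name with `%`, skips names that would not change, and refuses to overwrite an existing target.

```shell
#!/usr/bin/env bash
shopt -s extglob nullglob   # extglob for +([0-9]); nullglob so no match means no iterations

# strip_stamp NAME: print NAME with a trailing ".<digits>.csv" reduced to ".csv".
# The leading % anchors the match to the end of the name.
strip_stamp() {
  printf '%s\n' "${1/%.+([0-9]).csv/.csv}"
}

# rename_all [DIR]: hypothetical driver around strip_stamp (DIR defaults to ".").
rename_all() {
  local dir=${1:-.} f new
  for f in "$dir"/*.csv; do
    new=$(strip_stamp "$f")
    [[ $new == "$f" ]] && continue          # nothing to strip
    if [[ -e $new ]]; then                  # never clobber an existing file
      printf 'skip (target exists): %s\n' "$f" >&2
      continue
    fi
    mv -- "$f" "$new"
  done
}
```

Calling `rename_all` with no argument works on the current directory; `strip_stamp` can be run by itself first to preview what a given name would become.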

    edit

    In the more general case where the .csv is not immediately following the component to be removed, is there a way to drop the nth dot-separated component in the filename (without a more complicated sequence to string-split in bash, which always seems cumbersome, and rebuild the filename)?

    There is usually a way. If you know which field needs to be removed -

    $: ( IFS=. read -ra line <<< "$n"; unset 'line[4]'; IFS=".$IFS"; echo "${line[*]}" )
    DATA_X3.A2022086.40e50s.231.csv
    

    Breaking that out:

    (                               # open a subshell to localize IFS
      IFS=. read -ra line <<< "$n"; # inline set IFS to . to parse to fields 
      unset 'line[4]';              # unset the desired field (quoted to avoid globbing)
      IFS=".$IFS";                  # prepend . as the OUTPUT separator
      echo "${line[*]}"             # reference with * to reinsert
    )                               # closing the subshell restores IFS
    

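    The inline setting of IFS doesn't work on the reassembly because the shell expands "${line[*]}" before it applies a temporary IFS=. assignment placed in front of echo, so the join would still use the old IFS.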
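The same split/drop/rejoin steps can be wrapped in a reusable function. This is only a sketch: the name `drop_field`, the 1-based field numbering, and the negative "count from the end" convention are assumptions made for illustration, not something from the question.

```shell
#!/usr/bin/env bash

# drop_field NAME N: print NAME with its Nth dot-separated field removed.
# N is 1-based; a negative N counts from the end (-2 = second to last).
# Both conventions are assumptions of this sketch.
drop_field() {
  local name=$1 n=$2 parts
  IFS=. read -ra parts <<< "$name"              # split on dots
  if (( n < 0 )); then
    n=$(( n + ${#parts[@]} + 1 ))               # translate to a 1-based position
  fi
  if (( n < 1 || n > ${#parts[@]} )); then      # out of range: echo name unchanged
    printf '%s\n' "$name"
    return 1
  fi
  unset 'parts[n-1]'                            # quoted so the shell cannot glob it
  local IFS=.                                   # function-local; caller's IFS untouched
  printf '%s\n' "${parts[*]}"                   # [*] joins on the first char of IFS
}
```

For the filename in the question, `drop_field "$n" 5` and `drop_field "$n" -2` both remove the timestamp field. Declaring `IFS` with `local` does the job the subshell did in the one-liner.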

    This is a simple split/drop-field/reassemble, but I think it may be an X/Y problem.

    If what you are doing is dropping the one field that has the date/timestamp info, then as long as the format of that field is consistent and unique, it's probably easier to use a version of the first approach.

    Is it possible you meant for DATA_X3.A2022086.40e50s.231.2022087023101.csv's 5th field to be 20220807023101? i.e., August 7th of 2022 @ 02:31:01 AM? Because if that's what you mean, and it's supposed to be 14 digits instead of 13, and that is the only field that is always supposed to be exactly 14 digits, then you don't need shopt and can leave the field position floating -

    $: n=DATA_X3.A2022086.40e50s.231.20220807023101.csv
    $: echo ${n/.[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]./.}
    DATA_X3.A2022086.40e50s.231.csv
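Typing fourteen copies of `[0-9]` by hand is error-prone, so the pattern can be built in a loop instead. The helpers `digits_pattern` and `strip_fixed_stamp` below are hypothetical names for a sketch of that idea; the default width of 14 follows the assumption above about the timestamp length.

```shell
#!/usr/bin/env bash

# digits_pattern COUNT: print "[0-9]" repeated COUNT times.
digits_pattern() {
  local i pat=
  for (( i = 0; i < $1; i++ )); do
    pat+='[0-9]'
  done
  printf '%s\n' "$pat"
}

# strip_fixed_stamp NAME [COUNT]: remove the first ".<COUNT digits>." field
# from NAME (COUNT defaults to 14). The dots on both sides mean a field with
# more or fewer digits is left alone.
strip_fixed_stamp() {
  local pat
  pat=$(digits_pattern "${2:-14}")
  printf '%s\n' "${1/.$pat./.}"   # $pat stays unquoted so it is used as a pattern
}
```

With the 13-digit name from the question, `strip_fixed_stamp "$n"` returns the name unchanged, while `strip_fixed_stamp "$n" 13` removes the field.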