
RegEx in Shell script concatenates RegEx filter into filename


I want to detect whether a file with a certain datestamp in its name exists in a directory. However, a timestamp is also concatenated to the datestamp. I can calculate the datestamp but not the timestamp, so I need some kind of wildcard.

This is the example filename: backup_myserver_202310152100.log

This is my expression: log_file="backup_${hostname_arg}_${logfile_sunday}[0-9]*.log"

But it actually searches for this filename (checked with echo): backup_myserver_20231015[0-9]*.log

The timestamp is always represented by four digits, if that helps.

I tried to find the solution on the web and in other questions here, but it seems I'm just not made for RegEx.

I tried the expression above as well as the following: log_file="backup_${hostname_arg}_${logfile_sunday}{....}.log" since I read that a single dot represents one character. Sources differ widely in their answers.


Solution

  • When you assign the variable, it is the pattern itself (backup_myserver_20231015[0-9]*.log) that is stored in it, not its expansion to existing file names, because 1) you double-quoted it (pathname expansion is not done between quotes) and 2) pathname expansion is never done on the right-hand side of variable assignments.

    A simple way to verify all this is to type declare -p log_file to see what log_file is and what value it has. You can also type echo "$log_file", which prints the pattern, and echo $log_file, which prints the file names if they exist: because $log_file is not quoted there, pathname expansion is performed before echo is executed.
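The difference between the stored pattern and its expansion can be checked in a scratch directory. This is a minimal sketch, assuming bash and mktemp are available; the two sample file names and the variable values are made up to mirror the question.

```bash
#!/usr/bin/env bash
# Work in a scratch directory holding two sample log files (made-up names).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1
touch backup_myserver_202310152100.log backup_myserver_202310152101.log

hostname_arg=myserver      # sample value
logfile_sunday=20231015    # sample value

# The double quotes store the pattern text itself; no file lookup happens here.
log_file="backup_${hostname_arg}_${logfile_sunday}[0-9]*.log"

quoted=$(echo "$log_file")   # quoted: the literal pattern is passed to echo
unquoted=$(echo $log_file)   # unquoted: the shell expands the glob first

declare -p log_file
printf 'quoted:   %s\n' "$quoted"
printf 'unquoted: %s\n' "$unquoted"
```

The unquoted form only appears to work: the matching happens at the moment the variable is used, not when it is assigned.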

    A better way is to assign an array from the unquoted pattern with the nullglob option on, so that the pattern expands to nothing (leaving an empty array) if there are no matching log files (output prefixed with -| ):

    shopt -s nullglob
    log_file=( backup_${hostname_arg}_${logfile_sunday}[0-9]*.log )
    if (( ${#log_file[@]} == 0 )); then
      printf 'no log files\n'
    else
      printf '%s\n' "${log_file[@]}"
    fi
    -| backup_myserver_202310152100.log
    -| backup_myserver_202310152101.log
    ...
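If all you need is a yes/no answer, the array test can be wrapped in a small function. This is only a sketch: has_backup_log is a hypothetical helper name, the optional directory argument is an assumption, and the demo uses a scratch directory with one made-up file.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds if DIR contains at least one log for HOST and DAY.
# The function body is a ( subshell ), so nullglob is not left enabled afterwards.
has_backup_log() (
  host=$1 day=$2 dir=${3:-.}
  shopt -s nullglob
  # The variables are quoted so their values are matched literally;
  # [0-9]* stays unquoted so it is still treated as a glob.
  matches=( "$dir"/backup_"$host"_"$day"[0-9]*.log )
  (( ${#matches[@]} > 0 ))
)

# Demo in a scratch directory with one sample file.
demo_dir=$(mktemp -d)
touch "$demo_dir/backup_myserver_202310152100.log"

if has_backup_log myserver 20231015 "$demo_dir"; then
  echo "log for 20231015 found"
else
  echo "log for 20231015 missing"
fi
rm -rf "$demo_dir"
```

Running the body in a subshell is a design choice: it keeps shopt -s nullglob from changing how globs behave in the rest of the script.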
    

    And if you want to match exactly 4 digits you can try:

    log_file=( backup_${hostname_arg}_${logfile_sunday}[0-2][0-9][0-5][0-9].log )
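One reason to prefer the four bracket expressions: in a glob, [0-9]* means one digit followed by any characters, not "any number of digits" as it would in a regular expression. The sketch below compares both patterns in a scratch directory; the three file names are made up for illustration.

```bash
#!/usr/bin/env bash
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# One well-formed name and two near misses (all made-up sample names).
touch backup_myserver_202310152100.log \
      backup_myserver_202310152100_partial.log \
      backup_myserver_2023101521.log

shopt -s nullglob
# Loose: a single digit, then anything at all, then ".log".
loose=( backup_myserver_20231015[0-9]*.log )
# Strict: exactly four digits shaped like HHMM, then ".log".
strict=( backup_myserver_20231015[0-2][0-9][0-5][0-9].log )

printf 'loose matches %d file(s), strict matches %d file(s)\n' \
  "${#loose[@]}" "${#strict[@]}"
```

With these sample files the loose pattern also picks up both near misses, while the strict pattern matches only the well-formed name.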
    

    You could also use find from GNU findutils, if you have it, which supports regular expressions instead of the simpler glob patterns:

    find . -maxdepth 1 -regextype gnu-awk -type f -regex "\./backup_${hostname_arg}_${logfile_sunday}[0-9]{4}\.log"
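If GNU find is not available, bash itself can apply an extended regular expression with [[ string =~ regex ]]. This is an alternative sketch, not part of the answer above; the sample values, the scratch directory and the file names are assumptions made for the demo.

```bash
#!/usr/bin/env bash
hostname_arg=myserver      # sample value
logfile_sunday=20231015    # sample value

# Extended regular expression: prefix, exactly four digits, then ".log".
# Note: regex metacharacters inside the variables (for example dots in a
# host name) would be interpreted as regex syntax here.
re="^backup_${hostname_arg}_${logfile_sunday}[0-9]{4}\.log\$"

# Scratch directory with one matching name and two near misses.
scratch=$(mktemp -d)
touch "$scratch/backup_myserver_202310152100.log" \
      "$scratch/backup_myserver_20231015210.log" \
      "$scratch/backup_myserver_202310152100.log.gz"

found=()
for path in "$scratch"/backup_*; do
  name=${path##*/}               # strip the directory part
  # $re must stay unquoted on the right of =~ to be treated as a regex.
  if [[ $name =~ $re ]]; then
    found+=( "$name" )
  fi
done

printf '%s\n' "${found[@]}"
rm -rf "$scratch"
```

Here the glob backup_* only narrows the candidates; the regular expression does the exact matching, including the four-digit requirement.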