I need to delete JPG and jpg files based on the following conditions.
The folder has multiple JPG and jpg files. Each file is named like 172.30.165.212_20241231_132125.JPG, where 172.30.165.212 is the IP address, 20241231 is the date in YYYYMMDD format, and 132125 is the time in HHMMSS format.
The delete conditions are:
1- The script should always keep the most recent file per IP address, based on the date/time in its filename, no matter how old that date/time is.
2- Since each IP address can have multiple files, the script should delete all of the remaining files whose date/time in the filename is more than 2 hours older than the current time.
3- Never look at the file's modification date/time, only the one found in the name.
I have tried the following script with no luck: no files get deleted.
#!/bin/bash
# Define target directory
TARGET_DIR="/mnt/moe/results"
# Log start time
echo "[$(date)] Starting cleanup process in $TARGET_DIR"
# Function to process files for a single IP
process_ip_files() {
local ip_prefix=$1
local ip_files
# Find files matching the IP
ip_files=$(find "$TARGET_DIR" -type f -iname "${ip_prefix}_*" | sort)
# Skip if no files
if [[ -z "$ip_files" ]]; then
echo "No files found for IP: $ip_prefix"
return
fi
echo "Processing IP: $ip_prefix"
# Variables to track files and the most recent file
local most_recent_file=""
local most_recent_time=0
local files_to_delete=()
# Get current time in seconds since epoch
current_time=$(date +%s)
# Iterate over files to determine the most recent and deletion criteria
while IFS= read -r file; do
echo "Processing file: $file"
# Remove the path and get just the file name
base_file=$(basename "$file")
echo "Base file name: $base_file"
# Split file name into components
IFS='_' read -r ip date time ext <<< "$base_file"
# Validate the expected number of fields and format
if [[ -z "$ip" || -z "$date" || -z "$time" || "$ext" != "JPG" && "$ext" != "jpg" ]]; then
echo " Skipping file (does not match expected format): $file"
continue
fi
# Check the timestamp format (YYYYMMDD HHMMSS)
if ! [[ "$date" =~ ^[0-9]{8}$ ]] || ! [[ "$time" =~ ^[0-9]{6}$ ]]; then
echo " Skipping file (invalid timestamp format): $file"
continue
fi
# Convert to seconds since epoch
timestamp="$date $time"
file_time=$(date -d "$timestamp" +%s)
echo " File: $file"
echo " Timestamp: $timestamp"
echo " File time (epoch): $file_time"
echo " Current time (epoch): $current_time"
# Check if this file is the most recent one for the IP
if (( file_time > most_recent_time )); then
# If we already have a most recent file, we add it to the delete list
if [[ -n "$most_recent_file" ]]; then
files_to_delete+=("$most_recent_file")
fi
most_recent_file="$file"
most_recent_time="$file_time"
else
# Check if the file is older than 2 hours (7200 seconds)
if (( current_time - file_time > 7200 )); then
echo " Marking for deletion: $file"
files_to_delete+=("$file")
fi
fi
done <<< "$ip_files"
# Display the most recent file for this IP
echo "Most recent file for IP $ip_prefix: $most_recent_file"
# Deleting files not the most recent one
if [[ ${#files_to_delete[@]} -gt 0 ]]; then
echo "Files marked for deletion for IP $ip_prefix:"
for file in "${files_to_delete[@]}"; do
echo " - $file"
done
for file in "${files_to_delete[@]}"; do
if [[ "$file" != "$most_recent_file" ]]; then
echo "Deleting file: $file"
rm -v "$file"
fi
done
else
echo "No files to delete for IP $ip_prefix."
fi
}
# Process unique IP addresses
find "$TARGET_DIR" -type f \( -iname "*.jpg" -o -iname "*.JPG" \) -printf "%f\n" | \
awk -F'_' '{print $1}' | sort -u | while read -r ip; do
process_ip_files "$ip"
done
# Log completion
echo "[$(date)] Cleanup process finished."
As an example, I have the following files, and the current date/time is 20241231 13:30:
172.30.165.212_20241231_132125.JPG
172.30.165.212_20241231_122125.JPG
172.30.165.212_20241231_112125.JPG
172.30.165.212_20241231_102125.JPG
172.30.165.212_20241231_092125.JPG
172.30.165.213_20241231_062125.JPG
172.30.165.213_20241231_032125.JPG
172.30.165.213_20241231_012125.JPG
Script should delete
172.30.165.212_20241231_112125.JPG (older than 2 hours)
172.30.165.212_20241231_102125.JPG (older than 2 hours)
172.30.165.212_20241231_092125.JPG (older than 2 hours)
172.30.165.213_20241231_032125.JPG (older than 2 hours)
172.30.165.213_20241231_012125.JPG (older than 2 hours)
Script should keep
172.30.165.212_20241231_132125.JPG (younger than 2 hours)
172.30.165.212_20241231_122125.JPG (younger than 2 hours)
172.30.165.213_20241231_062125.JPG (older than 2 hours but most recent from this ip address)
Instead of trying to critique a 100+ line script I propose the following alternative:
$ cat delfiles
#!/bin/bash
TARGET_DIR="/mnt/moe/results"       # OP's directory; update accordingly (eg, TARGET_DIR='.' in my case)
unset prev_ip
printf -v now "%(%s)T" -1           # get current time in epoch format (-1 = current time, spelled out explicitly)
now=1735673400                      # hardcoded to OP's 'current date' of '2024-12-31 13:30:00'
                                    # (this epoch is 13:30:00 in a UTC-6 timezone; adjust for yours);
                                    # otherwise comment/remove this line for normal operations
(( now-=7200 ))                     # subtract 2 hours
while read -r fname
do
    IFS='_' read -r ip dt tm ext <<< "${fname}"      # tm also holds the extension (eg, 132125.JPG); ext stays empty
    [[ "${ip}" != "${prev_ip}" ]] && {               # if new ip then this is the latest file for said ip so ...
        prev_ip="${ip}"                              # save the new ip and ...
        continue                                     # skip to next file (ie, keep this file)
    }
    epoch=$(date -d "${dt:0:4}-${dt:4:2}-${dt:6:2} ${tm:0:2}:${tm:2:2}:${tm:4:2}" '+%s')
    (( epoch < now )) && echo rm "${TARGET_DIR}/${fname}"    # if file's epoch is more than 2 hrs old then remove the file;
                                                             # NOTE: remove the 'echo' to perform the actual deletion
done < <(find "${TARGET_DIR}" -type f -iname '*.jpg' -printf "%f\n" | sort -rV)
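To make the field splitting in the loop easier to follow, here is a minimal sketch of what the IFS='_' read line produces for one of OP's sample names. The name contains only two underscores, so the fourth variable stays empty and the time field keeps the extension; that is why the script above slices tm by offset instead of using it whole. It may also be why the question's script deletes nothing, since its check on $ext would reject every file, though I have not traced that script in full.

```shell
#!/bin/bash
# Split one sample filename (taken from the question) on '_'.
fname='172.30.165.212_20241231_132125.JPG'
IFS='_' read -r ip dt tm ext <<< "$fname"

# Only two underscores exist, so 'ext' receives nothing and 'tm' keeps ".JPG".
printf 'ip=%s dt=%s tm=%s ext=%s\n' "$ip" "$dt" "$tm" "$ext"

# Parameter expansion separates the time from the extension if both are needed.
printf 'time only: %s, extension: %s\n' "${tm%.*}" "${tm##*.}"

# Fixed-offset substrings build the "YYYY-MM-DD HH:MM:SS" string passed to date -d.
stamp="${dt:0:4}-${dt:4:2}-${dt:6:2} ${tm:0:2}:${tm:2:2}:${tm:4:2}"
printf 'date string: %s\n' "$stamp"
```

The substring offsets only read the first six characters of tm, so the trailing extension is harmless there.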
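NOTES:
- the find results are sorted with -rV, ie, in reverse order using a version sort of the ip + date/timestamp, so the first file seen for each ip is its most recent one
- the $(date -d ... '+%s') calls (one subprocess per file) could be replaced with something like this answer's coproc solution
Running against OP's file set generates: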
$ ./delfiles
rm ./172.30.165.213_20241231_032125.JPG
rm ./172.30.165.213_20241231_012125.JPG
rm ./172.30.165.212_20241231_112125.JPG
rm ./172.30.165.212_20241231_102125.JPG
rm ./172.30.165.212_20241231_092125.JPG
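The notes mention replacing the per-file date calls. A different way to avoid them (this is not the coproc approach, just a sketch of an alternative) is to format the cutoff once as a 14-digit YYYYMMDDHHMMSS string and compare it with the digits from the filename. The function name list_deletions and the -maxdepth 1 restriction are my own assumptions, not part of the script above. The comparison works on local wall-clock time, so around a daylight-saving change the 2-hour window can be off by an hour.

```shell
#!/bin/bash
# Sketch: print the files that would be deleted, without calling date per file.
# Usage: list_deletions DIR CUTOFF    (CUTOFF is a 14-digit YYYYMMDDHHMMSS string)
# Assumes the files sit directly in DIR (hence -maxdepth 1) and GNU find/sort.
list_deletions() {
    local dir=$1 cutoff=$2 prev_ip='' fname ip dt tm
    while IFS= read -r fname; do
        IFS='_' read -r ip dt tm <<< "$fname"
        tm=${tm%.*}                                   # 132125.JPG -> 132125
        # ignore names that do not carry an 8-digit date and a 6-digit time
        [[ $dt =~ ^[0-9]{8}$ && $tm =~ ^[0-9]{6}$ ]] || continue
        if [[ $ip != "$prev_ip" ]]; then              # newest file for this ip: always keep
            prev_ip=$ip
            continue
        fi
        # both sides are 14 digits, so string order equals chronological order
        [[ "$dt$tm" < "$cutoff" ]] && printf '%s\n' "$dir/$fname"
    done < <(find "$dir" -maxdepth 1 -type f -iname '*.jpg' -printf '%f\n' | sort -rV)
    return 0
}

# Cutoff = local time two hours ago, formatted like the stamp in the filenames.
printf -v cutoff '%(%Y%m%d%H%M%S)T' "$(( $(printf '%(%s)T' -1) - 7200 ))"

# Dry run: only prints candidates; pipe to xargs rm (or loop) once verified.
# list_deletions /mnt/moe/results "$cutoff"
```

With OP's sample files and a cutoff of 20241231113000 (2 hours before 13:30), this should list the same five files as the run above.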