How to detect the differences between two files and repeat the comparison when a special character is found?

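I need to detect ALL the differences between two files. file2 is split into groups by a special separator line (--), and the comparison against file1 must be repeated for each group. For every group, the lines of file1 that are missing from that group should be written to a third file.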

If file1 is:

a
b
c
d

and file2 is:

1:
b
d
--
2:
a
--
3:
c
a

then the expected output is:

1:
a
c
--
2:
b
c
d
--
3:
b
d

Do you have any suggestions? Everything I tried printed the differences from only one file, not both.

My code:

#!/bin/bash

file1=file1
file2=file2
output_file=Filee

# Compare the files and store the differences in a temporary file
diff_file=$(mktemp)
diff --changed-group-format='%<' --unchanged-group-format='' "$file1" "$file2" > "$diff_file"

# Process the differences and write them to the output file
group_number=1
current_group=""
while IFS= read -r line; do
    if [[ $line == -- ]]; then
        if [[ -n $current_group ]]; then
            echo "--" >> "$output_file"
            ((group_number++))
        fi
    else
        if [[ -z $current_group ]]; then
            echo "$group_number:" >> "$output_file"
        fi
        echo "$line" >> "$output_file"
        current_group=$group_number
    fi
done < "$diff_file"

# Remove the temporary file
rm "$diff_file"

echo "Comparison completed. Results saved to $output_file"

Solution

I'm looking forward to your solution. I wonder how you attempted to solve it :-)

I've made something that uses regex and also works for multi-character strings (as long as they contain no spaces). It extracts each group from file2 with a regex, then deletes every matching entry from a copy of file1. The comments in the script explain each step.

#! /bin/bash

# Disable history expansion, aka `!`
set +H


readonly ID=(1 2 3)
readonly FILE_IN=file1
readonly FILE_TST=file2
readonly FILE_OUT=file3

get() {
        local id=${1?No ID given to ${FUNCNAME[0]}}
        local d=${2?No delimiter given to ${FUNCNAME[0]}}
        local f=${3?No file given to ${FUNCNAME[0]}}
        # ^                     - start of line (file due to -0 option)
        # (.|\n)*$id:\n         - Skip everything until $id: is found
        # (?:(?!$d)(.|\n))*     - Match everything until delimiter
        # ($d|$)                - Delimiter or end of line (file in this case)
        # (.|\n)*               - The rest (if any)
        perl -0pe "s/^(.|\n)*$id:\n((?:(?!$d)(.|\n))*)($d|$)(.|\n)*/\$2\n/" "$f"
}

# Truncate file
: > "${FILE_OUT}"
for i in "${ID[@]}"; do

        # Get the entries of group $i from "get". The result is split on
        # whitespace, so entries must not contain spaces.
        data=($(get "${i}" '--' "${FILE_TST}"))

        # Build one anchored sed delete command per entry (e.g. /^bc$/d;)
        # instead of one long regexp, so only whole lines are deleted.
        regxp="$(printf '%s\n' "${data[@]}" | sed -r 's/^/\/^/; s/$/$\/d;/')"

        # Populate file
        cat >> "${FILE_OUT}" <<-EOF
${i}:
$(sed -r "${regxp}" "${FILE_IN}")
EOF
done

# Show output
echo "File ${FILE_OUT} finished"

exit 0
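
The multi-line regex in get() is the hardest part to read, so here is a rough sketch of the same extraction step done line by line with awk. It is an alternative and not part of the script above: get_group is a hypothetical helper name, and it assumes the group header is exactly "ID:" on its own line and the delimiter is a line of its own.

```shell
#!/bin/bash

# Hypothetical helper: print the lines of group "$1" found in file "$3".
# "$2" is the delimiter line that ends a group (e.g. --).
get_group() {
        local id=$1 sep=$2 file=$3
        awk -v hdr="${id}:" -v sep="$sep" '
                $0 == hdr             { in_group = 1; next }  # header found, start collecting
                in_group && $0 == sep { exit }                # delimiter ends the group
                in_group              { print }               # a line inside the group
        ' "$file"
}
```

With the sample file2 further down, get_group 1 -- file2 prints bc and de. The result could then feed grep -vxFf <(get_group 1 -- file2) file1, which keeps only the lines of file1 that are not exact, fixed-string matches of a group entry, so no regex has to be built at all.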

file1

$ cat file1 
ab
bc
cd
de

file2

1:
bc
de
--
2:
ab
--
3:
cd
ab

file3

1:
ab
cd
2:
bc
cd
de
3:
bc
de
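
Note that file3 above has no -- lines between the groups, while the expected output in the question does. Below is a minimal single-pass sketch that copies the headers and separators through, so the result has the layout asked for. It is a different technique (one awk run, exact whole-line comparison) rather than the regex approach above. missing_per_group is a made-up name, and it assumes headers look like "N:", separators are exactly "--", file1 is not empty, and file1 itself contains no header or separator lines.

```shell
#!/bin/bash

# Hypothetical helper: for each group in the grouped file ($2), print the
# lines of the reference file ($1) that do not occur in that group.
missing_per_group() {
        local ref=$1 grouped=$2
        awk '
                # First file: remember every reference line in input order.
                NR == FNR { ref[++n] = $0; next }

                # Print the reference lines not seen in the current group,
                # then empty the "seen" set for the next group.
                function flush(   i) {
                        for (i = 1; i <= n; i++)
                                if (!(ref[i] in seen)) print ref[i]
                        split("", seen)
                }

                /^[0-9]+:$/ { print; next }           # group header: copy it through
                $0 == "--"  { flush(); print; next }  # separator: close the group
                            { seen[$0] = 1 }          # ordinary line: mark as present
                END         { flush() }               # close the last group
        ' "$ref" "$grouped"
}
```

Usage would be missing_per_group file1 file2 > file3. With the single-letter file1 and file2 from the question, this gives the expected output shown there, including the -- lines.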

p.s. I'd rather have an up-vote. That silver bash badge is coming my way \0/