Edit folder and xml files at once with comma delimited csv using sed


I'm trying to write a Bash script that renames an XML file and changes a line inside it, in each of more than 500 folders. A member of the community helped with the first iteration of this wonderful script.

I'd prefer to do this with sed in a Bash script rather than xmlstarlet, as I don't want to install it on this VM. The directory layout makes this a little awkward, because I need to change several things at the same time.

This is an example of a file to be modified:

/var/opt/FTPserver/users/MainUsers/junk/VFS/junk2.xml
                                   ^^^^     ^^^^^

With junk2.xml containing a line:

<url>file://home/FTPserver/Customer/junk2/</url>
                                    ^^^^^

I would like to rename the file to the following:

/var/opt/FTPserver/users/MainUsers/junk/VFS/treasure.xml
                                   ^^^^     ^^^^^^^^

With newly named treasure.xml containing the modified line:

<url>file://home/FTPserver/Customer/treasure/</url>
                                    ^^^^^^^^

customerlogin.csv is comma-delimited with three columns: oldStr, midStr, and newStr. In my test example these are junk, junk2, and treasure.

CSV Example

junk,junk2,treasure
help,helpful,helping
old,middle,new
dog,dog,cat

Script

dir='/var/opt/FTPserver/users/MainUsers/'
while IFS=, read -r oldStr midStr newStr; do
    oldFile="$dir/$oldStr/VFS/${midStr}.xml"
    newFile="$dir/$oldStr/VFS/${newStr}.xml"

    if [[ -f "$oldFile" ]] && [[ ! -f "$newFile" ]]; then
        sed "s:/$midStr/:/$newStr/:" "$oldFile" > "$newFile" &&
        rm -f "$oldFile"
    fi
done < customerlogin.csv

When I run the script, it works exactly as I would like when all three columns have different values. However, if column A (oldStr) and column B (midStr) have the same value, such as dog and dog, the XML file is renamed to cat but its contents still say dog.

Result I receive if the (oldStr) value equals (midStr)

/var/opt/FTPserver/users/MainUsers/dog/VFS/cat.xml
                                   ^^^     ^^^
<url>file://home/FTPserver/Customer/dog</url>
                                    ^^^

Result I receive if the (oldStr) value does not equal (midStr)

/var/opt/FTPserver/users/MainUsers/help/VFS/helpful.xml
                                   ^^^^     ^^^^^^^
<url>file://home/FTPserver/Customer/helping</url>
                                    ^^^^^^^

I'm not sure what is causing this and how I would go about fixing it.


Solution

  • Your regexp is looking for /dog/ but your XML files have /dog< instead (note the absence of a / between /dog and </url>):

    <url>file://home/FTPserver/Customer/dog</url>
    

    so that regexp won't match, whether or not $oldStr equals $midStr. It therefore can't be true that helpful is becoming helping in the file, as your second example shows.

    In your previous question when trying to match /junk/ you had input files that looked like (note the / between /junk and </url>):

    <url>file://home/FTPserver/Customer/junk/</url>
    

    If your current XML files followed that same format, the file would look like this (note the / now between /dog and </url>):

    <url>file://home/FTPserver/Customer/dog/</url>
    

    so /dog/ would be present, match the regexp, and be replaced.

    If the files you want to modify may or may not have a / before the <, but always have the <, then change this:

    sed "s:/$midStr/:/$newStr/:"
    

    to this, using GNU or BSD sed for -E:

    sed -E "s:/$midStr(/?<):/$newStr\1:"
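
As a quick check, the sketch below runs that substitution against both URL forms from the question. The `replace_segment` function name is hypothetical and only wraps the sed command for the demonstration. It assumes the CSV values contain no regexp metacharacters and no `:` (the delimiter used in the `s` command), which holds for the sample rows but may not hold for every real login name.

```shell
#!/usr/bin/env bash

# Hypothetical helper, not part of the original script: reads XML on stdin
# and replaces the path segment /$1 with /$2 when it is followed by an
# optional / and then a <.
replace_segment() {
    local midStr=$1 newStr=$2
    # (/?<) captures the optional slash plus the <, and \1 puts it back,
    # so a trailing slash is kept if present and not added if absent.
    sed -E "s:/$midStr(/?<):/$newStr\1:"
}

# Form without a trailing slash (the dog,dog,cat row).
printf '%s\n' '<url>file://home/FTPserver/Customer/dog</url>' |
    replace_segment dog cat

# Form with a trailing slash (the junk,junk2,treasure row).
printf '%s\n' '<url>file://home/FTPserver/Customer/junk2/</url>' |
    replace_segment junk2 treasure
```

Because the match requires the `<` right after the optional slash, a value such as junk should not accidentally match inside /junk2/. In the loop from the question, the same sed expression would simply take the place of the existing one, with "$oldFile" and the redirection to "$newFile" left unchanged.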