
Why does an empty double-quoted field ("") appear as the last entry of a record in my output file? | shell |


I have 10 files, each containing a single column of data (one value per line). I consolidated them into one file in which each input file's values are laid out horizontally on one line.

file 1 :

A
B
C
B

file 2 :

P
W
R
S

file 3 :

E
U
C
S

The remaining files follow the same pattern.

I consolidated all the files using the script below:

cd /path/

#storing all file names to array_list to club data of all into one file 
array_list=`( awk -F'/' '{print $2}' )`

for i in ${array_list[@]}
do 
   sed 's/"/""/g; s/.*/"&"/' /path/$i | paste -s -d, >> /path/consolidate.txt
done 

Output obtained from the above script:

"A","B","C","B"
"P","W","R","S",""
"E","U","C","S"

Why does the second line have "" as its last entry -> "P","W","R","S",""

There are only four values in file 2, so it should be: "P","W","R","S"

Is this happening because of an empty line at the end of file 2?

A solution would be appreciated.
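
One way to test that hypothesis is to reproduce it on a throwaway file. The sketch below assumes a scratch file created with mktemp (a stand-in, not one of the original files) and runs the same sed and paste steps on it twice: once with a trailing empty line and once without. The variable names with_blank and without_blank are illustrative only.

```shell
# Hypothetical scratch file: four values followed by one empty line.
tmp=$(mktemp)
printf 'P\nW\nR\nS\n\n' > "$tmp"

# Same quoting and joining steps as in the script above.
# The explicit "-" tells paste to read standard input.
with_blank=$(sed 's/"/""/g; s/.*/"&"/' "$tmp" | paste -s -d, -)

# Same values again, this time without the trailing empty line.
printf 'P\nW\nR\nS\n' > "$tmp"
without_blank=$(sed 's/"/""/g; s/.*/"&"/' "$tmp" | paste -s -d, -)

printf '%s\n' "$with_blank" "$without_blank"
rm -f "$tmp"
```

The first printed line ends in ,"" and the second does not: sed wraps every line in quotes, including an empty one, so an empty line becomes an extra "" field.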


Solution

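  • I assume it is indeed caused by an empty line: sed wraps every line in quotes, so an empty line becomes "", which paste then joins on as an extra field. You could remove such 'mistakes' by updating your script to include sed 's/,""$//' like: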

    sed 's/"/""/g; s/.*/"&"/' /path/$i | paste -s -d, | sed 's/,""$//' >> /path/consolidate.txt
    

    Explanation of the above command, piece by piece

    Replace each double quote with two double quotes (the g option means do this for every match on each line, rather than just the first match):

    sed 's/"/""/g; 
    

    We use a semi-colon to tell sed that we will issue another command. The next substitute command to sed matches the entire line, and replaces it with itself, but surrounded by double quotes (the & represents the matched pattern):

    s/.*/"&"/' 
    

    This is the input file argument to the above sed command; the shell expands the variable i from the for loop:

    /path/$i 
    

    The above commands produce some output ('stdout'), which would by default be sent to the terminal. Instead of that, we use it as input ('stdin') to a subsequent command (this is called a 'pipeline'):

    | 
    

    The next command joins the lines of 'stdin' by replacing the newline characters with , delimiters (by default the delimiter would be a tab):

    paste -s -d, 
    

    We pipe the 'stdout' of the last command into another command (continuing the pipeline):

    | 
    

    The next command is another sed, this time substituting any occurrence of ,"" at the end of the line (in sed, $ means end of line) with nothing (in effect deleting the matched pattern):

    sed 's/,""$//' 
    

    The output of the above pipeline is appended to our text file (>> appends, whilst > overwrites):

    >> /path/consolidate.txt
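
One limitation worth noting: s/,""$// removes only a single trailing ,"" and leaves alone any empty fields produced by blank lines in the middle of a file. A possible variation, sketched here and not part of the command above, is to delete empty lines before quoting them. It assumes blank lines are never meant to be real empty values and that they are truly empty (no spaces or carriage returns). The scratch file and the names trimmed and filtered are hypothetical.

```shell
# Hypothetical input: one blank line in the middle and two at the end.
tmp=$(mktemp)
printf 'P\nW\n\nR\nS\n\n\n' > "$tmp"

# Approach from the command above: strips one trailing ,"" only.
trimmed=$(sed 's/"/""/g; s/.*/"&"/' "$tmp" | paste -s -d, - | sed 's/,""$//')

# Variation: /^$/d deletes empty lines before they are quoted.
filtered=$(sed '/^$/d; s/"/""/g; s/.*/"&"/' "$tmp" | paste -s -d, -)

printf '%s\n' "$trimmed" "$filtered"
rm -f "$tmp"
```

On this input the first line printed is "P","W","","R","S","" and the second is "P","W","R","S". If an empty line can be a legitimate empty value in your data, the filtering variation would silently drop it, so the original trailing-only cleanup may be the safer choice.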