
Bash: read text to the end and split it on a delimiter word


I have text like this:

RECORD = ijj%%% klfklk @ ; kjkjkdjjd 333 ; jkjdkj@kjk.com ; END ijj%%% klfklk @ ; kjkjkdjjd 333 ; 
jkjdkj@kjk.com ; END kjkjlsj ; popo ; END

I want to split it into 3 sentences. The split should read from the start until 'END' is encountered, then until the next 'END', and so on until the end of the text:

split 1 = ijj%%% klfklk @ ; kjkjkdjjd 333 ; jkjdkj@kjk.com ;
split 2 = ijj%%% klfklk @ ; kjkjkdjjd 333 ; jkjdkj@kjk.com ;
split 3 = kjkjlsj ; popo ;

The code I used does not make use of END. Can you suggest a fix?

echo "$RECORD" | while read -r line
do
    # Further processing on each of the split sentences
    email=$(echo "$line" | awk -F ';' '{print $1}')
    subject=$(echo "$line" | awk -F ';' '{print $2}')
    body=$(echo "$line" | awk -F ';' '{print $3}')

    echo "$body" | mail -s "$subject" 'sjhs@gmail.com'
done

Solution

  • Consider:

    $ s="ijj%%% klfklk @ ; kjkjkdjjd 333 ; jkjdkj@kjk.com ; END ijj%%% klfklk @ ; kjkjkdjjd 333 ; 
    jkjdkj@kjk.com ; END kjkjlsj ; popo ; END" 
    
    $ echo "$s" | tr -d '\n' | awk 'BEGIN{RS=" END ?"}1'
    ijj%%% klfklk @ ; kjkjkdjjd 333 ; jkjdkj@kjk.com ; 
    ijj%%% klfklk @ ; kjkjkdjjd 333 ; jkjdkj@kjk.com ; 
    kjkjlsj ; popo ;
    

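    Once the string delimited by ' END ' has been turned into one delimited by \n, you can feed it to a Bash loop and do whatever you need with each substring. Note that a multi-character RS is treated as a regular expression by gawk and mawk; POSIX leaves that behavior unspecified, so other awk implementations may differ.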

    cnt=1
    while read -r line; do
        printf "Line %s: %s\n" "$cnt" "'$line'"
        (( cnt++ ))
    done <<<"$(echo "$s" | tr -d '\n' | awk 'BEGIN{RS=" END ?"}1')"
    

    Prints:

    Line 1: 'ijj%%% klfklk @ ; kjkjkdjjd 333 ; jkjdkj@kjk.com ;'
    Line 2: 'ijj%%% klfklk @ ; kjkjkdjjd 333 ; jkjdkj@kjk.com ;'
    Line 3: 'kjkjlsj ; popo ;'
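
Inside that loop, each record still has to be broken into its ';'-separated fields. The following is only a sketch of one way to do it with `read` instead of three awk calls. The sample record, the field order (email, subject, body) and the `trim` helper are assumptions made for illustration, not part of the original question or answer.

```shell
#!/usr/bin/env bash
# Hypothetical sample record, as it would look after splitting on END.
record='alice@example.com ; Weekly report ; All systems normal ;'

# trim: hypothetical helper that strips leading and trailing whitespace.
trim() {
    local v=$1
    v="${v#"${v%%[![:space:]]*}"}"   # remove leading whitespace
    v="${v%"${v##*[![:space:]]}"}"   # remove trailing whitespace
    printf '%s' "$v"
}

# Split on ';' into three variables; anything after the third field lands in rest.
IFS=';' read -r email subject body rest <<<"$record"
email=$(trim "$email")
subject=$(trim "$subject")
body=$(trim "$body")

# A mail command like the one in the question would go here instead of printf.
printf 'to=[%s] subject=[%s] body=[%s]\n' "$email" "$subject" "$body"
```

Setting `IFS` only for the `read` command keeps the change local, so the rest of the script still splits on the default whitespace.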
    

    If you want pure Bash, you could do:

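    delimit=" END"
    s="${s//$'\n'}"                       # drop embedded newlines
    ss=$s
    array=()
    while [[ $ss == *"$delimit"* ]]; do
        array+=( "${ss%%"$delimit"*}" )   # text before the first END
        ss=${ss#*"$delimit"}              # drop that text and the END
        ss=${ss# }                        # drop the space after END, if any
    done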
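
As a usage sketch, the same parameter-expansion idea can be wrapped in a function and the pieces printed in the format the question asks for. The function name `split_on_word` and the global array `parts` are hypothetical names chosen for this example. It also trims whitespace around each piece, which is an assumption about what is wanted.

```shell
#!/usr/bin/env bash
# split_on_word: hypothetical helper, not part of the original answer.
# Splits $1 on the word $2 and stores the trimmed pieces in the global array `parts`.
split_on_word() {
    local nl=$'\n'
    local text=${1//"$nl"/} word=$2 piece
    parts=()
    while [[ $text == *"$word"* ]]; do
        piece=${text%%"$word"*}                     # everything before the first delimiter
        piece=${piece%"${piece##*[![:space:]]}"}    # drop trailing whitespace
        piece=${piece#"${piece%%[![:space:]]*}"}    # drop leading whitespace
        if [[ -n $piece ]]; then
            parts+=( "$piece" )
        fi
        text=${text#*"$word"}                       # continue after the delimiter
    done
}

# The question's input, including its embedded line break.
s='ijj%%% klfklk @ ; kjkjkdjjd 333 ; jkjdkj@kjk.com ; END ijj%%% klfklk @ ; kjkjkdjjd 333 ; '
s+=$'\n''jkjdkj@kjk.com ; END kjkjlsj ; popo ; END'

split_on_word "$s" "END"

for i in "${!parts[@]}"; do
    printf 'split %d = %s\n' "$((i + 1))" "${parts[i]}"
done
```

Any text after the last END is ignored in this sketch, which matches the sample input, where every record is terminated by END.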