
Check if each element of an array is present in a string in bash, ignoring certain characters and order


On the web I found answers for checking whether one element of an array is present in a string, but I want to check whether every element of the array is present in the string.

E.g. str1="This_is_a_big_sentence"

Initially str2 was like

str2="Sentence_This_big"

Now I want to check whether str1 contains "sentence", "this" and "big" (all three, ignoring order and case).

So I used arr=(${str2//_/ }). How do I proceed from here? I know the comm command finds intersections, but it needs sorted input, and I also need to ignore the _ underscores.

I get my str2 by finding the extension of a particular type of file using the command

    for i in snooze.*; do    # glob directly instead of parsing the output of ls
        echo "$i" | cut -d "." -f2
    done
    # Up to here I get str2 and need to check it as described above. I am not sure how to do this; I tried putting str2 into an array and now just need to check whether all elements of the array occur in str1 (ignoring case and order).

Any help would be highly appreciated. I did try to use This link


Solution

  • Now I wanted to search if string str1 contains "sentence"&"this"&"big" (All 3, ignore alphabetical order and case)

    Here is one approach:

    #!/bin/bash
    str1="This_is_a_big_sentence"
    str2="Sentence_This_big"
    if ! grep -qvwFf <(sed 's/_/\n/g' <<<${str1,,}) <(sed 's/_/\n/g' <<<${str2,,})
    then
        echo "All words present"
    else
        echo "Some words missing"
    fi
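
For use inside the loop over the snooze.* files, the same check could be wrapped in a function. This is only a sketch: the function name all_words_present and the second test string are made up for illustration, and tr stands in for the sed call above because it behaves the same on every system.

```shell
#!/bin/bash
# Hypothetical helper (the name is not part of the original answer):
# succeeds if every "_"-separated word of $2 occurs in $1,
# ignoring case and order. Needs bash 4+ for ${var,,}.
all_words_present() {
    local haystack=${1,,} needles=${2,,}
    # grep -v selects the needle words that match no haystack word.
    # If it selects nothing (exit status 1), every word was found,
    # so the status is negated with "!".
    ! grep -qvwFf <(tr '_' '\n' <<<"$haystack") <(tr '_' '\n' <<<"$needles")
}

str1="This_is_a_big_sentence"
for candidate in "Sentence_This_big" "Sentence_This_small"; do
    if all_words_present "$str1" "$candidate"; then
        echo "$candidate: all words present"
    else
        echo "$candidate: some words missing"
    fi
done
```

The first candidate reports all words present; the second reports a missing word, because "small" does not occur in str1.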
    

    How it works

    • ${str1,,} expands to the value of str1 with all upper-case letters converted to lower case. This expansion requires bash 4 or later.

    • sed 's/_/\n/g' <<<${str1,,} outputs the string str1, all converted to lower case and with underscores replaced by newlines so that each word is on its own line. (GNU sed understands \n in the replacement; not every sed does.)

    • <(sed 's/_/\n/g' <<<${str1,,}) returns a file-like object containing all the words in str1, each word lower case and on a separate line.

      The creation of file-like objects is called process substitution. It allows us, in this case, to treat the output of a shell command as if it were a file to read.

    • <(sed 's/_/\n/g' <<<${str2,,}) does the same for str2.

    • Assuming that file1 and file2 each have one word per line, grep -vwFf file1 file2 prints only those lines of file2 that match no word in file1. In effect, it removes from file2 every word that also occurs in file1. If there are no words left, that means that every word in file2 appears in file1.

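      By adding the option -q, grep will produce no output but will still set an exit code that we can use in our if statement: 0 if at least one word is left over, 1 if none is. That is why the if statement negates the result with !.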

      In the actual command, file1 and file2 are replaced by our file-like objects.

      The remaining grep options can be understood as follows:

      • -w tells grep to look for whole words only.

      • -F tells grep to look for fixed strings, not regular expressions.

      • -f tells grep to look for the patterns to match in the file (or file-like object) which follows.

      • -v inverts the match: grep outputs the lines which do not match any pattern (the default is to output the lines which do match).
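
To see the steps described above in isolation, the following sketch runs them one at a time. The value of str2 is a made-up input with one word that does not occur in str1, and tr is used in place of sed 's/_/\n/g' so the example does not depend on GNU sed.

```shell
#!/bin/bash
str1="This_is_a_big_sentence"
str2="Sentence_This_small"   # hypothetical input; "small" is not in str1

# Step 1: convert to lower case (bash 4+).
echo "${str2,,}"

# Step 2: split on underscores so each word is on its own line.
tr '_' '\n' <<<"${str2,,}"

# Step 3: without -q, grep -v prints the words of str2 that match
# no whole word of str1, i.e. the missing words.
missing=$(grep -vwFf <(tr '_' '\n' <<<"${str1,,}") <(tr '_' '\n' <<<"${str2,,}"))
echo "missing: $missing"
```

Dropping -q like this is also a convenient way to report which words are missing rather than only whether any are.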