Tags: shell, unix, dash-shell

shell - Characters contained in both strings


I want to compare two string variables and print the characters that appear in both. I'm not sure how to do this. I was thinking of using comm or diff, but I don't know the right options to print only the matching characters. Also, their documentation says they take files as input, and these are strings. Can anyone help?

Input:

a=$(echo "abghrsy")
b=$(echo "cgmnorstuvz")

Output:

"grs"

Solution

  • Use Character Classes with GNU Grep

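    This isn't a widely applicable solution, but it fits your particular use case quite well. The idea is to use the first variable as a character class (a bracket expression) to match against the second string. It assumes the first string contains no characters that are special inside a bracket expression, such as ], ^, - or \. For example: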

    a='abghrsy'
    b='cgmnorstuvz'
    echo "$b" | grep --only-matching "[$a]" | xargs | tr --delete ' '
    

    This produces grs as you expect. Note that the use of xargs and tr is simply to remove the newlines and spaces from the output; you can certainly handle this some other way if you prefer.
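
If GNU long options are not available (the question is tagged for dash, and other grep or tr implementations may not accept --only-matching or --delete), the same idea can be sketched with short options. This is only a sketch: common_grep is a hypothetical helper name, grep -o is widely supported but not required by POSIX, and the first string is assumed to be non-empty and free of characters that are special inside a bracket expression.

```shell
# Hypothetical helper: print the characters of $2 that also occur in $1.
# Assumes $1 is non-empty and has no bracket-expression specials (], ^, -, \).
common_grep() {
    # grep -o prints each match on its own line; tr -d '\n' joins them again.
    printf '%s\n' "$2" | grep -o "[$1]" | tr -d '\n'
    # Finish the output with a single newline.
    echo
}

a='abghrsy'
b='cgmnorstuvz'
common_grep "$a" "$b"
```

The result follows the order of the second string and keeps its repeated characters, so this is a filter rather than a true set intersection.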

    Set Intersection

    What you're really looking for is a set intersection, though. While you can "wing it" in the shell, you'd be better off using a language like Ruby, Python, or Perl to do this.
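
As one way to "wing it" in the shell, the intersection can be sketched with standard text tools. comm expects file operands (or process substitution, which dash lacks), so this sketch uses sort and uniq -d in a pipeline instead. char_set and intersect are hypothetical helper names, and single-byte characters are assumed.

```shell
# Hypothetical helper: print the unique characters of $1, one per line, sorted.
char_set() {
    printf '%s\n' "$1" | fold -b -w 1 | sort -u
}

# Hypothetical helper: print the characters that occur in both $1 and $2.
# The body runs in a subshell so the locale change stays local.
intersect() (
    LC_ALL=C
    export LC_ALL
    # Each set lists a character at most once, so any line that appears
    # twice in the merged, sorted stream belongs to both sets.
    { char_set "$1"; char_set "$2"; } | sort | uniq -d | tr -d '\n'
    echo
)

a='abghrsy'
b='cgmnorstuvz'
intersect "$a" "$b"
```

Unlike a plain filter, this prints each common character once, in sorted order rather than in the order of either input.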

    A Ruby One-Liner

    If you need to integrate with an existing shell script, a simple Ruby one-liner that uses your shell variables could be called like this inside your current script:

    a='abghrsy'
    b='cgmnorstuvz'
    # Pass the strings as arguments so quotes in them cannot break the Ruby code.
    ruby -e 'puts((ARGV[0].chars & ARGV[1].chars).join)' -- "$a" "$b"
    

    A Ruby Script

    You could certainly make things more elegant by doing the whole thing in Ruby instead.

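    # String#chars splits a string into an array of single characters.
    string1_chars = 'abghrsy'.chars
    string2_chars = 'cgmnorstuvz'.chars
    # Array#& returns the unique elements common to both arrays.
    intersection  = string1_chars & string2_chars
    puts intersection.join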
    

    This certainly seems more readable and robust to me, but your mileage may vary. At least now you have some options to choose from.