clojure

Comparing two strings and returning the number of matched words


I'm fairly new to Clojure, and to programming in general. Is there a way I can compare two strings word by word and then return the number of words that match in both strings? Also, how can I count the number of words in a string?

Ex: comparing string1 "Hello Alan and Max" and string2 "Hello Alan and Bob" should return 3 (since "Hello", "Alan", and "and" are the words matched in both strings), and finding the number of words in string1 should result in 4.

Thank you


Solution

  • Let's break it down into some smaller problems:

    compare two strings word by word

    First we'll need a way to take a string and return its words. One way to do this is to assume any whitespace is separating words, so we can use a regular expression with clojure.string/split:

    (require 'clojure.string)

    ;; Split on runs of whitespace; punctuation stays attached to its word.
    (defn string->words [s]
      (clojure.string/split s #"\s+"))
    
    (string->words "Hello world, it's me, Clojure.")
    => ["Hello" "world," "it's" "me," "Clojure."]
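
    Note that the split above keeps punctuation and case, so "me," and "me" (or "Hello" and "hello") would count as different words. If that matters, one possible variation is sketched below. The name normalized-words and the regular expression are assumptions for illustration, not part of the original answer:

    ```clojure
    (require '[clojure.string :as str])

    ;; Hypothetical helper: lower-cases the input and keeps only runs of
    ;; word characters and apostrophes, dropping other punctuation.
    (defn normalized-words [s]
      (vec (re-seq #"[\w']+" (str/lower-case s))))

    (normalized-words "Hello world, it's me, Clojure.")
    ;; => ["hello" "world" "it's" "me" "clojure"]
    ```

    Whether to normalize like this depends on what you want "the same word" to mean.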
    

    return the number of matched words in both strings

    The easiest way I can think of is to build two sets, one for the words in each sentence, and then find the intersection of the two sets:

    (set (string->words "a b c a b c d e f"))
    => #{"d" "f" "e" "a" "b" "c"} ;; #{} represents a set
    

    And we can use the clojure.set/intersection function to find the intersection of two sets:

    (require 'clojure.set)

    ;; Returns the set of words that appear in both strings.
    (defn common-words [a b]
      (let [a (set (string->words a))
            b (set (string->words b))]
        (clojure.set/intersection a b)))
    
    (common-words "say you" "say me")
    => #{"say"}
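
    The set approach ignores word order and repeated words. If "word by word" is meant literally, i.e. the first word against the first word, the second against the second, and so on, a positional comparison might look like the sketch below. The name count-positional-matches is hypothetical:

    ```clojure
    (require '[clojure.string :as str])

    ;; Hypothetical helper: counts positions where both strings have the
    ;; same word. map stops at the end of the shorter word sequence.
    (defn count-positional-matches [a b]
      (let [wa (str/split a #"\s+")
            wb (str/split b #"\s+")]
        (count (filter true? (map = wa wb)))))

    (count-positional-matches "Hello Alan and Max" "Hello Alan and Bob") ;; => 3
    (count-positional-matches "Alan Hello" "Hello Alan")                 ;; => 0
    ```

    For the second pair the set intersection would report 2 common words, while the positional version reports 0, so pick whichever matches your intent.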
    

    To get the count of (matching) words, we can use the count function with the output of the above functions:

    (count (common-words "say you" "say me")) ;; => 1
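
    The second part of the question, counting the words in a single string, can be handled the same way by calling count on the split result. Here is a self-contained sketch that repeats the two functions from above and applies them to the strings from the question:

    ```clojure
    (require '[clojure.set :as set]
             '[clojure.string :as str])

    ;; Split on runs of whitespace.
    (defn string->words [s]
      (str/split s #"\s+"))

    ;; Set of words that appear in both strings.
    (defn common-words [a b]
      (set/intersection (set (string->words a))
                        (set (string->words b))))

    (count (common-words "Hello Alan and Max" "Hello Alan and Bob")) ;; => 3
    (count (string->words "Hello Alan and Max"))                     ;; => 4
    ```

    Keep in mind that counting the vector counts repeated words each time they occur, while counting a set counts each distinct word once.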