
Filter matching strings between two text files in Clojure


Each text file contains a list of paths, and the two files use different prefixes.

Let's say before.txt looks like this:

before/pictures/img1.jpeg
before/pictures/img2.jpeg
before/pictures/img3.jpeg

and after.txt looks like this:

after/pictures/img1.jpeg
after/pictures/img3.jpeg

The function deleted-files should strip the differing prefixes (before, after), compare the two files, and print the paths that are listed in before.txt but missing from after.txt.

Code so far:

(ns dirdiff.core
  (:gen-class))

(defn deleted-files [prefix-file1 prefix-file2 file1 file2]
  (let [before (slurp "resources/davor.txt")
        after  (slurp "resources/danach.txt")]
    ;; unfinished: strip the prefixes and compare the two lists here
    ))

Expected output (the file that was deleted):

/pictures/img2.jpeg

How can I filter the lists in Clojure to show only the missing entries?


Solution

  • You probably want to compute a set difference between the two sets of filenames after the prefixes have been removed:

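    ;; clojure.set may not be loaded yet, so require both namespaces first.
    (require '[clojure.set]
             '[clojure.string])

    ;; Transducer: keep only lines that start with `prefix`, then strip it.
    (defn deprefixing [prefix]
      (comp (filter #(clojure.string/starts-with? % prefix))
            (map #(subs % (count prefix)))))

    ;; Read `filename` and collect its lines, transformed by `xf`, into a set.
    (defn load-string-set [xf filename]
      (->> filename
           slurp
           clojure.string/split-lines
           (into #{} xf)))

    ;; Paths listed in file1 but not in file2, compared after prefix removal.
    (defn deleted-files [prefix-file1 prefix-file2 file1 file2]
      (clojure.set/difference (load-string-set (deprefixing prefix-file1) file1)
                              (load-string-set (deprefixing prefix-file2) file2)))

    (deleted-files "before" "after"
                   "/tmp/before.txt" "/tmp/after.txt")
    ;; => #{"/pictures/img2.jpeg"}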
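    The question asks for the missing paths to be printed, while a set difference returns an unordered set. Below is a minimal sketch of the same filter-strip-difference idea, run on the sample data from the question held in memory instead of in files, that prints one path per line. The helper name strip-prefix-set and the vars before-lines, after-lines and missing are made up for this illustration and are not part of any library.

```clojure
(require '[clojure.set :as set]
         '[clojure.string :as str])

;; Hypothetical helper: strip `prefix` from every line that starts with it
;; and collect the remainders into a set.
(defn strip-prefix-set [prefix lines]
  (into #{}
        (comp (filter #(str/starts-with? % prefix))
              (map #(subs % (count prefix))))
        lines))

;; Sample data copied from the question, kept in memory instead of files.
(def before-lines ["before/pictures/img1.jpeg"
                   "before/pictures/img2.jpeg"
                   "before/pictures/img3.jpeg"])

(def after-lines ["after/pictures/img1.jpeg"
                  "after/pictures/img3.jpeg"])

;; Paths present in the "before" list but absent from the "after" list.
(def missing
  (set/difference (strip-prefix-set "before" before-lines)
                  (strip-prefix-set "after" after-lines)))

;; Print one path per line, sorted so the order is stable.
(run! println (sort missing))
;; prints /pictures/img2.jpeg
```

    Sorting before printing is optional; it only makes the output order predictable, since sets carry no ordering guarantee.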