Matching strings in two text files and filtering in Clojure


My question is almost answered by the question linked below (see "Similar Question"), but I have one additional requirement.

Similar Question

The additional requirement is this: if an entire folder was deleted, only that folder should be listed in the output, and its contents should not be listed as well. For example:

  • before/Fotos/Nope-2018/globe

  • before/Fotos/Nope-2018/globe/dc-40.jpg

  • before/Fotos/Nope-2018/globe/dc-41.jpg

  • before/Fotos/Nope-2018/globe/dc-42.jpg

  • before/Fotos/Nope-2018/globe/dc-43.jpg

If everything under /globe (including /globe itself) is deleted, the output should contain only "globe" rather than every item. How can I achieve this in addition to what the linked question already covers?

Any help is appreciated.


Solution

  • I'm making a few assumptions:

    1. The output of your "Similar Question" includes the folder (aka directory) name if and only if all its files/sub-folders have been deleted.
    2. The file paths are delimited by a slash.

    You can take the output of the "Similar Question" and post-process it so that no element of the resulting collection lies inside a directory that is also in the collection, i.e. no element followed by a slash is a prefix of another element.

    The following shows one way of accomplishing that.

    user> (def deleted-files-raw
            ["Fotos/Nope-2018/globe"
             "Fotos/Nope-2018/glob"
             "Fotos/Nope-2018/globe/asia"
             "Fotos/Nope-2018/globe/asia/dc-40.jpg"
             "Fotos/Nope-2018/globe/dc-40.jpg"
             "Fotos/Nope-2018/world/dc-40.jpg"
             "Fotos/Nope-2018/globe/dc-41.jpg"])
    #'user/deleted-files-raw
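    user> (require 'clojure.string)
    nil
    user> (defn remove-files-deleted-dirs
            "`coll` is a collection of /-separated pathnames.
      Returns a vector without the pathnames that lie inside a
      directory which is itself in `coll`, e.g. a/b/c is dropped
      when a/b is present."
            [coll]
            (reduce (fn [acc x]
                      ;; prev is the last kept path; nil while acc is empty,
                      ;; so an absolute first path is not matched against "/".
                      (let [prev (peek acc)]
                        (if (and prev (clojure.string/starts-with? x (str prev "/")))
                          acc
                          (conj acc x))))
                    []
                    ;; Sort with / ranked below every other character so that a
                    ;; directory's contents directly follow it, even next to a
                    ;; sibling such as globe-old (- and . sort before /).
                    (sort-by #(clojure.string/replace % "/" "\u0000") coll)))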
    #'user/remove-files-deleted-dirs
    user> (remove-files-deleted-dirs deleted-files-raw)
    ["Fotos/Nope-2018/glob"
     "Fotos/Nope-2018/globe"
     "Fotos/Nope-2018/world/dc-40.jpg"]
    user>
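
    The reduce-based function above depends on the order in which the paths are sorted, because each path is only compared with the most recently kept one. As a minimal sketch of an order-independent alternative, assuming the same slash-delimited paths, each path can instead be checked against a set of all paths to see whether any of its ancestors is present. The names `ancestor-paths` and `remove-nested-paths` are hypothetical and not part of the original answer.

    ```clojure
    (require '[clojure.string :as str])

    (defn ancestor-paths
      "Hypothetical helper: returns the proper ancestors of a /-separated
      `path`, shortest first. For \"a/b/c\" this is (\"a\" \"a/b\")."
      [path]
      (let [parts (str/split path #"/")]
        (map #(str/join "/" (take % parts))
             (range 1 (count parts)))))

    (defn remove-nested-paths
      "Hypothetical variant: keeps only the paths that have no ancestor in
      `coll`. Returns a sorted vector."
      [coll]
      (let [paths (set coll)]
        (->> paths
             ;; A set used as a predicate returns the matching ancestor, if any.
             (remove #(some paths (ancestor-paths %)))
             sort
             vec)))

    (remove-nested-paths ["Fotos/Nope-2018/globe/dc-40.jpg"
                          "Fotos/Nope-2018/globe-old"
                          "Fotos/Nope-2018/globe"])
    ;; => ["Fotos/Nope-2018/globe" "Fotos/Nope-2018/globe-old"]
    ```

    Because this variant never compares neighbouring elements, a sibling such as globe-old cannot get between globe and the files inside it. The trade-off is that every path is split and rebuilt once per ancestor, which may matter for very deep trees.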