Clojure: How do you apply a function to values at a specific nesting level?


I'm a beginner at Clojure, so I'll do my best to phrase this as well as I can.

I have a function that returns a list of nested lists after parsing a dataset of daily temperatures. Each nested list holds the daily temps for a specific month (e.g. Feb 2014, Feb 2015, etc.) and is padded out to 31 items, using -999 as filler, to retain the dataset's structure.

raw dataset: https://www.metoffice.gov.uk/hadobs/hadcet/cetdl1772on.dat

(partition 31 (monthly-helper 2 (parse-into-list "CETdataDailyLong")))


=>((-15 7 15 -25 -5 -45 12 47 56 28 20 40 57 38 2 5 25 -3 0 7 7 -3 -10 -10 30 85 46 77 56 -999 -999)
 (0 17 -28 -23 -30 5 -18 -3 -33 -23 -18 -3 -10 50 82 72 62 42 15 57 75 40 92 52 42 62 72 70 -999 -999 -999)
 (-2 -12 4 28 12 0 44 27 -12 16 74 61 76 87 77 78 51 51 59 56 64 52 78 63 39 28 33 81 -999 -999 -999)
 (97 58 75 103 33 46 88 101 56 47 66 36 52 47 58 42 42 37 63 77 76 43 55 85 58 57 55 66 -999 -999 -999)
 (-59 19 28 55 47 30 52 49 42 50 45 25 34 70 40 54 24 13 25 54 85 29 27 38 25 73 44 50 40 -999 -999))

I'm trying to remove the -999 values from all nested lists in the list. I need to do this after partitioning the data, to avoid having to partition it arbitrarily by the number of days in each month. The closest I've got is below, but it has no effect because it's only applied to the top-level list instead of the values in each nested list. How would I need to modify this to get the result I'm looking for? Or, to ask my original question: how do you apply a function to values at a specific nesting level?

(remove #(= -999 %)(partition 31 (monthly-helper 2 (parse-into-list "CETdataDailyLong"))))

Below is the minimal code with a chunk of the results from my partitioning function. I think it's very close, but if you can show me what I'm missing I would really appreciate it. Thanks.

(remove #(= -999 %)'(((-15 7 15 -25 -5 -45 12 47 56 28 20 40 57 38 2 5 25 -3 0 7 7 -3 -10 -10 30 85 46 77 56 -999 -999)
                     (0 17 -28 -23 -30 5 -18 -3 -33 -23 -18 -3 -10 50 82 72 62 42 15 57 75 40 92 52 42 62 72 70 -999 -999 -999)
                     (-2 -12 4 28 12 0 44 27 -12 16 74 61 76 87 77 78 51 51 59 56 64 52 78 63 39 28 33 81 -999 -999 -999)
                     (97 58 75 103 33 46 88 101 56 47 66 36 52 47 58 42 42 37 63 77 76 43 55 85 58 57 55 66 -999 -999 -999)
                     (-59 19 28 55 47 30 52 49 42 50 45 25 34 70 40 54 24 13 25 54 85 29 27 38 25 73 44 50 40 -999 -999))))

I've tried the below, and loads of variations on it with map etc., but haven't got anywhere. Seeing a correct example would really help me understand where I'm going wrong.

(apply #(remove -999 %) (partition 31 (monthly-helper 2 (parse-into-list "CETdataDailyLong"))))
Exception: Wrong number of args (21) passed 

Solution

  • So iiuc, the:

    • Overall list contains year lists, and the
    • Year lists contain month lists, and the
    • Month lists contain the temperatures for the days, and
    • The month lists are each padded w/ -999's to make them uniform in size: 31 entries long

    What I see that you've tried:

    • You've used the remove function w/ a predicate to remove if the value equals -999. The value in this case is '((-15 7 15 -25 -5 -45 12 ...)) which does not equal -999, so you end up w/ what you started with.
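    • apply calls the function with the elements of the sequence as separate arguments. Your sequence holds 21 lists, so the one-argument #(remove -999 %) was called with 21 args. (Separately, -999 isn't a predicate; remove needs a function as its first argument.)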
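
    Both points can be checked at the REPL. This is a small sketch on a made-up two-month sample; the name sample and its values are illustrative only, not taken from the dataset:

```clojure
;; Hypothetical sample: two "months", each padded with -999.
(def sample '((1 2 -999) (3 -999 -999)))

;; remove tests each top-level element, i.e. each whole list.
;; No list equals -999, so nothing is removed.
(def top-level (remove #(= -999 %) sample))

;; apply spreads the sequence into separate arguments:
;; (apply f '(a b)) is the same call as (f a b).
(def spread (apply vector sample))

;; A one-argument function literal therefore fails as soon as
;; the sequence holds more than one element.
(def arity-error?
  (try
    (apply #(remove #{-999} %) sample)
    false
    (catch clojure.lang.ArityException _ true)))

(println top-level)    ; prints ((1 2 -999) (3 -999 -999))
(println spread)       ; prints [(1 2 -999) (3 -999 -999)]
(println arity-error?) ; prints true
```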

    With all this probably understood, I think the easiest solution is a nested for. A for isn't a loop in the imperative sense: it's a list comprehension that returns a lazy sequence of your values, optionally modified by a function. Each value is a list, so you need to go deeper w/ another for.

    ; Remove -999's, three levels deep, with for.
    
    (defn remove-999s [s-of-s]
       ; s-of-s is all the data: a list of years
       (for [year s-of-s]
          ; Each year is a list of months
          (for [month year]
             ; Each month is a list of temps
             ; (filter #(not (= % -999)) month) would also work
             (remove #(= % -999) month))))
    
    (remove-999s '(((-15 7 15 -25 -5 -45 12 47 56 28 20 40 57 38 2 5 25 -3 0 7 7 -3 -10 -10 30 85 46 77 56 -999 -999) (0 17 -28 -23 -30 5 -18 -3 -33 -23 -18 -3 -10 50 82 72 62 42 15 57 75 40 92 52 42 62 72 70 -999 -999 -999) (-2 -12 4 28 12 0 44 27 -12 16 74 61 76 87 77 78 51 51 59 56 64 52 78 63 39 28 33 81 -999 -999 -999) (97 58 75 103 33 46 88 101 56 47 66 36 52 47 58 42 42 37 63 77 76 43 55 85 58 57 55 66 -999 -999 -999)(-59 19 28 55 47 30 52 49 42 50 45 25 34 70 40 54 24 13 25 54 85 29 27 38 25 73 44 50 40 -999 -999))))
    

    Here's the result, without the -999's.

    ; (((-15 7 15 -25 -5 -45 12 47 56 28 20 40 57 38 2 5 25 -3 0 7 7 -3 -10 -10 30 85 46 77 56) 
    ; (0 17 -28 -23 -30 5 -18 -3 -33 -23 -18 -3 -10 50 82 72 62 42 15 57 75 40 92 52 42 62 72 70) 
    ; (-2 -12 4 28 12 0 44 27 -12 16 74 61 76 87 77 78 51 51 59 56 64 52 78 63 39 28 33 81) 
    ; (97 58 75 103 33 46 88 101 56 47 66 36 52 47 58 42 42 37 63 77 76 43 55 85 58 57 55 66) 
    ; (-59 19 28 55 47 30 52 49 42 50 45 25 34 70 40 54 24 13 25 54 85 29 27 38 25 73 44 50 40)))
    

    Because Clojure doesn't allow nested #() literals, and nesting fn's gets gross, if you want to use map like Biped suggests, you'll probably want to use it with letfn or defn. Here's how I did it:

    ; Remove -999's, three levels deep, with maps.
    
    (defn remove-999s [s-of-s]   
       (letfn [(is-999 [v] (= v -999))
               (map-month [s] (remove is-999 s))
               (map-year [s] (map map-month s))]
         (map map-year s-of-s))) ; Gives the same results.
    

    After writing this, I realized that for is like a weird map, so either can be used.
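
    If the data is only two levels deep (a list of month lists, which is what the partition output in the question looks like), a single map should be enough. This is a minimal sketch, and strip-filler is a hypothetical name. The set #{-999} stands in for the predicate, since a set called as a function returns the element when it is a member:

```clojure
(defn strip-filler
  "Removes every -999 from each inner list of months."
  [months]
  (map (fn [month] (remove #{-999} month)) months))

(strip-filler '((-15 7 -999 -999) (0 17 -28 -999)))
;; => ((-15 7) (0 17 -28))
```

    Using a named fn for the inner step avoids the nested #() restriction without needing letfn.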

    Another alternative is loop and recur, or otherwise classic recursion.
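
    As a sketch of the classic-recursion route (plain recursion, not loop and recur), a helper can take the nesting level as an argument, which gets at the general question in the title. The names map-at-depth and drop-filler are made up for this example:

```clojure
(defn map-at-depth
  "Applies f to every sequence found `depth` levels below coll.
  Depth 0 applies f to coll itself."
  [depth f coll]
  (if (zero? depth)
    (f coll)
    (map #(map-at-depth (dec depth) f %) coll)))

;; The function to apply at the chosen level.
(def drop-filler (partial remove #{-999}))

;; Months sit one level down in a list of months...
(map-at-depth 1 drop-filler '((1 -999) (2 3 -999)))
;; => ((1) (2 3))

;; ...and two levels down in a list of years.
(map-at-depth 2 drop-filler '(((1 -999) (2 3 -999))))
;; => (((1) (2 3)))
```

    The recursion depth here equals the nesting depth, so stack usage is not a concern for data shaped like this.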