Groovy: elegant solution for averaging the columns of an array


str = """
0.6266197101540969 , 0.21279720289932263 , 0.7888811159800816 , 0.3374125902260934 , 0.8299999833106995 , 0.4300000071525574 , 0.6000000238418579 , 0.1599999964237213
0.6286013903734567 , 0.21750000088165203 , 0.780979019361776 , 0.33202797309918836 , 0.8299999833106995 , 0.4300000071525574 , 0.6000000238418579 , 0.1599999964237213
0.6211805507126782 , 0.20285714375121253 , 0.7563043448372163 , 0.32666666932562566 , 0.8299999833106995 , 0.4300000071525574 , 0.6000000238418579 , 0.1599999964237213
0.5912230165956689 , 0.20713235388564713 , 0.7197058783734546 , 0.31335821047202866 , 0.8299999833106995 , 0.4300000071525574 , 0.6000000238418579 , 0.1599999964237213
0.6073239363834891 , 0.21000000010145473 , 0.7486428560955184 , 0.3176595757827691 , 0.8070422478125129 , 0.41928571494562283 , 0.5272857129573822 , 0.18029411882162094
0.6049999973513711 , 0.204892086360952 , 0.7618382347418982 , 0.3195035475577023 , 0.8021985744753628 , 0.44760563193072733 , 0.5268794330933415 , 0.18000000185320075
0.6292380872226897 , 0.1993396225965248 , 0.7613461521955637 , 0.3325471729040146 , 0.8194392418192926 , 0.4333644839369248 , 0.5276415085174003 , 0.180841121927043
"""

I need to count the number of items in a row, create an array of that size, and then fill the array with the sum of each column.

End result:

// not saying these are the actual column averages, just showing the shape...
sumArray = [0.62,0.41,0.61, ....]

averageTests = returnAvg(str)
println(averageTests)


def returnAvg(tmpData){
    // strip spaces, plus the leading/trailing newlines of the triple-quoted string
    def lines = tmpData.replaceAll("( )+", "").trim().split("\n")

    def avgs = [] // holds the per-column sums first, then the averages

    def numOfDaysTest = lines.size()

    for (line in lines){
        def index = 0
        for (value in line.split(",")){
            if (avgs[index] == null){
                avgs[index] = 0
            }
            avgs[index] += value as Double
            index++
        }
    }

    // divide each sum by the row count and truncate to two decimal places
    for (int i = 0; i < avgs.size(); i++){
        def tmp = (avgs[i]/numOfDaysTest).toString()
        avgs[i] = tmp.substring(0, tmp.indexOf('.') + 3)
    }

    return avgs
}


The end result is OK, but I'm sure there's a much more elegant way?


Solution

  • This one-liner can surely be simplified further, but I haven't managed to come up with a version that's readable. I'm sure you can improve it.

    def means = str.trim().split('\n')*.split(',').collect{it*.trim().collect{e -> new BigDecimal(e)}.withIndex()}.collect{e -> e.collect{ee->[(ee[1]): [ee[0]]].entrySet()}}.flatten().groupBy{it.key}.collectEntries{key, val -> [(key):val*.value.flatten().average()]}
    

    With your test input, it produces

    [0:0.6155980983990644, 1:0.20778834435382369, 2:0.7596710859407870, 
     3:0.32559653419534601, 4:0.8212399996214238, 5:0.43146512277478637, 
     6:0.5688295357050794, 7:0.16873360404239284]
    

    The keys are the column indices and the values are the corresponding means.
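
A possibly more readable variant, offered as a sketch rather than a definitive answer: parse each line into a list of numbers, then use `transpose()` to turn rows into columns and average each column. The `sample` string, and the names `rows`, `means` and `truncated`, are made up for illustration; `sample` stands in for the question's `str` and assumes every row has the same number of values. The truncation step assumes you want the same two-decimal cut-off as the `substring` trick in the question.

```groovy
import java.math.RoundingMode

// Hypothetical sample in the same "a , b , c" layout as the question's data.
def sample = """
1.0 , 2.0 , 3.0
3.0 , 4.0 , 6.0
"""

// Parse every non-blank line into a list of BigDecimal values.
def rows = sample.readLines()
        .findAll { it.trim() }
        .collect { line -> line.split(',').collect { new BigDecimal(it.trim()) } }

// transpose() turns the list of rows into a list of columns.
def means = rows.transpose().collect { col -> col.sum() / col.size() }

// Optional: truncate to two decimals, like the substring trick in the question.
def truncated = means.collect { it.setScale(2, RoundingMode.DOWN) }

println means
println truncated
```

This yields a plain list ordered by column index instead of a map, which is usually enough when the column positions are already known.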