
Earth mover's distance in torch/lua (or how to use a criterion to just obtain a comparison)


I'm trying to calculate the distance between two histograms in torch7, and I was thinking of using the earth mover's distance for this. I know it's not hard to do in Python with something like https://github.com/garydoranjr/pyemd, but my data is in torch and I need to run this computation many times, so moving all of the data between torch7 and Python is not an option.

So my question is: what is the fastest earth mover's distance calculator in torch7? I have searched but could not find a library for it, and I was hoping there is a better way to implement this than a line-by-line translation of Python code, especially since torch is often better at handling things on the GPU.

Edit: I have found this, but am not sure how to use it.

I currently have the following code:

    function ColourCompareHistEMD (images)
        sumdistance=0
        k={}
        -- build one 20-bin histogram per image over the range [-100, 100]
        for i=1,#images do
            k[i]=torch.histc(images[i],20,-100,100)
        end

        for i=1,#images do
           for j=1,#images do
                -- what to do here?
           end
        end
    end


My current best guess is something like this:

    function ColourCompareHistEMD (images)
        sumdistance=0
        r={}
        for i=1,#images do
            print(images[i])

            r[i]=torch.histc(images[i][1]:view(images[i][1]:nElement()),20,-100,100)
        end

        for i=1,#images do
            for j=1,#images do
                criterion = nn.EMDCriterion()
                criterion:forward(r[i],r[j])
                sumdistance=sumdistance+criterion.loss

            end
        end

        return sumdistance
    end

But that doesn't seem to work: criterion.loss isn't working, and it gives me this error:

    /home/thijser/torch/install/bin/luajit: bad argument #2 to '?' (out of range at /home/thijser/torch/pkg/torch/generic/Tensor.c:704)
    stack traceback:
        [C]: at 0x7f2048fdc530
        [C]: in function '__newindex'
        /home/thijser/torch/install/share/lua/5.1/EMDCriterion.lua:52: in function 'preprocess'
        /home/thijser/torch/install/share/lua/5.1/EMDCriterion.lua:255: in function 'forward'
        imageSelector.lua:343: in function 'evalHueImages'
        imageSelector.lua:66: in function 'evaluate'
        imageSelector.lua:81: in function 'SelectTop'
        imageSelector.lua:151: in function 'evolve'
        imageSelector.lua:158: in function <imageSelector.lua:156>
        [C]: in function 'dofile'
        ...jser/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
        [C]: at 0x5641c3f40470

I am not sure how to use the criterion so that the earth mover's distance between image i and image j is calculated at the commented line.


Solution

  • It appears that EMDCriterion expects the input and target to be at least 2-dimensional. It also expects the points in your comparison to be laid out horizontally. Since the result of torch.histc is 1-dimensional, you can reshape it into a 2-dimensional row tensor like so:

    for i=1,#images do 
        print(images[i])
        local hist = torch.histc(images[i][1]:view(images[i][1]:nElement()),20,-100,100)
        r[i] = hist:reshape(1,hist:nElement())
    end
    

    Additionally, I tried running the code

    criterion:forward(r[i],r[j])
    print(criterion.loss)
    

    and the result was nil. Instead, accumulate the value that forward returns:

    local loss = criterion:forward(r[i],r[j])
    sumdistance = sumdistance + loss
    

    Also, it will be a bit more efficient to create the criterion, criterion = nn.EMDCriterion(), once outside of the nested for-loop.
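
The accumulation advice above can be put together as a small helper. This is only a minimal sketch: `sumPairwiseDistances` is a hypothetical name, not part of torch or of EMDCriterion, and it assumes the criterion's forward call returns the loss as a number, which is what the answer relies on. The criterion is passed in as an argument, so it is created once by the caller rather than inside the loops.

```lua
-- Hypothetical helper (not part of torch or EMDCriterion).
-- hists:     array of histograms, already reshaped to 1 x nBins row tensors
-- criterion: any object with a :forward(input, target) method that returns
--            the distance as a number (assumed here for nn.EMDCriterion)
local function sumPairwiseDistances(hists, criterion)
    local sumdistance = 0
    for i = 1, #hists do
        for j = 1, #hists do
            -- use the value returned by forward; criterion.loss was nil
            sumdistance = sumdistance + criterion:forward(hists[i], hists[j])
        end
    end
    return sumdistance
end
```

With the row-shaped histograms stored in r as shown earlier, the call would look like sumPairwiseDistances(r, nn.EMDCriterion()). Like the original loops, this visits every ordered pair, including i == j.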