Calculating similarity -> Python code to PHP code - What's wrong?


I am trying to convert the following Python code to PHP. Can you please explain what is wrong with my PHP code? I do not get the same results. If you need example data, please let me know.

# Returns a distance-based similarity score for person1 and person2

def sim_distance(prefs,person1,person2):
    # Get the list of shared_items 
    si={} 
    for item in prefs[person1]:
        if item in prefs[person2]: si[item]=1
    # if they have no ratings in common, return 0 
    if len(si)==0: return 0

    # Add up the squares of all the differences 

    sum_of_squares=sum([pow(prefs[person1][item]-prefs[person2][item],2)
        for item in prefs[person1] if item in prefs[person2]]) 

    return 1/(1+sum_of_squares)

My PHP code:

$sum  = 0.0;

foreach($arr[$person1] as $item => $val)
{
    if(array_key_exists($item, $arr[$person2]))
    {
        $p = sqrt(pow($arr[$person1][$item] - $arr[$person2][$item], 2));
        $sum = $sum + $p;
    }
}


$sum = 1 / (1 + $sum);

echo $sum;

Thanks for helping!


Solution

  • This is as close as I could make a direct translation (untested):

    function sim_distance($prefs, $person1, $person2) {
        // Collect the items both people have rated (array keys are the item names).
        $si = array();
        foreach ($prefs[$person1] as $item => $rating) {
            if (array_key_exists($item, $prefs[$person2])) $si[$item] = 1;
        }
        // If they have no ratings in common, return 0.
        if (count($si) == 0) return 0;
    
        // Add up the squares of all the differences.
        $squares = array();
        foreach ($prefs[$person1] as $item => $rating) {
            if (array_key_exists($item, $prefs[$person2])) {
                $squares[] = pow($rating - $prefs[$person2][$item], 2);
            }
        }
        $sum_of_squares = array_sum($squares);
        return 1 / (1 + $sum_of_squares);
    }
    

    I don't really know what you're trying to do, or whether I've interpreted the indentation correctly, but maybe this will help. I'm assuming your data structures have the same layout as in the Python script.

    Oh, and I'm interpreting the Python as this:

    def sim_distance(prefs,person1,person2):
        # Get the list of shared_items 
        si={} 
        for item in prefs[person1]:
            if item in prefs[person2]: si[item]=1
    
        # if they have no ratings in common, return 0 
        if len(si)==0: return 0
    
        # Add up the squares of all the differences
        sum_of_squares=sum([pow(prefs[person1][item]-prefs[person2][item],2) for item in prefs[person1] if item in prefs[person2]]) 
    
        return 1/(1+sum_of_squares)
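
For what it's worth, here is a rough Python sketch of why the PHP loop in the question probably gives different numbers. Two things stand out: `sqrt(pow(d, 2))` is just the absolute value of `d`, so the loop adds up absolute differences rather than squared ones, and there is no "no ratings in common" check, so two people with nothing shared score 1 instead of 0. The sample data and the names `alice`, `bob`, `carol`, and `sim_distance_php_port` are made up for illustration and are not from the question.

```python
import math

# Hypothetical sample ratings (assumed layout: person -> {item: rating}).
prefs = {
    "alice": {"film_a": 2.5, "film_b": 3.5, "film_c": 3.0},
    "bob": {"film_a": 3.0, "film_b": 1.5, "film_d": 4.0},
    "carol": {"film_z": 1.0},
}


def sim_distance(prefs, person1, person2):
    """Similarity from the sum of squared differences, as in the Python original."""
    shared = [item for item in prefs[person1] if item in prefs[person2]]
    # If they have no ratings in common, return 0.
    if not shared:
        return 0
    sum_of_squares = sum(
        (prefs[person1][item] - prefs[person2][item]) ** 2 for item in shared
    )
    return 1 / (1 + sum_of_squares)


def sim_distance_php_port(prefs, person1, person2):
    """Mirrors the question's PHP loop line by line, bugs included."""
    total = 0.0
    for item in prefs[person1]:
        if item in prefs[person2]:
            # sqrt of a square is the absolute difference, not the square.
            total += math.sqrt((prefs[person1][item] - prefs[person2][item]) ** 2)
    # No early return for "nothing in common", so this yields 1.0 in that case.
    return 1 / (1 + total)


print(sim_distance(prefs, "alice", "bob"))           # 1 / (1 + 0.25 + 4.0)
print(sim_distance_php_port(prefs, "alice", "bob"))  # 1 / (1 + 0.5 + 2.0)
print(sim_distance(prefs, "alice", "carol"))           # 0
print(sim_distance_php_port(prefs, "alice", "carol"))  # 1.0
```

If that reading is right, dropping the `sqrt()` call and adding the "no shared items" check to the PHP loop should bring the two versions into agreement.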