I am trying to find the cosine similarity between 2 vectors (x,y Points) and I am making some silly error that I cannot nail down. Pardon me, I am a newbie, and sorry if I am making a very simple error (which I very likely am).
Thanks for your help
public static double GetCosineSimilarity(List<Point> V1, List<Point> V2)
{
double sim = 0.0d;
int N = 0;
N = ((V2.Count < V1.Count)?V2.Count : V1.Count);
double dotX = 0.0d; double dotY = 0.0d;
double magX = 0.0d; double magY = 0.0d;
for (int n = 0; n < N; n++)
{
dotX += V1[n].X * V2[n].X;
dotY += V1[n].Y * V2[n].Y;
magX += Math.Pow(V1[n].X, 2);
magY += Math.Pow(V1[n].Y, 2);
}
return (dotX + dotY)/(Math.Sqrt(magX) * Math.Sqrt(magY));
}
Edit: Apart from the syntax, my question is also about the logic, given that I am dealing with vectors of differing lengths. Also, how can the above be generalized to vectors of m dimensions? Thanks
If you are in 2-dimensions, then you can have vectors represented as (V1.X, V1.Y) and (V2.X, V2.Y), then use
public static double GetCosineSimilarity(Point V1, Point V2) {
return (V1.X*V2.X + V1.Y*V2.Y)
/ ( Math.Sqrt( Math.Pow(V1.X,2)+Math.Pow(V1.Y,2))
* Math.Sqrt( Math.Pow(V2.X,2)+Math.Pow(V2.Y,2))
);
}
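For reference, the 2-dimensional function above is meant to compute the standard cosine similarity formula, written out here with the same V1 and V2 names. Note that the denominator multiplies the magnitude of each vector taken on its own, not a sum over the X components times a sum over the Y components:

```latex
\[
\cos\theta
  = \frac{V_1 \cdot V_2}{\lVert V_1 \rVert \, \lVert V_2 \rVert}
  = \frac{V_{1,x} V_{2,x} + V_{1,y} V_{2,y}}
         {\sqrt{V_{1,x}^2 + V_{1,y}^2}\;\sqrt{V_{2,x}^2 + V_{2,y}^2}}
\]
```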
If you are in higher dimensions then you can represent each vector as List<double>. So, in 4-dimensions the first vector would have components V1 = (V1[0], V1[1], V1[2], V1[3]).
public static double GetCosineSimilarity(List<double> V1, List<double> V2)
{
// Compare only the components both vectors have (the shorter length).
int N = Math.Min(V1.Count, V2.Count);
double dot = 0.0d;
double mag1 = 0.0d;
double mag2 = 0.0d;
for (int n = 0; n < N; n++)
{
dot += V1[n] * V2[n];
mag1 += Math.Pow(V1[n], 2);
mag2 += Math.Pow(V2[n], 2);
}
return dot / (Math.Sqrt(mag1) * Math.Sqrt(mag2));
}
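On the question of vectors with differing lengths: the function above only looks at the first N components of each vector. One possible alternative, sketched below, is to treat the shorter vector as if it were padded with zeros, so that every component of the longer vector still counts toward its magnitude. This is only a sketch under that assumption. The names VectorMath and CosineSimilarityPadded are made up for illustration, and whether zero-padding is appropriate depends on what your vectors actually represent. It also throws when either vector has zero magnitude, since the similarity is undefined in that case.

```csharp
using System;
using System.Collections.Generic;

public static class VectorMath
{
    // Hypothetical helper: cosine similarity where the shorter vector
    // is treated as zero-padded up to the length of the longer one.
    public static double CosineSimilarityPadded(IReadOnlyList<double> v1, IReadOnlyList<double> v2)
    {
        int n = Math.Max(v1.Count, v2.Count);
        double dot = 0.0, mag1 = 0.0, mag2 = 0.0;

        for (int i = 0; i < n; i++)
        {
            // Missing components are assumed to be 0.
            double a = i < v1.Count ? v1[i] : 0.0;
            double b = i < v2.Count ? v2[i] : 0.0;

            dot  += a * b;
            mag1 += a * a;
            mag2 += b * b;
        }

        // A zero vector has no direction, so the cosine is undefined.
        if (mag1 == 0.0 || mag2 == 0.0)
            throw new ArgumentException("Cosine similarity is undefined for a zero vector.");

        return dot / (Math.Sqrt(mag1) * Math.Sqrt(mag2));
    }
}
```

For example, calling VectorMath.CosineSimilarityPadded with new List<double> { 1, 0 } and new List<double> { 1, 0, 0 } compares (1, 0, 0) against (1, 0, 0), while passing (1, 0) and (0, 1) compares two perpendicular vectors.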