I have two arrays of floats and want to calculate their weighted correlation, meaning that some of my data points should count less than others.
X Y w
2.02382 6.00298 0.43873
3.94601 6.41983 0.36818
3.76877 4.55656 0.49836
3.68307 6.46925 0.95965
3.09073 4.57723 0.88889
2.56690 2.70020 0.72812
3.35469 6.76874 0.26863
3.88722 5.23205 0.77492
3.29389 3.50355 0.79567
3.80725 3.18414 0.82439
So I want the correlation between X and Y, taking the weights w into account. My problem is mainly about the theory, but in the end I want to implement it in C.
The main idea is that wherever an expectation E(...) appears, you replace the equal weight 1/n on each observation with w_i/sum(w).
Theory:
Corr(X,Y) = E( (X - E(X))*(Y - E(Y)) ) / ( SD(X)*SD(Y) )
So first calculate E(X) and E(Y).
E(X) = (2.02382 * .43873 + ... + 3.80725*.82439) / (.43873+...+.82439) = 3.368
E(Y) = [same weighted average idea] = 4.705
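A minimal C sketch of the weighted average used for E(X) and E(Y) above. The function name weighted_mean and the choice to return 0.0 when the weights sum to zero are my own assumptions, not part of the question.

```c
#include <stddef.h>

/* Weighted mean: sum(w[i] * x[i]) / sum(w[i]).
   Returns 0.0 if the weights sum to zero (including n == 0);
   that fallback is an arbitrary choice for this sketch. */
double weighted_mean(const double *x, const double *w, size_t n)
{
    double sum_wx = 0.0; /* running sum of w[i] * x[i] */
    double sum_w  = 0.0; /* running sum of the weights */

    for (size_t i = 0; i < n; i++) {
        sum_wx += w[i] * x[i];
        sum_w  += w[i];
    }
    return (sum_w != 0.0) ? sum_wx / sum_w : 0.0;
}
```

With equal weights this reduces to the ordinary mean, which is a quick sanity check.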
sd(X) = sqrt( var(X) ) = sqrt( E( (X-E(X))^2 ) ) = sqrt( ( (.43873)(2.02382-3.368)^2 + ... + (.82439)(3.80725-3.368)^2 ) / (.43873+...+.82439) ) = sqrt(0.3054023) = 0.5526321
sd(Y) = [same weighted average idea] = sqrt(1.860124) = 1.363863
corr(x,y) = ( (.43873)(2.02382-3.368)(6.00298-4.705)+...+(.82439)(3.80725-3.368)(3.18414-4.705) ) / ( (.43873+...+.82439)(.5526321)(1.363863) ) = 0.2085651