Define three row vectors, A = (1,2,3) B = (10,20,30,40) C = (100,200,300,400,500)
I want to construct a new matrix D which will be a have 3x4x5 = 60 elements and contains the averages of these elements as illustrated below:
D =
(1+10+100)/3, (1+10+200)/3,…, (1+10+ 500)/3 \
(1+20+100)/3, (1+20+200)/3,…, (1+20+ 500)/3 \
(1+30+100)/3, (1+30+200)/3,…, (1+30+ 500)/3 \
(1+40+100)/3, (2+40+200)/3,…, (2+40+ 500)/3 \
(2+10+100)/3, (2+10+200)/3,…, (2+10+ 500)/3 \
(2+20+100)/3, (2+20+200)/3,…, (2+20+ 500)/3 \
(2+30+100)/3, (2+30+200)/3,…, (2+30+ 500)/3 \
(2+40+100)/3, (2+40+200)/3,…, (2+40+ 500)/3 \
(3+10+100)/3, (3+10+200)/3,…, (3+10+ 500)/3 \
(3+20+100)/3, (3+20+200)/3,…, (3+20+ 500)/3 \
(3+30+100)/3, (3+30+200)/3,…, (3+30+ 500)/3 \
(3+40+100)/3, (3+40+200)/3,…, (3+40+ 500)/3 \
The way it is set up in this example it will be a 12x5 matrix, but I am fine if it is a 1X60 vector or 60X1 vector.
How to do this efficiently in Mata? I am new to Mata and I had this running in Stata using multiple forval
loops (in this case, there would be 3 forval
loops). But this becomes very time consuming as I have up to 8 row vectors and about 120 elements in each of them.
I figured that I can use for
loops in Mata and it will be much faster, but I believe if I can do this as a matrix manipulation instead of using for
loops then it will be even faster. The problem is I am having a hard time visualizing how to write such a program (or if it's even possible) and any help would be highly appreciated.
The clever solution by @AspenChen offers huge speed gains over for
loops, as shown with some testing:
clear all
set more off
mata
timer_clear()
//----- change data -----
fa = 250
fb = fa + 1
fc = fa + 2
//----- Method 1 -----
timer_on(1)
A = (1..fa) // 1 x fa
B = (1..fb)*10 // 1 x fb
C = (1..fc)*100 // 1 x fc
F = J(1, cols(A) * cols(B) * cols(C), .)
col = 0
for (i=1; i<=cols(A); i++) {
for (j=1; j<=cols(B); j++) {
for (k=1; k<=cols(C); k++) {
col++
F[1,col] = A[1,i] + B[1,j] + C[1,k]
}
}
}
timer_off(1)
//----- Method 2 (Aspen Chen) -----
timer_on(2)
A = (1::fa) // fa x 1
B = (1::fb)*10 // fb x 1
C = (1::fc)*100 // fc x 1
// tensor sum for A and B
a = J(1,rows(B),1) // 1 x fb with values of 1
b = J(1,rows(A),1) // 1 x fa with values of 1
T = (A#a) + (b#B)' // fa x fb
T = vec(T) // fa*fb x 1
// tensor sum for T and C
c = J(1,rows(T),1) // 1 x fa*fb with values of 1
t = J(1,rows(C),1) // 1 x fc with values of 1
T = (C#c) + (t#T)' // fc x fa*fb
timer_off(2)
timer()
end
Resulting in:
timer report
1. 8.78 / 1 = 8.776
2. .803 / 1 = .803
If the original poster still wants to use for
loops due the large number of elements that will be compared, (s)he can use something along the lines of:
<snip>
larger = 0
for (i=1; i<=cols(A); i++) {
for (j=1; j<=cols(B); j++) {
for (k=1; k<=cols(C); k++) {
larger = larger + (A[1,i] + B[1,j] + C[1,k] > 7)
}
}
}
larger
<snip>
Further tests with for
loops only:
clear all
set more off
mata
timer_clear()
//----- change data -----
fa = 500
fb = fa + 1
fc = fa + 2
//----- Method 1 -----
timer_on(1)
A = (1..fa) // 1 x fa
B = (1..fb)*10 // 1 x fb
C = (1..fc)*100 // 1 x fc
larger = 0
for (i=1; i<=cols(A); i++) {
for (j=1; j<=cols(B); j++) {
for (k=1; k<=cols(C); k++) {
larger = larger + (A[1,i] + B[1,j] + C[1,k] > 7)
}
}
}
larger
timer_off(1)
//----- Method 2 (ec27) -----
timer_on(2)
A = (1..fa) // 1 x fa
B = (1..fb)*10 // 1 x fb
C = (1..fc)*100 // 1 x fc
larger = 0
for (i=1; i<=cols(A); i++) {
for (j=1; j<=cols(B); j++) {
for (k=1; k<=cols(C); k++) {
placebo = A[1,i] + B[1,j] + C[1,k]
if (placebo > 7) larger = larger + 1
}
}
}
larger
timer_off(2)
timer()
end