In Stata, I have the following variables: latitude, longitude, avg_luminosity. For each observation (1547 total), I need to find a sum (let's call this variable sum_lum) of average luminosities of "neighbours" of this particular pair of latitude and longitude, those whose latitude and longitude lie within 0.5 radius. I have tried the following code:
tempvar sum_temp
forvalues i=1/1547 {
egen `sum_temp' = sum(avg_luminosity) if (latitude<latitude[_n]+0.5 & latitude>latitude[_n]-0.5 & longitude<longitude[_n]+0.5 & longitude>longitude[_n]+0.5)
replace sum_lum[_n]= sum_temp
drop `sum_temp'
}
But the code doesn't work (weights not allowed). Could anyone please help me on this issue?
This is not a very good question as it stands, as no sample data are given with which to run the code. See https://stackoverflow.com/help/mcve for how to ask a good question. All we are told is that 1547 is the number of observations.
Nevertheless, various problems with this code can be identified.
First, consider the if qualifier:
if (latitude<latitude[_n]+0.5 & latitude>latitude[_n]-0.5 & longitude<longitude[_n]+0.5 & longitude>longitude[_n]+0.5)
We need to correct a typo there: the last +0.5 should evidently be -0.5.
To focus on the main problem, replace latitude with y and longitude with x:
if (y < y[_n]+0.5 & y > y[_n]-0.5 & x < x[_n]+0.5 & x > x[_n]-0.5)
The subscript [_n] just means the current observation and is superfluous:
if (y < y+0.5 & y > y-0.5 & x < x+0.5 & x > x-0.5)
from which it can be seen that the qualification is no qualification: it is always true that (using mathematical notation now) y - 0.5 < y < y + 0.5 and similarly for x.
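The intent of this code is to compare the y and x of every observation with the y and x of one particular observation, but that is not what it does in Stata.
Otherwise put, the guess may be that [_n] has a different interpretation each time round a loop, but that is not the case: [_n] always refers to the observation currently being evaluated, whatever the loop index.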
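To see this in a small case, here is a minimal sketch on made-up data (the variable y and its five values are invented purely for illustration). It counts how many observations satisfy the condition, first with [_n] and then with an explicit subscript, which is what fixes the reference observation.

```stata
* Toy data: 5 observations; y is a hypothetical stand-in for latitude
clear
set obs 5
generate y = _n / 4

* y[_n] is just y in the current observation, so every observation qualifies
count if y < y[_n] + 0.5 & y > y[_n] - 0.5

* An explicit subscript compares every observation with observation 3 only
count if y < y[3] + 0.5 & y > y[3] - 0.5
```

The first count should equal the number of observations; the second counts only those observations strictly within 0.5 of the third value.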
Second, the effect of the loop over 1/1547 would, if the code were otherwise correct, be to repeat exactly the same calculation 1547 times. The intent of the code is no doubt otherwise, but nothing inside the loop uses the loop index i in any way.
Third, neither of these is the problem reported.
replace sum_lum[_n]= sum_temp
fails because of the subscript, which is not allowed with replace before the equals sign: Stata specifies weights in square brackets, so the error message about weights is Stata's guess that you are trying to specify weights. The statement would also fail (to do what you want, or very likely to work at all), because the right-hand side names sum_temp, whereas it should refer to the temporary variable you have just created, `sum_temp'.
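As a minimal sketch of the syntax that is allowed, the following toy example (three made-up observations and an arbitrary value of 10) changes a single observation with an in qualifier and refers to the temporary variable through its local macro name.

```stata
* Toy data: 3 observations, values invented for illustration
clear
set obs 3
generate sum_lum = .

tempvar sum_temp
generate `sum_temp' = 10

* replace sum_lum[2] = sum_temp    invalid: subscript before the equals sign
* Select the observation with -in- and dereference the tempvar with `...'
replace sum_lum = `sum_temp' in 2
```

Only observation 2 is changed; the other observations keep their missing values.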
Fourth, although this is style not syntax, using egen to calculate a sum is overkill. No new variable need be re-created 1547 times only to be dropped.
Here's a guess at what will work:
gen sum_lum = .
local y latitude
local x longitude
* for each observation i, sum avg_luminosity over all observations within 0.5 of it
quietly forval i = 1/1547 {
    summarize avg_luminosity if inrange(`y', `y'[`i'] - 0.5, `y'[`i'] + 0.5) & ///
        inrange(`x', `x'[`i'] - 0.5, `x'[`i'] + 0.5), meanonly
    replace sum_lum = r(sum) in `i'
}
Each time round the loop, that code uses the latitude and longitude of observation i as the reference point.
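Since no sample data were posted, here is a sketch of the same loop on four invented observations, so that the results can be checked by hand. It makes two assumptions not stated in the question: the loop bound is taken from _N rather than hard-coded, and a second, hypothetical variable sum_nbr excludes the observation's own luminosity, in case "neighbours" is not meant to include the point itself. Note also that inrange() includes its endpoints, whereas the original conditions used strict inequalities.

```stata
* Toy data: two well-separated pairs of nearby points (values invented)
clear
input latitude longitude avg_luminosity
0    0    1
0.25 0.25 2
2    2    4
2.25 2    8
end

generate sum_lum = .
generate sum_nbr = .
quietly forvalues i = 1/`=_N' {
    summarize avg_luminosity if inrange(latitude, latitude[`i'] - 0.5, latitude[`i'] + 0.5) & ///
        inrange(longitude, longitude[`i'] - 0.5, longitude[`i'] + 0.5), meanonly
    * r(sum) includes observation i itself
    replace sum_lum = r(sum) in `i'
    * hypothetical variant: neighbours only, excluding observation i
    replace sum_nbr = r(sum) - avg_luminosity[`i'] in `i'
}
list
```

Here the first two observations are neighbours only of each other, as are the last two, so sum_lum should be 3, 3, 12, 12. If avg_luminosity is missing in observation i, sum_nbr will be missing there too, which may or may not be what is wanted.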