time-series, interpolation, stata

Interpolating numeric values in Stata without creating new variables


I have a longitudinal data set with one observation per id (1, 2, 3, ...) per year, and thousands of variables of all types. In some rows (flagged by to_interpolate == 1) the numeric variables are missing and need to be linearly interpolated from the values of the same id in the previous and next years.

Since I can't name all the variables, I built a varlist of the numeric ones. I also do not want to create thousands of extra variables, so I need to replace the existing missing values.

What I did so far:

quietly ds, has(type numeric)
local varlist `r(varlist)'

sort id year
foreach var of local varlist {
   by id: ipolate `var' year replace(`var') if to_interpolate==1
}

No matter what I do, I get an error message:

factor variables and time-series operators not allowed
r(101);

My questions:

  1. Is replace() even valid syntax here? If not, how can I replace the existing values instead of creating new variables?
  2. If the error means that factor variables exist in my varlist, how can I detect them?
  3. If not, how do I get around this?

Thanks!


Solution

  • As @William Lisowski underlines, there is no replace() option to ipolate. Whatever its syntax diagram does not allow is forbidden. In any case, keeping a copy of the original is surely to be commended as part of an audit trail.

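    * by id: requires the data to be sorted by id
    sort id
    quietly ds, has(type numeric)

    * write each interpolated series to a new variable with suffix 2
    foreach var in `r(varlist)' {
       by id: ipolate `var' year, gen(`var'2)
    }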
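
    If overwriting the originals is still wanted, one possible approach is to interpolate into a temporary variable and copy the result back only into the flagged rows. The following is a minimal sketch, not part of the answer above: the toy data are made up, and it assumes id, year and to_interpolate are the only numeric variables that should be left alone. The if condition goes on replace rather than on ipolate, because an if qualifier on ipolate would restrict the observations it interpolates from, and the flagged rows are exactly the ones with nothing in them.

    ```stata
    * Toy data, made up for illustration only
    clear
    input id year x to_interpolate
    1 2000 10 0
    1 2001  . 1
    1 2002 30 0
    2 2000  5 0
    2 2001  . 0
    2 2002  9 0
    end

    * Numeric variables other than the identifiers and the flag
    * (assumption: these three names are the ones to exclude)
    quietly ds id year to_interpolate, not
    quietly ds `r(varlist)', has(type numeric)
    local numvars `r(varlist)'

    sort id year
    foreach var of local numvars {
        * interpolate into a temporary variable, so no permanent copy is kept
        tempvar filled
        quietly by id: ipolate `var' year, gen(`filled')
        * overwrite only flagged rows whose value is actually missing
        quietly replace `var' = `filled' if to_interpolate == 1 & missing(`var')
        drop `filled'
    }

    list, sepby(id)
    ```

    In this sketch the missing x for id 1 is filled in, while the unflagged missing value for id 2 is left as it is. Note that this gives up the audit trail recommended above, and that numeric variables which are really categorical codes would be interpolated too, so the list in numvars is worth checking before running the loop on real data.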