Search code examples
statatransformation

Transformation of Non-Continuous Height Variable


I have trouble efficiently transforming a wrongly coded height variable in my sample.

The variable is coded as follows:

     Height 
ID1    601   
ID1    601   
ID1    601   
ID3    409   
ID3    410   
ID4    511
.      .
.      .
.      .
ID100  400

As you can see, the variable is coded as feet and inch simultaneously where 601 equals 6 foot 1 inch, 511 equals 5 ft 11 inches etc.

My goal is to transform these numbers into inches:

replace Height = 48 if Height == 400 
replace Height = 49 if Height == 401
replace Height = 50 if Height == 402
replace Height = 51 if Height == 403
.
.
.
replace Height = 83 if Height == 611

How do I efficiently code this with a loop?


Solution

  • The code below assumes that each observation of Height contains exactly three digits:

    tostring Height, generate(Height_string)
        /* Generate a new variable which is a string-version of Height
           (so that we can get the individual digits) */
    
    generate feet = substr(Height_string, 1, 1)
        /* From the first character in the string, select one character */
    
    generate inch = substr(Height_string, 2, 2)
        /* From the second character in the string, select two characters.
           An equivalent alternative would have been
           generate inch = substr(Height_string, -2, 2)
           which from the second to last character selects two characters */
    
    destring feet inch, replace
        /* Convert these two new variables to numeric */
    
    generate tot_inch = feet * 12 + inch
        /* Generate a new variable which measure only in inches. */