Tags: string, integer, stata, tostring

tostring turns character values into integers


I am trying to convert a column in Stata (AP1), shown in the screenshot below, so that its entries are of string type. Currently the entries appear as characters ("cultivateur", for example) but the column is shown as being of type int. [screenshot not reproduced]

I used the following code to try and change them to strings.


label values AP1 .

tostring AP1, replace
AP1 was long now str5


Although the AP1 column becomes a string type, all the characters are replaced by integers, which is not what I need in order to subset the data based on observations. [screenshot not reproduced]

Does anyone know how I can switch this column to the string type without the characters becoming integers?


Solution

  • Your images are barely readable -- on a phone or laptop -- but perhaps readable by anyone using a very large monitor. Please see the Stata tag wiki for guidance on presenting reproducible data examples.

    What is going on can be explained reproducibly by considering

    . sysuse auto, clear
    (1978 automobile data)
    
    . tab foreign
    
     Car origin |      Freq.     Percent        Cum.
    ------------+-----------------------------------
       Domestic |         52       70.27       70.27
        Foreign |         22       29.73      100.00
    ------------+-----------------------------------
          Total |         74      100.00
    
    . tab foreign, nolabel
    
     Car origin |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |         52       70.27       70.27
              1 |         22       29.73      100.00
    ------------+-----------------------------------
          Total |         74      100.00
    
    . tostring foreign, gen(str_foreign)
    str_foreign generated as str1
    
    . tab str_foreign
    
     Car origin |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |         52       70.27       70.27
              1 |         22       29.73      100.00
    ------------+-----------------------------------
          Total |         74      100.00
    
    . d *foreign
    
    Variable      Storage   Display    Value
        name         type    format    label      Variable label
    ------------------------------------------------------------------------------------------------------------------
    foreign         byte    %8.0g      origin     Car origin
    str_foreign     str1    %9s                   Car origin
    

    foreign, like your problematic variable, is a numeric variable with value labels. (The term "column" is not standard in Stata for variables in the dataset.) The text you see is only the value label; what is stored is the integer code. Push the variable through tostring and you get a string variable containing the digits of those integer codes, not the label text. Stata did what you asked.

    To get a string variable containing the text of the value labels, you need to apply the decode command, which was written for precisely this purpose (and incidentally, long predates tostring as an official command).
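
    As a minimal sketch of that, continuing with the auto data rather than your dataset: decode reads the value label attached to foreign and writes the label text into a new string variable. The name origin_text is made up for illustration.

    . * origin_text is a hypothetical name for the new string variable
    . sysuse auto, clear
    . decode foreign, generate(origin_text)
    . * foreign stays numeric (byte); origin_text holds "Domestic" or "Foreign"
    . describe foreign origin_text
    . tab origin_text
    . * subsetting on the text now works; this keeps the 22 foreign cars
    . keep if origin_text == "Foreign"

    For your data the equivalent would presumably be decode AP1, generate(AP1_text), with AP1_text again a made-up name. Note that decode needs the value label to be attached still, so run it before label values AP1 . detaches the label, not after. If the only goal is to subset, you may not need a string variable at all: a condition such as foreign == "Foreign":origin looks up the numeric code through the value label.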