Search code examples
spss

How to load CSV files into SPSS Variable and Value Labels


Summary

Let me preface this by saying I'm new to SPSS so I apologize if my terminology is incorrect. I have two CSV files about the same survey (one with the 'Variable Labels' and one with the 'Value Labels'. I want to combine these without having to manually code through each syntax (if possible).

1 - CSV with Value Labels

respondent_id,     I_am_between,  I_am_happy
3470220950,     26-33 years old,  Sometimes
3470226804,     34-41 years old,  Very Often
3470226906,     34-41 years old,  Sometimes

2 - CSV with Values

respondent_id,    I_am_between,     I_am_happy
3470220950,                  2,     3
3470226804,                  3,     4
3470226906,                  3,     3

What I'm looking to do is match the question "I_am_between" variable label of '26-33 years old' to the value of '2'. Is this possible in SPSS (and if so, how)? Thanks.

Update to Jay's solution and comment: As mentioned in Jay's post, the first method might not load the answer in an order that you like if you want to keep rank/order. For example, a question 'I_have_been_with_the_company' might load the following: (1='<2 years', 2='>10 years', 3='3-5 years') when instead you would want (1='<2 years', 2='3-5 years', etc.) I fixed this by loading the second file (that shows values) and manually editing the labels.

VALUE LABELS
I_have_been_with_the_company
1 '<2 years'
2 '3-5 years'
3 '5-7 years'
4 '8- 10 years'
5 '>10 years'.

EXECUTE.

Solution

  • The easiest way to do this is to import the first file only and use automatic recode. This has the advantage of being straightforward but the disadvantage that the recoded values may not necessarily match up with the values in file 2.

    GET DATA  /TYPE=TXT
      /FILE="file1.csv"
      /ENCODING='UTF8'
      /DELCASE=LINE
      /DELIMITERS=","
      /ARRANGEMENT=DELIMITED
      /FIRSTCASE=2
      /IMPORTCASE=ALL
      /VARIABLES=
      respondent_id F10.0
      V2 A15
      V3 A10.
    CACHE.
    
    AUTORECODE VARIABLES=V2 V3 
      /INTO I_am_between I_am_happy.
    
    DELETE VARIABLES V2 V3.
    

    Alternatively, a second approach would be to import both files into separate data files, merge them using add variables, then use the STATS VALLBLS FROMDATA extension command (which you'll need to install) to apply the values of one variable as labels to another variable.

    GET DATA  /TYPE=TXT
      /FILE="file2.csv"
      /ENCODING='Locale'
      /DELCASE=LINE
      /DELIMITERS=","
      /ARRANGEMENT=DELIMITED
      /FIRSTCASE=2
      /IMPORTCASE=ALL
      /VARIABLES=
      respondent_id F10.0
      I_am_between F2
      I_am_happy F2.
    CACHE.
    
    DATASET NAME DataSet1 WINDOW=FRONT.
    
    GET DATA  /TYPE=TXT
      /FILE="file1.csv"
      /ENCODING='UTF8'
      /DELCASE=LINE
      /DELIMITERS=","
      /ARRANGEMENT=DELIMITED
      /FIRSTCASE=2
      /IMPORTCASE=ALL
      /VARIABLES=
      respondent_id F10.0
      V2 A15
      V3 A10.
    CACHE.
    
    DATASET NAME DataSet2 WINDOW=FRONT.
    
    STAR JOIN 
      /SELECT t0.V2, t0.V3, t1.I_am_between, t1.I_am_happy 
      /FROM * AS t0 
      /JOIN 'DataSet1' AS t1 
        ON t0.respondent_id=t1.respondent_id 
      /OUTFILE FILE=*.
    
    STATS VALLBLS FROMDATA VARIABLES=I_am_between I_am_happy LBLVARS=V2 V3 
    /OPTIONS VARSPERPASS=20  
    /OUTPUT EXECUTE=YES.
    
    DELETE VARIABLES V2 V3.