Search code examples
pivot-tablespss

Pivot variables in SPSS with `CASESTOVARS` and specify the resulting column names


We regularly receive a SPSS dataset with patients, eventnames and the dates when the questionnaires were filled-in. Each record is a patient ID and an event name and a date column for each questionnaire.

We want to pivot the data to a single record per patient, with columns for each questionnaire/eventname combination with the questionnaire dates. The resulting dataset will be used to more easily see which patients need reminders or where there are missing questionnaires, so administrative staff can send reminders, close cases etc. See example datasets below (I've copied it to Excel to add colors)

The original dataset

And the desired resulting dataset:

The desired resulting dataset

Here is syntax I've got so far, with this small example dataset (the real dataset is larger and has more columns).

data list free /PatientId (A7) Eventname (A10) RandGroup (A10) Question1 (DATE10) Question2 (DATE10).
begin data
    PAT1001 Baseline group_II 05-03-2022 ""
    PAT1001 month3 "" "" 10-06-2022
    PAT1001 month12 "" "" 12-03-2023
    PAT2002 Baseline group_IV 22-10-2022 ""
    PAT2002 month3 "" 21-02-2023 22-02-2023
    PAT3003 Baseline group_I 29-11-2021 07-12-2021
    PAT3003 month3 "" "" 27-02-2022
    PAT3003 month12 "" 01-12-2022 03-12-2022
end data.

SORT CASES BY PatientId, Eventname.

CASESTOVARS
    /ID = PatientId
    /INDEX = EventName
    /GROUPBY = VARIABLE.

This syntax works, but another requirement is that we want to specify the new column names. Instead of Question1.Baseline, Question2.month12 etc. we want the resulting column names to be Q1_BASE, Q2_M12 etc. I know I can rename the columns manually like this:

rename variables (Question1.Baseline = Q1_BASE).
rename variables (Question1.month3 = Q1_M3).
rename variables (Question1.month12 = Q1_M12).

However in the real dataset there are a lot more questionnaire columns, and also in later datasets the questionnaire columns could have different names. So I'm looking for maybe a more systemic/futureproof way to do this

So, is there a parameter in CASESTOVARS to also get the desired column names?


Solution

  • A much easier way to get the desired short names is to rename the variables BEFORE the casestovars. Also you can use the separator sub command to change the resulting names:

    recode EventName ("Baseline" = "BASE")("month3" = "M3")("month12" = "M12").
    rename variables (question1 to question70 = Q1 to Q70).
    CASESTOVARS
        /ID = PatientId
        /INDEX = EventName
        /GROUPBY = VARIABLE
        /separator="_".