Search code examples
spss

Transposing (CaseToVars) in SPSS and renaming Variables with Existing Variables


The dataset contains 6 variables such as:

StudentID  Sequence  Metric  Score  Interpretation  ScoreName
a123       8.00      13      540    12              P1_AIE
a123       9.00      14      550    19              P2_AIE
a123       9.00      15      500    13              P1_BP
a124       8.00      14      450    11              P2_AIE

I am ultimately trying to put all cases in one row such as:

StudentID P1_AIE_Seq  P1_AIE_Scr P1_AIE_Interp P2_AIE_Seq P2_AIE_Scr P2_AIE_Interp P1_BP_Seq etc.
a123       8.00       540        12            9.00       550        19            9.00      etc.
a124                                           8.00       450        11 

These are students (StudentID) who repeatedly complete a battery of tests (Sequence); their results (Score) are tracked by a numeric value (Metric) and a string (ScoreName), and a numeric code is attached for the interpretation for each score (Interpretation).

Metric and ScoreName always correspond (i.e., 14 = P2_AIE in all cases). There are no duplicate records. Sequence isn't initially appropriate to use as an index because it repeats within the ID variable, but to get around that I use:

SORT CASES  BY StudentID Sequence.
compute dup = 0.
if $casenum>1 and lag(StudentID)= StudentID and lag(sequence) = Sequence dup=lag(dup)+1.
sort cases by StudentID dup Sequence.
CASESTOVARS    
/ID = StudentID dup
/Index = Sequence
/GROUPBY = Index/sep="_".

I am obviously constructing the syntax incorrectly---this doesn't put a single case on a row, it won't rename variables by the ScoreName---not even sure that SPSS can do this without python.

Any feedback or advice would be greatly appreciated


Solution

  • In your example of the desired output it looks like you are using scorename as the index, in which case there doesn't seem to be any need to fiddle with the sequence variable as you did.

    This recreates your example data:

    data list list /StudentID (a4) Sequence  Metric  Score  Interpretation  (4f3) ScoreName (a6).
    begin data.
    "a123"       8      13      540    12    "P1_AIE"
    "a123"       9      14      550    19    "P2_AIE"
    "a123"       9      15      500    13    "P1_BP"
    "a124"       8      14      450    11    "P2_AIE"
    end data.
    

    Now if you run the following, you get all the data for each ID in one line:

    sort cases by studentID ScoreName.
    casestovars /id=studentID /index=ScoreName/sep="_"/groupby=index.