spss

Create a new dataset with one case for each value of a variable in the original dataset


I have a dataset where each case is a student and I have a variable for sex (SEX), as well as one for major (MAJOR). The variable for sex has 2 possible values (male and female), whereas the one for major has dozens (biology, mathematics, etc.).

I would like to use that dataset to create another dataset with one case for each major and 3 variables: MAJOR, MALE and FEMALE. The value of the variable MALE for each major should be the number of men enrolled in that major and the value of the variable FEMALE should be the number of women enrolled in it. The value of MAJOR should just be the label of the value of the variable MAJOR in the original dataset corresponding to that case.

Just so it's clear, when I look at the dataset I would like to create, there should be one line per major, with one column MAJOR that contains the label of each major, one for MALE that contains the number of men enrolled in each major and one column for FEMALE that contains the number of women enrolled in each major.

The dataset I have was created with SPSS and I have never used that program, so I have no idea how to do that, even though it's probably very easy. I would be very grateful for your help!

Best, Philippe


Solution

  • When your file is open, open a new syntax window, put the following code in it and run it:

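    * Name the open dataset so it can be told apart from the new one.
    dataset name OrigFile.
    * Flag each student: 1 if the condition is true, 0 otherwise.
    compute male=(SEX="MALE").
    compute female=(SEX="FEMALE").
    * Reserve the name of the dataset that will hold the result.
    dataset declare NewFile.
    * Sum the flags within each major to count men and women.
    aggregate /outfile='NewFile' /break=major /male female=sum(male female).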
    

    After running this, you will have two open datasets: your original one (OrigFile) and the new one you wanted to create (NewFile).
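
The code above compares SEX to the strings "MALE" and "FEMALE", so it only works if SEX is a string variable holding exactly those values (string comparisons in SPSS are case-sensitive). If SEX turns out to be a numeric variable with value labels instead, a variant along these lines might work. The codes 1 and 2 are purely an assumption for illustration; check the Variable View for the actual coding before running it.

```spss
* Sketch only: assumes SEX is numeric, coded 1 = male and 2 = female (hypothetical codes).
dataset name OrigFile.
compute male=(SEX=1).
compute female=(SEX=2).
dataset declare NewFile.
aggregate /outfile='NewFile' /break=major /male female=sum(male female).
* Switch to the new dataset to inspect the result.
dataset activate NewFile.
```

Only the two compute lines differ from the original answer; the aggregate step is unchanged because it just sums the 0/1 flags within each major.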