python r spss

How to extract enumerated labels and corresponding numerical values from a .sav file?


How can I extract a mapping of numbers to labels from a .sav file without access to SPSS?

I am working with a non-profit that uses SPSS, but I don't have access to SPSS myself (and they are not technical). They've sent me some SPSS files, and I was able to convert these into csv files containing the correct information using an R package called foreign.

However, for some files the R package extracts textual labels, while for others it extracts numbers. The files are parallel case studies of different individuals, and when I count the distinct labels vs. the distinct numbers they don't even match exactly (say, 15 labels vs. 18 numeric codes). The underlying records were made over many years and by different personnel, so I assume the labels probably aren't consistent across files in any case. So I really need to see the number-to-label mapping in the underlying enum. How can I do this without access to SPSS?

(P.S. I also tried using scipy.io to read the .sav file and got the error Exception: Invalid SIGNATURE: b'$F' on multiple files before giving up, so that seems like a non-starter.)


Solution

  • For R, you could try the haven package. Of course, the results will depend on the files being imported, but the package does include functions for viewing and working with value labels (presuming the labels actually exist in the file).
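
If Python is an option instead, here is a minimal sketch of the same idea using the third-party pyreadstat package (not mentioned in the original answer, so treat it as an assumption that it fits your files). As far as I know, the .sav reader in scipy.io targets IDL save files rather than SPSS, which would explain the signature error in the question. The helper names flatten_value_labels, read_value_labels and export_codebook are made up for illustration.

```python
import csv


def flatten_value_labels(variable_value_labels):
    """Turn {variable: {code: label}} into a list of (variable, code, label) rows."""
    rows = []
    for variable, mapping in variable_value_labels.items():
        for code, label in mapping.items():
            rows.append((variable, code, label))
    return rows


def read_value_labels(sav_path):
    """Read only the metadata of an SPSS .sav file and return its value labels."""
    # Third-party dependency (pip install pyreadstat); imported lazily so the
    # pure-Python helper above can be used without it.
    import pyreadstat

    # metadataonly=True skips loading the data rows themselves.
    _, meta = pyreadstat.read_sav(sav_path, metadataonly=True)
    # Dict of variable name -> {numeric code: text label}; variables that
    # have no value labels defined are simply absent.
    return meta.variable_value_labels


def export_codebook(sav_path, csv_path):
    """Write the number-to-label mapping of one .sav file to a csv codebook."""
    rows = flatten_value_labels(read_value_labels(sav_path))
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["variable", "code", "label"])
        writer.writerows(rows)
```

Calling export_codebook("study.sav", "study_codebook.csv") (both file names are placeholders) for each file should give one codebook per file, which can then be compared across the files that came out as labels and the ones that came out as numbers. If a file has no value labels stored at all, the codebook will be empty apart from the header row.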